Build an Aligned AI Future

5050 is a program that helps great scientists, researchers, and engineers become great founders. It’s helped launch 78 companies, and now we’re turning our attention to one of the most important challenges of our time: building a safe and aligned AI future.

We created the 5050 AI track to support founders building for this new world. Below are five ideas we’d love to back right now. We believe these are great opportunities to build safe AI, but they’re not the only ones. If you’re building in this space, we want to hear from you.

Applications are open until September 20.

Apply to the 5050 AI track here: https://www.fiftyyears.com/5050/ai 

Recommend someone to the 5050 AI track: https://50y.typeform.com/ai-recommend 

Get in touch with us about the requests for startups: gustavs@50y.com 

Mission Control for Agents

Scalable Oversight for Multi-Agent Systems

The new training paradigm is shifting toward multi-agent systems, but reliability at scale remains a critical bottleneck. Managing context effectively across agents, monitoring agent behavior, and controlling computational costs are all fundamentally scalable oversight problems that no single lab has fully solved.

The Problem

An independent startup focused on scalable oversight could build the infrastructure and tooling needed to make multi-agent systems production-ready. Beyond selling tools, this company would invest in cutting-edge research to push the boundaries of what's possible in agent coordination and monitoring. The combination of infrastructure and research would create defensible advantages while accelerating the field's progress.

The Opportunity

A first step could be to pilot a simple oversight loop across heterogeneous model families and tasks. This might combine online self-play with judge models in an attacker-defender-judge setup. Trials could include Qwen-2.5/Qwen-3, Llama, and Mixtral on multi-turn, tool-use, and domain datasets in biomedicine, cybersecurity, and finance. Capability-preserving supervised fine-tuning could be used to improve safety while limiting undue refusal on benign inputs.

The First Move

Hugging Face for Mech-Interp

Model Observability At Scale

Golden Gate Claude demonstrated the transformative potential of understanding and manipulating model activations. As inference costs plummet and interpretability techniques mature, we're approaching a world where custom model behaviors through activation engineering become practical at scale. With algorithmic improvements providing strong tailwinds, a company could make interpretability as accessible as fine-tuning is today.

The Problem

An independent startup could democratize mechanistic interpretability by amortizing training costs across developers and providing easy-to-use infrastructure for activation engineering. The opportunity is particularly compelling in regulated industries like financial services, healthcare, utilities, and defense. The business model could combine enterprise subscriptions for custom activation development with inference endpoints for production deployment.

The Opportunity

The First Move

Making this real could rest on three pillars: feature discovery, causal attribution, and reliable control. Feature discovery might surface sparse, human-meaningful units from polysemantic activations. Attribution could estimate which features most influence outputs for a task. Control would inject targeted edits with predictable magnitude and scope. Together, these could form an activation stack analogous to pretraining, evaluation, and fine-tuning.

Fort Knox for Models

Classified-Level Data Center Security

As AI systems approach and exceed human capabilities in critical domains, the security requirements for development infrastructure are approaching those of classified government facilities. Research labs need protection against both nation-state actors seeking to steal intellectual property and bad actors attempting to cause misuse incidents.

The Problem

The Opportunity

We see an opportunity for a startup to deliver infrastructure that meets or exceeds RAND Level 5, as well as to provide a straightforward on-ramp to RAND Levels 3 and 4. The aim is to protect against national security risks while enabling top talent to work on frontier models without productivity penalties. As theft and misuse risks become more salient to regulators and the public, world-class security will become table stakes for AI development.

The First Move

A first step could be a hosted model-weights service that keeps model files inside secure hardware and allows interaction through policy-checked APIs, focusing on risk reduction per unit of researcher time. The platform could run signed code, keep cryptographic keys inside secure hardware, and prove that the intended software is running end to end, so interfaces return answers, not the model. AI agents could watch code and data flows in real time, block suspicious actions, and conduct continuous red-teaming without affecting productivity.

Space Airlock for Models

Structured Access to Models

The Problem

As AI models become more powerful, research labs benefit from external researchers conducting evaluations, alignment research, interpretability work, and red-teaming, but providing model access creates substantial risks around misuse and intellectual property theft through reverse-engineering.

The Opportunity

We see an opportunity for a startup to provide infrastructure allowing labs to grant granular access to their models. This platform would enable labs to be more transparent and accountable while producing public goods research without compromising safety or commercial interests. The business would amortize infrastructure costs across multiple labs and could expand into providing evaluation and red-teaming services directly.

The model could stay inside a secure boundary, with plain-text responses by default rather than raw internals like logits. Where research clearly benefits, vetted users could see extra signals and have controlled sampling options. Unique watermarks per tenant, rate and burst limits, and small randomized variations could help deter and detect copy-training, while confidential compute and process isolation keep weights and other secrets out of reach.

The First Move

Psychic Shield for Models

Evals for Mental Health

As AI systems become more widely used, they increasingly interact with people in vulnerable or distressed states. Certain conversational patterns, such as excessive agreeableness, can unintentionally reinforce harmful behaviors. As model capabilities grow, these risks also become more subtle and context-dependent.

The Problem

An independent startup could develop rigorous evaluations focused on mental health-relevant interactions for labs. The company would develop scenario libraries, risk frameworks, and continuous monitoring that labs and platforms can run before and after release, coupled with mitigation guidance that preserves helpfulness. The business model could combine sales of evaluations with consulting services.

The Opportunity

The First Move

A first step could be to seed scenarios from real-world multi-turn interactions and expand them to cover diverse conversations. Scoring could combine calibrated human raters with model-based classifiers tuned to signals like excessive agreeableness, unwarranted clinical claims, weak uncertainty expression, and missed referrals. An automated scenario generator could produce varied and adversarial permutations to improve edge-case performance.

Building in this space?

Apply to the 5050 AI track here: https://www.fiftyyears.com/5050/ai 

Recommend someone to the 5050 AI track: https://50y.typeform.com/ai-recommend 

Get in touch with us about the requests for startups: gustavs@50y.com