Two DeepMind Scientists, One an AlphaGo Co-Creator, <em>Raised $2.13B to Build Autonomous Coders</em>

The DeepMind Exit

Misha Laskin led reward modeling for Gemini at DeepMind: the work of teaching Google's most advanced AI to improve at tasks through reinforcement learning. Ioannis Antonoglou co-created AlphaGo, the AI that beat Lee Sedol at Go and changed how the world thinks about machine intelligence.

They left DeepMind to start Reflection AI with one thesis: reinforcement learning + large language models = software engineers that improve themselves.

What Reflection AI Actually Does

They're building "superhuman general agents" that autonomously write, debug, test, and deploy code. The key differentiator is reinforcement learning - the same technology behind AlphaGo - applied to coding. The agents don't just predict what code to write; they learn from outcomes and get better.

Total raised: $2.13B

Investors: Sequoia, NVIDIA

Founder credential: AlphaGo co-creator

Why This is Different

Most coding AI tools use LLMs trained to predict the next token. Reflection uses reinforcement learning to optimize for working code. The difference: a next-token predictor is rewarded for code that looks right, while an RL agent is rewarded for code that works, for example code that passes its tests.
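To make the "looks right" versus "works" distinction concrete, here is a minimal sketch of an outcome-based reward, the kind of signal an RL loop could optimize. It is an illustration under assumptions, not Reflection AI's actual method: the `outcome_reward` function, the `TESTS` list, and the two candidate snippets are all hypothetical, and a real system would run candidates in a sandbox rather than with `exec`.

```python
# Hypothetical test cases for a function named `add`: (arguments, expected result).
TESTS = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]


def outcome_reward(source: str, tests=TESTS, func_name: str = "add") -> float:
    """Return the fraction of tests the candidate passes (0.0 if it fails to load)."""
    namespace: dict = {}
    try:
        # Illustration only: never exec untrusted code outside a sandbox.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0.0

    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing call simply earns no credit
    return passed / len(tests)


# Two candidates that look nearly identical; only one actually works.
looks_right = "def add(a, b):\n    return a - b\n"
works = "def add(a, b):\n    return a + b\n"

rewards = {
    "looks_right": outcome_reward(looks_right),
    "works": outcome_reward(works),
}
best = max(rewards, key=rewards.get)  # the candidate an outcome-driven learner would reinforce
```

The reward here comes from running the code, not from how plausible the text looks, so the subtly wrong candidate scores lower than the correct one even though the two differ by a single character.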

When the people who made AI beat world champions at Go apply the same approach to software engineering, the bet is clear: coding agents should get better with every deployment, just as AlphaGo got better with every game.

The co-creator of AlphaGo thinks reinforcement learning is the key to autonomous software engineering. $2.13B in funding suggests investors agree.

The Breakout Pattern

DeepMind is where many of the world's best AI researchers go. When they leave to build a company, it often means they've seen something inside the lab that the market hasn't priced in yet.

That's the breakout.

