Axiom, a seven-month-old math startup, solved all 12 Putnam exam problems and achieved 99% on the Verina codegen benchmark by using formal verification (Lean proofs) to strengthen AI training.

Latent Space, Jun 3, 2026


3 Key Points

Axiom solved 12 of 12 problems on the Putnam exam (a prestigious undergraduate math competition, scored out of 120 points, where the median score is typically 0 or 1), surpassing both the top undergraduate score (110/120) and the closest prior AI result (DeepSeek, 103/120).
The startup uses formal verification (converting informal proofs into machine-checkable Lean proofs) during reinforcement learning (RL) to provide stronger reward signals than statistical methods alone. Axiom reported 99% (187/189) on the Verina codegen benchmark, compared with 4.9% for OpenAI's o3 on the same benchmark.
CEO Carina Hong argues that formal verification yields three compounding benefits: better sample efficiency and higher peak performance in training, a high-quality corpus that future inference can build on, and proofs that scale because others can learn from and extend them.
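
For readers unfamiliar with Lean, the following is a minimal sketch of what "machine-checkable" means in practice. It is not taken from Axiom's work or from the Verina benchmark; the names `add_comm_example`, `double`, and `double_spec` are hypothetical, and the snippet assumes a recent Lean 4 toolchain in which the `omega` tactic is available. The first theorem is a plain math statement. The second pairs a small function with a specification and a proof, roughly the code-plus-proof shape that a verified codegen task asks for.

```lean
-- Toy math statement: addition on natural numbers is commutative.
-- The Lean checker accepts this only if the proof term really has the stated type.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Toy "verified code": an implementation plus a proof that it meets its spec.
def double (n : Nat) : Nat := n + n

theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double  -- expose the definition: the goal becomes n + n = 2 * n
  omega          -- decision procedure for linear arithmetic over Nat and Int
```

If either proof were wrong, Lean would reject the file outright. That pass-or-fail outcome is what lets a checker serve as an unambiguous reward signal during RL, in contrast to a graded or statistical judgment of whether an answer looks correct.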
