AIToday

Axiom, a seven-month-old math startup, solved all 12 Putnam exam problems and achieved 99% on the Verina codegen benchmark by using formal verification (Lean proofs) to strengthen AI training.

Latent Space · 6h ago · 1 min read

3 Key Points

  1. Axiom solved all 12 problems on the Putnam exam, a prestigious undergraduate math competition where the median score is typically 0 or 1 point out of 120. At 10 points per problem, a 12/12 result corresponds to 120/120, ahead of the top undergraduate score (110/120) and the closest prior AI result (DeepSeek, 103/120).

  2. The startup uses formal verification (converting informal proofs into machine-checkable Lean proofs) during reinforcement learning (RL) to provide stronger reward signals than statistical methods alone. Axiom reported 187/189 (about 99%) on the Verina codegen benchmark, compared with 4.9% for OpenAI o3 on the same benchmark.

  3. CEO Carina Hong argues that formal verification brings three compounding benefits: better sample efficiency and higher peak performance in training, a high-quality corpus that future inference can build on, and proofs that scale because others can learn from and extend them.
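
To make "machine-checkable" in point 2 concrete, here is a minimal Lean 4 sketch. The theorem and its name are toy choices made for illustration and are not taken from Axiom's work; how Axiom connects the checker to its RL loop is not described in this summary. The point is only that Lean either accepts a proof or rejects it, which is what lets a verifier act as an unambiguous pass/fail signal.

```lean
-- Toy statement, chosen only for illustration (not from Axiom's work).
-- Informal claim: adding two natural numbers in either order gives the same sum.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete instance that Lean checks by direct computation.
example : 2 + 3 = 3 + 2 := rfl
```

If either proof were wrong, for example if the statement claimed `a + b = b + a + 1`, Lean would report an error instead of accepting the file.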
