AIToday

Anthropic releases Claude Opus 4.8, calling it a 'modest but tangible improvement' that scores 1,890 points on GDPval-AA and uses 15 percent fewer passes per task than Opus 4.7

THE DECODER5d ago3 min read
Anthropic releases Claude Opus 4.8, calling it a 'modest but tangible improvement' that scores 1,890 points on GDPval-AA and uses 15 percent fewer passes per task than Opus 4.7

Summaries like this, in your inbox every morning.

Sign up free →

3 Key Points

  1. 1

    Claude Opus 4.8 tops benchmarks including agentic coding (SWE-Bench Pro: 69.2 percent, up from 64.3 percent for Opus 4.7) and multidisciplinary reasoning (Humanity's Last Exam: 57.9 percent with tools). On GDPval-AA at 'max' effort level, it scores 1,890 points, 137 points above Opus 4.7.

  2. 2

    The model flags uncertainties about its work more often and makes unsupported claims less frequently than its predecessor. On coding evaluations, it lets bugs slip through without comment about four times less often than Opus 4.7. It also introduces 'dynamic workflows,' which lets the model spin up hundreds of parallel sub-agents in a single session to handle codebase-wide migrations.

  3. 3

    Fast Mode pricing drops to $10 per million input tokens and $50 per million output tokens (down from higher costs for earlier models), while standard pricing remains unchanged at $5 per million input tokens and $25 per million output tokens. On GDPval-AA, Opus 4.8 needs 15 percent fewer passes per task and 35 percent fewer output tokens than Opus 4.7.

  4. 4

    The feature set—dynamic workflows and effort controls—is available on Enterprise, Team, and Max plans. Effort controls let users choose response depth on claude.ai and Cowork, with 'max' mode recommended for tougher tasks.

Discussion

No comments yet. Be the first to share your thoughts!

Log in to join the discussion

Related Articles

Stay ahead with AI news

Get curated AI news from 200+ sources delivered daily to your inbox. Free to use.

Get Started Free

5 minutes a day. The AI essentials.

200+ sources · Email / LINE / Slack

Get it free →