
Summaries like this, in your inbox every morning.
Sign up free →Claude Opus 4.8 tops benchmarks including agentic coding (SWE-Bench Pro: 69.2 percent, up from 64.3 percent for Opus 4.7) and multidisciplinary reasoning (Humanity's Last Exam: 57.9 percent with tools). On GDPval-AA at 'max' effort level, it scores 1,890 points, 137 points above Opus 4.7.
The model flags uncertainties about its work more often and makes unsupported claims less frequently than its predecessor. On coding evaluations, it lets bugs slip through without comment about four times less often than Opus 4.7. It also introduces 'dynamic workflows,' which lets the model spin up hundreds of parallel sub-agents in a single session to handle codebase-wide migrations.
Fast Mode pricing drops to $10 per million input tokens and $50 per million output tokens (down from higher costs for earlier models), while standard pricing remains unchanged at $5 per million input tokens and $25 per million output tokens. On GDPval-AA, Opus 4.8 needs 15 percent fewer passes per task and 35 percent fewer output tokens than Opus 4.7.
The feature set—dynamic workflows and effort controls—is available on Enterprise, Team, and Max plans. Effort controls let users choose response depth on claude.ai and Cowork, with 'max' mode recommended for tougher tasks.
No comments yet. Be the first to share your thoughts!
Log in to join the discussion



Get curated AI news from 200+ sources delivered daily to your inbox. Free to use.
Get Started Free5 minutes a day. The AI essentials.
200+ sources · Email / LINE / Slack