AIToday

Anthropic's Fable 5 returns after two-week ban over jailbreak discovery

THE DECODER2h ago6 min read
Anthropic's Fable 5 returns after two-week ban over jailbreak discovery

Key takeaway

Anthropic's Fable 5 has returned globally after a two-week US government ban triggered by a security discovery. Amazon researchers found a method to bypass the model's safety guardrails, causing it to identify software vulnerabilities and demonstrate exploit code. Though Anthropic deployed a classifier blocking the technique in over 99 percent of cases, the company acknowledges that making AI models fully resistant to jailbreaks is likely impossible, suggesting that security review and government oversight will remain standard practice for powerful AI systems.

Summaries like this, in your inbox every morning.

Sign up free →

3 Key Points

  • What happened

    After a two-week US government investigation, Anthropic has restored Fable 5 worldwide through Claude Platform, Claude.ai, Claude Code, and Claude Cowork. The ban followed a security finding by Amazon researchers who discovered a way to bypass Fable 5's safety guardrails, leading the model to identify software vulnerabilities and produce exploit code. Anthropic deployed an improved safety classifier that blocks the technique in more than 99 percent of cases, though the new classifier also flags more harmless requests during everyday coding work.

  • Why it matters

    The incident reveals the tension between deploying increasingly capable AI models and managing security risks. While Anthropic and the US government found that many less capable models—including Claude Opus 4.8, GPT-5.5, and Kimi K2.7—could spot the same vulnerabilities, the company admits it is 'probably impossible' to make any AI model fully robust to jailbreaks. For businesses relying on frontier AI, this suggests security oversight will remain part of the deployment landscape.

  • What to watch

    Mythos 5, the less-restricted version of the same base model, remains limited to approved US organizations in the so-called Glasswing program. Anthropic is working with the government to expand access to more partners and is building a shared industry standard for rating and responding to jailbreaks with Amazon, Microsoft, Google, and other Glasswing partners. The company has also launched a new HackerOne program for security researchers to report potential cyber jailbreaks.

FAQ

When is Fable 5 available again, and what are the pricing terms?
Fable 5 is back worldwide starting today through Claude Platform, Claude.ai, Claude Code, and Claude Cowork. Pro, Max, Team, and select Enterprise plans include the model through July 7 at up to 50 percent of weekly usage limits; after that, it will be billed through usage credits.
What was the security issue that triggered the ban?
Amazon researchers discovered a way to bypass Fable 5's safety guardrails. Once bypassed, the model identified several software vulnerabilities and in one case produced code showing how to exploit one of them. Anthropic trained an improved safety classifier that blocks that specific technique in more than 99 percent of cases.
Will Mythos 5 become more widely available?
Mythos 5, the less restricted version of the same base model, currently remains limited to a group of US organizations that received government approval on June 26. Anthropic says it is still working with the government to expand access to more partners in the Glasswing program, though whether the EU will join remains unclear.

Discussion

No comments yet. Be the first to share your thoughts!

Log in to join the discussion

Related Articles

Stay ahead with AI news

Get curated AI news from 200+ sources delivered daily to your inbox. Free to use.

Get Started Free

Free · takes 30 seconds · unsubscribe anytime

1 minute a day. The AI essentials.

200+ sources · Email / LINE / Slack

Get it free →