Anthropic launches Claude Fable 5, a safeguarded Mythos-class model
Originally from tiktok.com
Summary
Anthropic launched Claude Fable 5, calling it its most capable publicly released model: a “Mythos-class” model with added safeguards. Its predecessor, Claude Mythos Preview, was withheld from public release because it could find thousands of cybersecurity vulnerabilities; it was instead given to defenders of critical software. Fable 5 ships with request-screening safeguards and is positioned for long-running autonomous work beyond coding.
Key Insight
- Anthropic withheld a frontier model (Claude Mythos Preview) from public release because the dual-use risk was too high: a model that finds thousands of security holes can also exploit them. It was routed to defensive use only, so critical software could be patched before any public exposure.
- Fable 5’s safety mechanism is architectural, not just training-based: requests touching high-risk areas (cybersecurity, biology) are automatically detected and redirected to a less capable model (Opus 4.8). Capability tiering per-request, not per-product.
- Anthropic admits the safeguards are deliberately over-broad at launch and will be loosened over time - expect false-positive refusals/redirects in security and bio domains early on.
- Headline capability claims: stays on a problem longer than any prior model, “highly autonomous”, can operate for days without intervention.
- Positioning shift: explicitly marketed beyond coding - finance, research, economics, law, and other tasks that previously needed constant supervision. Signals the agent market moving from coding assistants to general knowledge-work delegation.
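
The per-request tiering described in the second bullet can be pictured with a small sketch. This is only an illustration under stated assumptions, not Anthropic's implementation: the keyword lists stand in for whatever detection is actually used, and `route_request`, `RoutingDecision`, and the two model label strings are hypothetical names invented for this note.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for illustration only, not real API model identifiers.
FRONTIER_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"

# Toy stand-in for a real risk classifier: plain keyword match per domain.
HIGH_RISK_KEYWORDS = {
    "cybersecurity": ("exploit", "zero-day", "malware"),
    "biology": ("pathogen", "toxin"),
}


@dataclass(frozen=True)
class RoutingDecision:
    model: str                     # which model label serves the request
    flagged_domain: Optional[str]  # high-risk domain that triggered the redirect


def route_request(prompt: str) -> RoutingDecision:
    """Choose a model for one request; high-risk prompts go to the fallback."""
    text = prompt.lower()
    for domain, keywords in HIGH_RISK_KEYWORDS.items():
        if any(word in text for word in keywords):
            return RoutingDecision(FALLBACK_MODEL, domain)
    return RoutingDecision(FRONTIER_MODEL, None)
```

In this sketch, a defensive question such as "How do I patch this exploit on my own server?" is also redirected to the fallback model, which is the kind of false positive the note on over-broad launch safeguards anticipates.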