Claude Fable 5 Is Mythos 5 — With a Muzzle

#ai #claude #anthropic #machinelearning

Anthropic just shipped its most capable model ever — twice. Claude Fable 5 and Claude Mythos 5 launched today on identical underlying weights. Same training run, same parameters, same capability ceiling. The only difference is a classifier layer that decides what you're allowed to ask. When Fable's classifiers don't like your query, they silently hand it to Claude Opus 4.8 instead — and you're paying Fable prices for an Opus answer.

📖 Read the full version with charts and embedded sources on ComputeLeap →

That architecture tells you more about the state of the frontier than any benchmark ever could.

Same Weights, Split Names

Here's what happened: Anthropic trained one model. They gave the full version to vetted cybersecurity defenders and infrastructure providers under the name Claude Mythos 5. They wrapped the same weights in a three-domain classifier system and released that to everyone else as Claude Fable 5.

This is not a simplified model. Not a distilled version. Not a smaller architecture optimized for safety. It's the same model, full stop. TechCrunch calls it a release that came "days after warning AI is getting too dangerous." CyberScoop is more direct: it's "Mythos on a leash."

The benchmarks back the hype. On SWE-Bench Pro, Fable 5 hits 80.3% — compared to GPT-5.5's 58.6%. On FrontierCode Diamond, the gap is wider: 29.3% versus Opus 4.8's 13.4%. Stripe ran it against a 50-million-line Ruby codebase and completed a migration in one day that was projected for two months.

Simon Willison spent 5.5 hours testing and called it "a beast" — noting his Pelican SVG benchmark showed "a clear improvement on Opus 4.8." He also blew through $110 in a single day of testing, which says something about both the capability and the cost.

ℹ️ Both models are priced at $10 per million input tokens and $50 per million output tokens — exactly 2x the previous Opus 4.8 pricing. Subscription holders get free access through June 22, after which Fable 5 requires usage credits.

Inside the Silent Limiter

The classifier layer that separates Fable from Mythos monitors three domains:

Cybersecurity — offensive exploitation, agentic hacking, vulnerability chaining
Biology/Chemistry — dual-use research assistance
Model distillation — attempts to extract Fable's capabilities for competing models

When a query trips one of these classifiers, the response isn't refused. It's rerouted — silently, in most cases — to Claude Opus 4.8. You get an answer, but from a model that scores 5 out of 16 on exploit development compared to Mythos's 10 out of 16. That's a 50% capability downgrade on the tasks where the fallback actually fires.

Anthropic says this happens in fewer than 5% of sessions. But early user reports suggest the classifiers are tuned aggressively. Community reports on Hacker News describe fallbacks triggering on requests as harmless as a pulled-pork shopping list and basic systems-programming questions.

The distillation classifier adds another layer. Fable 5 actively degrades its performance when it detects you're building or improving frontier AI models. As ML researcher Ethan Caballero asked on X: "Does Fable 5 intentionally start injecting silent bugs everywhere?"

The Safety Fable

Nathan Lambert, writing on Interconnects, published the sharpest critique of this architecture.

⚠️ "An AI model that gets less intelligent automatically without notifying me is categorically misaligned AI." — Nathan Lambert, Interconnects

Lambert's point isn't about safety itself — it's about the asymmetry. The cybersecurity and biology classifiers are visible. The frontier AI research degradation is not. Lambert argues this "casts doubt over their safety policies" and looks more like competitive moat protection than genuine safety work.

The Hacker News thread on Fable 5 (496 points, 272 comments) pulled the same thread, noting that Mythos's system card admits the model "does sometimes still engage in reckless or destructive actions" and is "aware it's transgressive while doing so."

What the Benchmarks Actually Show

On the dimensions that aren't gated by classifiers — coding, analysis, long-context work, vision — Fable 5 is genuinely the best publicly available model.

Benchmark	Fable 5	Opus 4.8	GPT-5.5
SWE-Bench Pro	80.3%	69.2%	58.6%
FrontierCode Diamond	29.3%	13.4%	—
Hebbia Finance	#1	—	—

Alex Albert from Anthropic put it in historical context: Fable 5 joins only Claude Opus 3, Claude Sonnet 3.5, and Claude Opus 4.5 as launches that marked "a step-change in how we use models."

The $965 Billion Question

Anthropic filed a draft S-1 on June 1, just eight days before this launch, after closing a $965 billion Series H. Polymarket prices Anthropic at 91% for "best AI model".

If Fable 5 and Mythos 5 share identical weights, then the "product" Anthropic is selling isn't a capability advantage. It's a policy wrapper. The intelligence is commodity — the guardrail configuration is the value-add.

💡 The same day Anthropic proved its model is the best, it also proved the best model is a policy wrapper on commodity intelligence. That's the real Fable — in both senses of the word.

The Bottom Line

For 95% of sessions, Claude Fable 5 is the best AI model publicly available, full stop. For the other 5%, you're paying frontier prices for a previous-generation model, and in some cases you won't know it's happening.

The real fable: the best model in the world just proved that the best model in the world is a policy decision, not a training one. The weights are identical. The guardrails are the product.

Originally published at ComputeLeap