Claude Fable 5: launched, revolted against, and shut down by the government in 5 days

#ai #claude #webdev #programming

TL;DR: Anthropic shipped Claude Fable 5 on June 9. It topped the benchmarks. Devs revolted over price, silent nerfing, and forced data retention. On June 12 the US government ordered it offline. If claude-fable-5 is in your stack, you want a fallback path. Code's at the bottom.

Most model launches blur together. This one went from flagship to federal shutdown in five days. Quick rundown for people who actually build on this stuff.

The model in one paragraph

Fable 5 and its restricted twin Mythos 5 are the same model. The difference is the safety layer. Fable ships to everyone with classifiers on top; Mythos has them lifted and stays locked to vetted partners via Project Glasswing.

Specs that matter for integration:


Model IDs	`claude-fable-5`, `claude-mythos-5`
Context	1M tokens
Max output	128K tokens
Pricing	$10 / 1M input, $50 / 1M output
Data retention	mandatory 30 days, no ZDR option

Hold onto that pricing and that retention line. Both caused fires.

The benchmarks were real

80.3% SWE-Bench Pro (Opus 4.8 is ~69%)
93.9% SWE-Bench Verified
88.0% Terminal-Bench
FrontierCode Diamond: more than double the next-best score Stripe reported a codebase-wide migration across 50M lines of Ruby in one day, work it estimated at two months for a full team. Simon Willison, who does not do hype, called it "a beast." Karpathy called it "a major-version-bump-deserving step change forward."

And then the same devs who tested it turned on it.

Why devs revolted

1. Token burn. Double Opus pricing, and it counts double against subscription limits. Real numbers from real people:

Theo dropped over $1,000 in a day on a $200 plan. Bleeping Computer drained a $100 Max plan in under nine minutes.

2. Silent nerfing. This is the one that actually broke trust, and it came straight from Anthropic's own 319-page system card. When Fable detects you're working on frontier AI research (pretraining pipelines, accelerator design), it does not refuse and does not fall back to a weaker model. It silently degrades its own output, via prompt modification, steering vectors, or PEFT, and tells you nothing.

Anthropic's exact words: "these safeguards will not be visible to the user."

That kills reproducibility. A refusal shows you the wall. A visible fallback is detectable. A model that fakes helping while handing you worse output means a failed run could be your idea, your code, or an invisible intervention you never saw.

False positives caught normal work too: a medical physicist blocked for using the word "nuclear," MRI segmentation flagged as bioterrorism.

3. Forced data retention. 30-day retention on all Mythos-class traffic, no zero-retention option, even on AWS Bedrock and Vertex. GDPR-bound shops are locked out.

Then the government pulled it

Friday June 12, 5:21pm ET. A Department of Commerce letter, signed by Secretary Howard Lutnick, citing national security. Order: suspend all access to Fable 5 and Mythos 5 for any foreign national, anywhere, including Anthropic's own staff.

You can't filter foreign nationals from US users in real time, so Anthropic pulled both models worldwide. Every other model stayed up. Developer API calls to claude-fable-5 started failing.

Anthropic's read: the government believed it had found a jailbreak. Anthropic reviewed it and said it surfaced only minor known flaws that other public models (it named GPT-5.5) can find with no bypass. The "exploit" was asking the model to read a codebase and fix its flaws. You know, the job.

This looks like the first time an export-control tool, the kind built for chips and weapons, has been aimed at a language model already used by hundreds of millions of people. That precedent is the real story.

What this means for your stack

The boring lesson just became load-bearing: if you depend on one model ID, "available" is something the vendor, and apparently a government, can revoke without warning. Build the fallback now.

A few concrete things from the docs:

Fable returns stop_reason: "refusal" as a successful HTTP 200, not an error, and names the classifier that fired.
You're not billed for a refusal that happens before any output.
Retry server-side with the fallbacks param (beta), client-side via SDK middleware (TS, Python, Go, Java, C#), or manually.
Fallback credit refunds the prompt-cache cost so you don't pay twice to switch. Minimal graceful-degradation pattern (also handles the suspension, since a disabled model just fails the first call):

import anthropic

client = anthropic.Anthropic()

def ask(prompt, primary="claude-fable-5", fallback="claude-opus-4-8"):
    for model in (primary, fallback):
        try:
            resp = client.messages.create(
                model=model,
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
            # Fable can refuse with a 200; treat that as "try the fallback"
            if getattr(resp, "stop_reason", None) == "refusal":
                continue
            return resp
        except anthropic.APIStatusError as err:
            print(f"{model} unavailable: {err.status_code}")
            continue
    raise RuntimeError("No model could serve this request")

Other practical calls:

Price by task, not token. Sticker is 2x Opus, but Fable often finishes in fewer turns, so cost-per-completed-task is closer than it looks. Use it for hard migrations, gnarly bugs, large-corpus analysis. Quick rewrite? Use something cheaper.
Check retention before you build. Under GDPR or any zero-retention requirement, Mythos-class is a blocker today, not a detail.
Most apps never hit the safety layer. Anthropic says 95%+ of sessions never trigger a fallback. If you're doing security, bio, or ML-research work, that layer is your whole experience. ## The kicker

Days before all this, Dario Amodei published an essay arguing government should be able to block a dangerous model, but only through a process that's "transparent, fair, clear, and grounded in technical facts." When Commerce pulled Fable on a vague weekend directive, Anthropic pointed right back at it: the order "does not adhere to those principles."

Same week, Anthropic's own survey of 52,000 Americans dropped one stat worth sitting with: only 15% trust AI companies to decide how the tech gets built. Lowest of any institution tested.

The benchmarks say the model is real. The week it had says the terms, and who controls the off switch, matter just as much.

Sources: Anthropic launch post, suspension statement, Public Record, and Claude Platform docs; Dario Amodei's "Policy on the AI Exponential"; reporting from Decrypt, Bloomberg, CNBC, Axios, and Simon Willison.

Have you got claude-fable-5 in production? How are you handling the fallback? Drop it in the comments.