
Anthropic shipped Claude Fable 5 on June 9, 2026 — its first generally available Mythos-class model, priced at $10 per million input tokens and $50 per million output. That is exactly double Claude Opus 4.8, and the benchmark deltas are real: SWE-Bench Pro 80.3% vs 69.2%, FrontierCode 29.3% vs 13.4%.
But the price is not the migration story. The API behavior is. Fable 5 ships three breaking changes, and any integration that assumes Opus-era semantics will silently misbehave. This post covers what actually changes in your code, what the bill looks like, and where the traps are.
I run model intelligence at TokenMix, where we track pricing and API behavior across 300+ models. Everything below is sourced from Anthropic's launch docs, migration guide, and pricing page — verified June 10, 2026.
The 60-second version
- Price: $10/$50 per MTok. Every rate is exactly 2× Opus 4.8 — cache reads $1, 5-min cache writes $12.50, 1-hour writes $20, batch $5/$25.
- Specs: 1M context, 128K max output, no long-context surcharge.
- Model ID: `claude-fable-5` on the Claude API; `anthropic.claude-fable-5` on Bedrock; `anthropic/claude-fable-5` on OpenRouter.
- Breaking change 1: Adaptive thinking is always on. `thinking: {"type": "disabled"}` returns an error.
- Breaking change 2: Refusals are HTTP 200 responses with `stop_reason: "refusal"`, not error codes.
- Breaking change 3: Safety classifiers reroute flagged requests to Opus 4.8 (under 5% of sessions), and rerouted requests bill at Opus rates.
- No ZDR: 30-day data retention is mandatory. Zero-data-retention accounts don't see the model at all.
Breaking change 1: thinking is no longer optional
On Opus 4.8 you could disable thinking to trade quality for latency. On Fable 5 you cannot — adaptive thinking is permanently on, and the model decides how much to think per request.
Your replacement lever is the effort parameter:
    {
        "model": "claude-fable-5",
        "max_tokens": 16000,
        "effort": "high",
        "messages": [...]
    }
Five levels: low, medium, high, xhigh, max. Default is high. Anthropic's migration guide is explicit: start at high even for workloads that ran xhigh on Opus 4.8 — Fable 5 reaches further per unit of thinking.
Two gotchas:
- `max_tokens` now caps thinking + response combined. A workload that ran thinking-off on Opus 4.8 inherits always-on thinking here. Output budgets sized for bare responses will truncate. Resize them.
- Raw chain-of-thought is never returned. `thinking.display` defaults to `"omitted"`; set it to `"summarized"` if you want readable summaries. In multi-turn conversations, pass thinking blocks back unchanged.
Prefill, manual thinking budgets, and sampling parameters are still rejected with 400 — unchanged from Opus 4.7/4.8, so nothing new breaks there.
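Here is a minimal sketch of what that parameter migration might look like on a plain request dict. The function name `migrate_request` and the `thinking_headroom` default are my own illustrative choices, not values from Anthropic's docs; measure your own thinking usage before settling on a number.

```python
from copy import deepcopy

# Effort levels as listed in this post.
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")


def migrate_request(params: dict, thinking_headroom: int = 8000) -> dict:
    """Return a copy of an Opus-era request dict adjusted for Fable 5.

    thinking_headroom is a hypothetical allowance, not a documented value.
    """
    out = deepcopy(params)
    out["model"] = "claude-fable-5"

    thinking = out.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "disabled":
        # Disabling thinking errors on Fable 5, so drop the field and grow
        # max_tokens, which now covers thinking plus response.
        del out["thinking"]
        out["max_tokens"] = out.get("max_tokens", 0) + thinking_headroom

    # Start at "high" unless the caller already chose a level.
    effort = out.setdefault("effort", "high")
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    return out
```

The input dict is left untouched, so you can diff old and new parameters in a dry run before switching traffic.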
Breaking change 2: refusals look like success
This is the integration trap. A refused request returns HTTP 200 with:
    {
        "stop_reason": "refusal",
        "stop_details": { "category": "cyber" }
    }
stop_details.category is one of "cyber", "bio", "reasoning_extraction", or null. Anything keyed on HTTP status codes treats this as a normal completion and passes a declined response downstream. Check stop_reason on every Fable 5 response.
Billing on refusals:
- Refused before any output → $0
- Classifier fires mid-stream → input plus already-streamed output is billed; discard the partial output
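A sketch of the check, working on an already-parsed response dict. `Refused` and `text_or_raise` are illustrative names, and I am assuming the usual Messages-style `content` list of typed blocks; adapt the extraction to whatever your client returns.

```python
from typing import Optional


class Refused(Exception):
    """Raised when a 200 response carries stop_reason == "refusal"."""

    def __init__(self, category: Optional[str]):
        super().__init__(f"request refused (category={category})")
        self.category = category


def text_or_raise(response: dict) -> str:
    """Return the response text, or raise Refused for a declined request."""
    if response.get("stop_reason") == "refusal":
        details = response.get("stop_details") or {}
        # Any partial output is discarded rather than passed downstream.
        raise Refused(details.get("category"))
    blocks = response.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
```

Raising keeps a refusal from flowing through code paths written for normal completions, and the category is available to the handler for logging or routing.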
Breaking change 3: the Opus 4.8 fallback
Fable 5 is the same underlying model as Claude Mythos 5 (the Glasswing-partners-only variant) with safety classifiers active. When a classifier flags a request — offensive cyber, bioweapon-adjacent biology, or distillation-style extraction patterns — the response is served by Opus 4.8 instead, and bills at Opus rates ($5/$25).
Anthropic reports under 5% of sessions trigger this. The beta fallbacks parameter automates retry server-side, but only on the Claude API and Claude Platform on AWS. On the Batch API, Bedrock, Vertex, and Foundry, retries run client-side via SDK middleware (TypeScript, Python, Go, Java, C#).
One pattern worth flagging from the Claude Code docs: fallback can fire on the first request of a session, before you type anything, because that request carries workspace context — CLAUDE.md content, directory names, git status. A repo full of security tooling can trip the classifier on context alone. claude --safe-mode strips customizations to diagnose it.
And the false-positive reports are already in: the Hacker News launch thread has developers reporting MRI brain-segmentation code and mosquito-malaria research flagged as bio risks. If your domain is health-adjacent, meter your first week.
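On platforms without the `fallbacks` parameter, the client-side retry boils down to something like the sketch below. This is not the SDK middleware itself: `send` is a hypothetical transport callable you supply (request dict in, response dict out), and the fallback model ID is passed in because the exact Opus 4.8 identifier depends on your platform.

```python
from typing import Callable, Tuple

# Hypothetical transport: takes a request dict, returns a response dict.
Sender = Callable[[dict], dict]


def send_with_fallback(send: Sender, request: dict, fallback_model: str) -> Tuple[dict, str]:
    """Send request; on a refusal, retry once on fallback_model.

    Returns (response, model_used) so the caller can meter how often
    the fallback fires and bill accordingly.
    """
    response = send(request)
    if response.get("stop_reason") != "refusal":
        return response, request["model"]
    # Copy the request with only the model swapped; other parameters are
    # left in place and may need adjusting for the fallback model.
    retry = {**request, "model": fallback_model}
    return send(retry), fallback_model
```

Logging `model_used` per request gives you the fallback rate for your own traffic, which matters more than the under-5% aggregate if your domain is health- or security-adjacent.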
The pricing table that matters
| Rate | Fable 5 | Opus 4.8 | Multiple |
|---|---|---|---|
| Base input | $10.00 | $5.00 | 2.0× |
| 5-min cache write | $12.50 | $6.25 | 2.0× |
| 1-hour cache write | $20.00 | $10.00 | 2.0× |
| Cache read | $1.00 | $0.50 | 2.0× |
| Output | $50.00 | $25.00 | 2.0× |
| Batch input | $5.00 | $2.50 | 2.0× |
| Batch output | $25.00 | $12.50 | 2.0× |
| Min cacheable prompt | 512 tokens | 1,024 tokens | Fable caches shorter prompts |
Three footnotes that change real bills:
- No long-context surcharge. Per Anthropic's pricing docs, "a 900k-token request is billed at the same per-token rate as a 9k-token request." Gemini 3.1 Pro doubles its input rate past 200K; Fable 5 doesn't.
- Tokenizer. Fable 5 uses the Opus 4.7 tokenizer — roughly 30% (up to 35%) more tokens from the same text vs pre-4.7 models. Comparisons against Opus 4.8 are apples-to-apples; against your old 4.5-era bills, they are not.
- No fast mode. Opus 4.8 fast mode costs the same $10/$50 as Fable 5 — the same sticker price buys speed or intelligence, pick one.
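To sanity-check a bill against the table, a small estimator like this may help. The rates are copied from the table above; the dict keys (`cache_read`, `batch_input`, and so on) are my own labels, not API field names.

```python
# USD per million tokens, copied from the pricing table in this post.
RATES = {
    "fable-5": {
        "input": 10.00, "cache_write_5m": 12.50, "cache_write_1h": 20.00,
        "cache_read": 1.00, "output": 50.00,
        "batch_input": 5.00, "batch_output": 25.00,
    },
    "opus-4.8": {
        "input": 5.00, "cache_write_5m": 6.25, "cache_write_1h": 10.00,
        "cache_read": 0.50, "output": 25.00,
        "batch_input": 2.50, "batch_output": 12.50,
    },
}


def request_cost(model: str, tokens: dict) -> float:
    """Estimate USD cost for one request from per-category token counts."""
    rates = RATES[model]
    return sum(rates[kind] * count / 1_000_000 for kind, count in tokens.items())


# Example: a long prompt served mostly from cache.
long_prompt = {"cache_read": 900_000, "input": 9_000, "output": 2_000}
fable_cost = request_cost("fable-5", long_prompt)
opus_cost = request_cost("opus-4.8", long_prompt)
```

Because there is no long-context surcharge, the same flat rates apply whether the prompt is 9k or 900k tokens, which is why a single lookup table is enough here.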
Is 2× worth it? The cost-per-solve math
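Raw per-attempt cost on a 100K-in / 20K-out agentic task: Fable $2.00 (0.1 MTok × $10 + 0.02 MTok × $50), Opus $1.00 (0.1 MTok × $5 + 0.02 MTok × $25). Now divide each by the published pass rate to get expected cost per solved task: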
| Difficulty tier | Fable 5 | Opus 4.8 | GPT-5.5 |
|---|---|---|---|
| SWE-Bench Pro tier (routine-hard) | $2.49 | $1.45 | $1.88 |
| FrontierCode tier (frontier-hard) | $6.83 | $7.46 | $19.30 |
On routine work, Opus 4.8 wins per solved task. On frontier-hard work, Opus fails often enough that retries eat the savings and Fable becomes the cheapest per solve. Route by task difficulty, not by loyalty to a price point.
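The Fable and Opus columns can be reproduced in a few lines. Dividing by pass rate assumes failed attempts are retried independently until one succeeds, so expected attempts are 1 / pass rate; that is a simplification, and the tier names and `cheapest_model` helper are illustrative.

```python
def cost_per_solve(cost_per_attempt: float, pass_rate: float) -> float:
    """Expected spend per solved task when failed attempts are retried.

    Assumes independent attempts, so expected attempts = 1 / pass_rate.
    """
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    return cost_per_attempt / pass_rate


# Per-attempt cost and published pass rates quoted in this post.
ATTEMPT_COST = {"fable-5": 2.00, "opus-4.8": 1.00}
PASS_RATE = {
    "swe-bench-pro": {"fable-5": 0.803, "opus-4.8": 0.692},
    "frontiercode": {"fable-5": 0.293, "opus-4.8": 0.134},
}


def cheapest_model(tier: str) -> str:
    """Pick the model with the lowest expected cost per solve for a tier."""
    rates = PASS_RATE[tier]
    return min(rates, key=lambda m: cost_per_solve(ATTEMPT_COST[m], rates[m]))


for tier in PASS_RATE:
    print(tier, "->", cheapest_model(tier))
# prints:
# swe-bench-pro -> opus-4.8
# frontiercode -> fable-5
```

Swap in pass rates from your own evals rather than the public benchmarks; the crossover point moves with your task mix.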
Field reports from the HN thread cut both ways: several developers report Fable finishing in fewer turns with "more targeted and surgical diffs" — one claims comparable results with about half the tokens, which would put effective cost near Opus parity. Another metered $82.92 in API-equivalent usage in a single day on a Max plan. The variance is the takeaway.
Migration checklist
- Swap model ID to `claude-fable-5` (or run `/claude-api migrate` in Claude Code; it automates the parameter changes too).
- Remove any `thinking: {"type": "disabled"}`; it errors now.
- Resize `max_tokens` for thinking + response combined.
- Add a `stop_reason === "refusal"` check; read `stop_details.category`.
- Decide your fallback story: `fallbacks` param (Claude API / AWS) or SDK middleware (everywhere else).
- Audit for ZDR conflicts: Covered Model status means mandatory 30-day retention, no workaround.
- Set `effort: "high"` and only escalate to `xhigh`/`max` with eval evidence.
FAQ
Can I disable thinking on Claude Fable 5?
No. Adaptive thinking is permanently on and thinking: {"type": "disabled"} returns an error. Use the effort parameter (low through max) to control thinking depth, and remember max_tokens caps thinking plus response combined.
What does stop_reason: "refusal" mean?
A safety classifier declined the request — it is a successful HTTP 200 response, not an error. stop_details.category names the classifier: "cyber", "bio", "reasoning_extraction", or null. Refusals with no output are free.
Does Claude Fable 5 work in Claude Code?
Yes: run /model fable on v2.1.170+. It is never the default, and it is hidden entirely for zero-data-retention accounts. Flagged requests re-run on Opus 4.8 with a transcript notice.
Is Fable 5 on Bedrock and Vertex?
Yes, GA since June 9: anthropic.claude-fable-5 on Bedrock (global. prefix on the global endpoint; the cache minimum stays 1,024 tokens there), claude-fable-5 on Vertex AI and Microsoft Foundry. OpenRouter lists it at pass-through $10/$50. Note the fallbacks parameter is not available on Bedrock/Vertex/Foundry — use SDK middleware.
Should I migrate everything from Opus 4.8?
No. The cost-per-solve math says route the frontier-hard 10-20% of your workload to Fable 5 and keep routine traffic on Opus 4.8 or Sonnet 4.6. Fable loses on routine-task economics, interactive latency, and ZDR compliance.
Full review with benchmark tables, the Mythos 5 / Project Glasswing context, and the monthly-bill math: Claude Fable 5 Review 2026: Pricing, Benchmarks, vs Opus 4.8