Claude Fable 5 vs Opus 4.8: Is Double the Price Worth It?

#ai #productivity #claudecode #automation

Fable 5 costs 10 dollars per million input tokens against 5 for Opus 4.8, output is 50 vs 25
SWE-Bench Pro gap is 11.1 points, and FrontierCode more than doubles at 29.3 vs 13.4
Same API surface with one new 400: an explicit disabled thinking setting is rejected on Fable 5, so omit the field
Cache minimum drops to 2048 tokens on Fable 5 while Opus 4.8 needs 4096 to cache at all
My routing: Fable 5 for long autonomous runs and research, Opus 4.8 stays the daily default

Twice the price for the same request shape. That is the Claude Fable 5 deal on paper, and it makes the model easy to dismiss as an expensive trophy. I have been running Fable 5 and Opus 4.8 side by side since the June 9 launch, and the split is cleaner than I expected. This is the comparison I wanted on day one: numbers, code differences, and the routing rules I actually adopted.

The Score: Benchmarks Side by Side

Start with what doubled and what did not.

| Benchmark | Fable 5 | Opus 4.8 |
| --- | --- | --- |
| SWE-Bench Pro (real GitHub tasks) | 80.3% | 69.2% |
| FrontierCode (production-standard) | 29.3% | 13.4% |
| Core Analytics | first past 90% | 10 points lower |
| Everyday spreadsheet suite | wins at every effort level | 25-30% slower |

The pattern matters more than any single row. On saturated work, the gap is small: Anthropic's own ViBench numbers show standard vibe-coding requests nearly maxed out, and Opus 4.8 already one-shots those. On hard work, the gap explodes. FrontierCode is the telling row, because 13.4 to 29.3 is not an increment; it is a different tier of task becoming solvable.

Two non-coding results moved me. On long Slay the Spire runs, persistent memory helped Fable 5 three times more than it helped Opus 4.8, and Fable 5 reached the final act three times more often. That is the long-horizon coherence story in miniature: the model gets more out of its own notes. And on the Hebbia finance benchmark plus document reasoning, Fable 5 posted the highest scores measured, which matches what I see on dense PDF extraction. FrontierBench, built specifically for long-horizon reasoning, completes the sweep with Fable 5 on top.

For the baseline this is being measured against, I broke down the Opus 4.8 release in Claude Opus 4.8 Is Here: Everything That Changed. Everything that article praised still holds. Fable 5 does not make Opus 4.8 worse; it makes the hardest 20 percent of my queue possible.

Same Surface, Three Gotchas in Code

Migrating is one line. The model ID is claude-fable-5, and Fable 5 keeps the exact request surface of Opus 4.7 and 4.8: adaptive thinking via thinking: {"type": "adaptive"}, no temperature, top_p, or top_k (all return 400), no budget_tokens, no final-assistant-turn prefills, effort levels from low through xhigh up to max.

Three differences will bite you if you skip the changelog.

First, the new 400. An explicit thinking: {"type": "disabled"} is rejected on Fable 5. Opus 4.8 accepts it, Fable 5 does not. If you want no thinking, omit the field entirely. Audit any shared request builder that sets disabled as a default, because it will pass on Opus and fail on Fable.
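Here is a minimal sketch of that audit fix, written as a plain request builder that does not depend on any SDK. The Opus 4.8 model ID, the `build_request` helper, and the `max_tokens` default are my own placeholders for illustration; only `claude-fable-5` and the `thinking` shapes come from the text above.

```python
FABLE_5 = "claude-fable-5"    # model ID as given in this article
OPUS_4_8 = "claude-opus-4-8"  # hypothetical ID, used here only for illustration


def build_request(model, messages, thinking=None, max_tokens=4096):
    """Build request kwargs, omitting a disabled-thinking field on Fable 5.

    Fable 5 rejects an explicit {"type": "disabled"}, so the field is left
    out for that model instead of being sent. Every other value passes through.
    """
    request = {"model": model, "messages": messages, "max_tokens": max_tokens}
    if thinking is not None:
        is_disabled = thinking.get("type") == "disabled"
        if not (is_disabled and model == FABLE_5):
            request["thinking"] = thinking
    return request
```

Routing every call through one builder like this means a shared default of disabled thinking cannot leak into a Fable 5 request.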

Second, prompt caching minimums moved in Fable's favor. Fable 5 caches prefixes from 2048 tokens, while Opus 4.8 needs 4096. A 3,000-token system prompt silently never caches on Opus 4.8 and caches fine on Fable 5. At 10 dollars per million input tokens, cache reads at roughly a tenth of base price are not optional, so this quietly softens the price gap for short-prompt workloads.
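To make the caching point concrete, here is a rough sketch of the arithmetic. The dictionary keys are my own short labels, not API model IDs. The 0.1 cache-read multiplier is the "roughly a tenth" figure above treated as exact, and cache-write costs are ignored entirely, so read the output as an estimate.

```python
# Minimum cacheable prefix sizes quoted above.
CACHE_MIN_TOKENS = {"fable-5": 2048, "opus-4.8": 4096}
# Base input prices in dollars per million tokens, also quoted above.
INPUT_PRICE_PER_MTOK = {"fable-5": 10.0, "opus-4.8": 5.0}
# Assumption: cache reads bill at about a tenth of the base input price.
CACHE_READ_MULTIPLIER = 0.1


def prefix_caches(model, prefix_tokens):
    """True if a prefix of this size meets the model's caching minimum."""
    return prefix_tokens >= CACHE_MIN_TOKENS[model]


def repeat_prefix_cost(model, prefix_tokens):
    """Estimated dollar cost of re-sending the prefix on one later call."""
    base = prefix_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK[model]
    if prefix_caches(model, prefix_tokens):
        return base * CACHE_READ_MULTIPLIER
    return base
```

Under these assumptions, a 3,000-token prefix is cheaper per repeat call on Fable 5 than on Opus 4.8, because only Fable 5 gets the cached rate.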

Third, fast mode in Claude Code stays Opus-only. If your muscle memory is fast mode for quick iterations, that lever does not exist on Fable 5 yet.

Context window and output ceiling are identical: 1 million tokens in, up to 128K out with streaming. Compaction, structured outputs, and task budgets all carry over.

One tuning note that saves money on day one: effort matters more on this tier, not less. I run xhigh for agentic coding (it is the Claude Code default for good reason) and drop to high for everything else. The instinct to crank a premium model to max everywhere is wrong, because max can overthink routine tasks, and on Fable 5 you pay 50 dollars per million for every token of that overthinking. Sweep medium, high, and xhigh on your own workload before you settle.
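One way to run that sweep is to score each effort level by cost per successful run instead of by raw quality. The sketch below assumes you have already measured a pass rate and an average output-token count per effort level on your own tasks. The `cheapest_effort` helper and its input shape are hypothetical, and it counts output tokens only.

```python
def cheapest_effort(measurements, output_price_per_mtok=50.0):
    """Pick the effort level with the lowest output cost per successful run.

    measurements maps an effort name to (pass_rate, avg_output_tokens),
    both measured on your own workload. The default price is the Fable 5
    output price quoted in this article.
    """
    def cost_per_success(item):
        pass_rate, avg_tokens = item[1]
        if pass_rate <= 0:
            return float("inf")  # a level that never passes is never cheapest
        run_cost = avg_tokens / 1_000_000 * output_price_per_mtok
        return run_cost / pass_rate

    return min(measurements.items(), key=cost_per_success)[0]
```

Dividing by pass rate is the design choice that matters here: a higher effort level only wins if its extra tokens buy enough extra first-pass successes.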

Where the 2x Pays For Itself

Price per token is the wrong frame for agentic work. Price per solved task is the right one, and that math flips fast.

A concrete example from my week. A refactor that Opus 4.8 needed three attempts to land (two failed runs, one clean) burned roughly 600K output tokens across the attempts, about 15 dollars. Fable 5 one-shotted an equivalent task in 130K output tokens, about 6.50. The expensive model was the cheap one, because retries are where agentic budgets actually go. Stripe's launch-day example is the same effect at scale: a 50-million-line Ruby migration scoped at months of engineering, compressed into a day.
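The arithmetic behind that comparison, as a small sketch. Prices are the output prices quoted at the top of this post, the token counts are the rough figures from my refactor, and the dictionary keys are just labels, not API model IDs. Input tokens are left out for simplicity.

```python
# Output prices in dollars per million tokens, as quoted in this article.
OUTPUT_PRICE_PER_MTOK = {"fable-5": 50.0, "opus-4.8": 25.0}


def output_cost(model, output_tokens):
    """Dollar cost of the output tokens spent on a task, retries included."""
    return output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK[model]


# Opus 4.8: three attempts totalling roughly 600K output tokens.
opus_cost = output_cost("opus-4.8", 600_000)
# Fable 5: one attempt at roughly 130K output tokens.
fable_cost = output_cost("fable-5", 130_000)

print(f"Opus 4.8: ${opus_cost:.2f}  Fable 5: ${fable_cost:.2f}")
```

The total is what counts, so feed in the tokens from every attempt, failed ones included.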

So the first routing rule: anything long-horizon and autonomous goes to Fable 5. Overnight runs, multi-hour refactors, tasks where I write the spec once and walk away. The Slay the Spire memory result is the lab version of what I see in practice: Fable 5 stays coherent deeper into a run.

Second: research-shaped work. In blinded comparisons, Fable 5's molecular biology hypotheses were preferred about 80 percent of the time over Opus-class output, and internal drug design work ran about 10 times faster. My version of that is competitive teardowns and dense document analysis, and the quality delta there is obvious in a way routine coding is not.

Third: vision-heavy tasks. Rebuilding app source from screenshots and reading exact values off chart images both worked in my tests, and the benchmark claims back that up.

If the task is hard enough that failure means rerunning it, Fable 5 is usually the budget choice.

Where Opus 4.8 Stays My Default

Most of my queue did not move, and that is the honest half of this review.

Bulk and routine work stays on Opus 4.8. My daily auto-publish pipeline, commit summaries, content drafts, anything where the old model already one-shots reliably. Paying double for a task with a 100 percent first-pass rate buys nothing. The wider routing logic across Opus, Sonnet, and Haiku has not changed since I wrote Opus 4.8 vs Sonnet vs Haiku: How I Route Work in 2026; Fable 5 just added a tier above it.

Security-adjacent work is the funny one. Fable 5 routes cybersecurity requests (vulnerability discovery, exploit analysis, attack planning) to Opus 4.8 through its safeguard classifiers anyway. The classifiers trigger in under 5 percent of sessions overall, but security work lives inside that 5 percent by definition. If your workload is security review, you would pay Fable prices for Opus answers. Skip the middleman and run Opus 4.8 directly there. And if you are unsure whether your traffic trips the handoff, check the model field on the responses you get back, because the swap is otherwise silent.
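A minimal sketch of that check, assuming the response has been converted to a plain dict with a top-level `model` key. The `handoff_detected` helper is hypothetical, and the prefix match is my own guess at tolerating a served ID that carries a suffix.

```python
def handoff_detected(requested_model, response):
    """Return True if the response reports a different model than requested.

    Assumes `response` is a dict with a "model" key. A prefix match is used
    so that a served ID with an extra suffix still counts as the same model.
    """
    served_model = response.get("model")
    if served_model is None:
        return False  # nothing to compare against, so no handoff is claimed
    return not served_model.startswith(requested_model)
```

Logging this flag per request for a few days would show how much of your traffic actually lands on the handoff path before you decide where to route it.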

Latency-sensitive iteration also stays put, because fast mode only exists on Opus. For tight edit-test loops in Claude Code, Opus 4.8 with fast mode still feels better than Fable 5 at standard speed. For what day-to-day Opus work looks like, Opus 4.8 in Practice: A Real Coding Session still describes my normal day.

Last one: if your prompts are short and your tasks are routine, the spreadsheet-suite speed result (Fable finishing 25 to 30 percent faster) is not enough reason to switch. Speed on easy tasks was never the bottleneck.

Bottom Line

The 2x price is real and so is the tier. My rules after running both: hard, long, autonomous, or research-shaped goes to Fable 5. Routine, bulk, latency-bound, or security-related stays on Opus 4.8. Count retries, not tokens, when you compare costs, because one avoided rerun pays the entire premium.
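Those rules fit in a few lines of code. This is an illustrative sketch, not the routing hook I ship: the trait names, the `route` helper, and the Opus 4.8 model ID are placeholders, and the precedence (security and latency checks first) is an assumption about how to break ties when a task carries traits from both lists.

```python
FABLE_5 = "claude-fable-5"    # model ID as given in this article
OPUS_4_8 = "claude-opus-4-8"  # hypothetical ID, used here only for illustration

# Traits that send a task to Fable 5 under the rules above.
FABLE_TRAITS = {"hard", "long", "autonomous", "research"}
# Traits that keep a task on Opus 4.8 even if it is also hard.
OPUS_OVERRIDES = {"security", "latency-bound"}


def route(traits):
    """Pick a model for a task described by a collection of trait labels."""
    traits = set(traits)
    if traits & OPUS_OVERRIDES:
        return OPUS_4_8  # rerouted or fast-mode work gains nothing from Fable
    if traits & FABLE_TRAITS:
        return FABLE_5
    return OPUS_4_8  # routine and bulk work stays on the daily default
```

For example, `route({"long", "autonomous"})` returns the Fable 5 ID, while `route({"hard", "security"})` stays on Opus 4.8.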

Paid Claude plans include Fable 5 at no extra cost until June 22, so the test costs you nothing for two weeks. Throw your hardest real task at it, not a toy prompt, and watch whether the retry count drops. For the launch details, safeguards, and availability timeline, the full story is in Claude Fable 5 Is Here: The First Public Mythos-Class Model.

My whole dual-model setup, routing hooks included, ships as Claude Blueprint, and the rest of my Claude coverage lives at the Claude Code topic hub.