Mudassir Khan

Posted on Jun 10 • Originally published at mudassirkhan.me

Claude Fable 5: Features, Pricing, and Fallbacks

#api #ai #webdev #programming

Anthropic just shipped claude-fable-5, its most capable widely released model. It is a Mythos class model made safe for general use, with a 1M token context window, adaptive thinking, and a safety classifier that automatically hands risky prompts to Claude Opus 4.8. Here is what actually matters when you wire it into a product.

Fable 5 and Mythos 5 are the same model

The launch came under two names. claude-fable-5 and claude-mythos-5 are the same underlying model; the only difference is the safeguards.

Mythos 5 runs without the safety classifiers and is limited to a small group of cyber defenders through Project Glasswing. Fable 5 is the version everyone gets, on the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry. A Mythos class model sits a tier above the Opus class in raw capability, so when your request is not flagged, you are talking to the strongest model Anthropic has put in general hands. Their own data says more than 95% of Fable sessions never hit a fallback.

The specs you plan around

Spec	Claude Fable 5	Claude Opus 4.8
API id	`claude-fable-5`	`claude-opus-4-8`
Context window	1M tokens	up to 1M tokens
Max output	128k tokens	lower
Input price / M tokens	$10	$5
Output price / M tokens	$50	$25
Thinking mode	adaptive only	configurable

Fable 5 costs twice as much per token as Opus 4.8. That premium buys the strongest reasoning and the longest autonomous runs, but it means routing matters. Send the hard, long horizon tasks to Fable and keep routine work on Opus or Sonnet. The 1M context is on by default, and the 128k output ceiling is enough to return whole files or a long migration in one response.
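To make the routing trade-off concrete, here is a rough cost estimate that uses only the list prices from the table above. The function name and the example token counts are illustrative and not part of any SDK.

```python
# Per-million-token list prices in USD, taken from the spec table above.
PRICES = {
    "claude-fable-5": {"input": 10.0, "output": 50.0},
    "claude-opus-4-8": {"input": 5.0, "output": 25.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at list price."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k tokens in, 20k tokens out.
    for model in PRICES:
        print(model, round(estimate_cost(model, 200_000, 20_000), 2))
```

For that hypothetical request the estimate is $3.00 on Fable 5 against $1.50 on Opus 4.8, so the premium only pays off where the extra capability changes the outcome.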

Where it actually leads

The headline is long horizon autonomy. The longer the task, the bigger Fable 5's lead over earlier models.

Software engineering. Stripe ran a codebase wide migration across 50 million lines of Ruby in a single day. Fable 5 also tops Cognition's FrontierCode eval, even at medium effort.
Knowledge work. Highest score of any model on Hebbia's finance benchmark, with real gains on document reasoning and chart and table interpretation.
Vision. New state of the art. It reads precise numbers off scientific figures and rebuilds a web app's source from screenshots alone.
Memory. It stays focused across millions of tokens and improves using notes it writes to a file based memory tool.

The safety classifier is the thing to design for

This is the part that will surprise you in production. A separate classifier model sits in front of Fable 5. When a prompt looks like it touches cybersecurity, biology, chemistry, or distillation, Fable does not answer. The request falls back to Claude Opus 4.8, and the user is told it happened.

The classifiers are tuned conservatively, so they sometimes catch harmless prompts. Anthropic says the fallback fires in under 5% of sessions on average. For builders, the move is to expect the occasional fallback on benign cyber or bio adjacent prompts and handle it gracefully instead of treating it as an error.

What changes on the API

The biggest change is how refusals surface. When Fable 5 declines, the Messages API does not throw. You get a successful HTTP 200 with stop_reason set to refusal, and the response tells you which classifier declined. You are not billed for a request refused before any output is generated.

A refused request comes back as a normal 200 response:

{
  "type": "message",
  "role": "assistant",
  "stop_reason": "refusal",
  "content": [],
  "usage": { "input_tokens": 412, "output_tokens": 0 }
}
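Since the refusal arrives as an ordinary 200, the check belongs in your response handling and not in your exception handling. Here is a minimal sketch, assuming the response has already been parsed into a dict shaped like the example above. `is_refusal` and `extract_text` are hypothetical helper names, and the `text` content block shape is an assumption based on the usual Messages API convention.

```python
import json


def is_refusal(response: dict) -> bool:
    """True when the model declined, even though the HTTP call succeeded."""
    return response.get("stop_reason") == "refusal"


def extract_text(response: dict) -> str:
    """Join the text blocks of a response; a refusal has none."""
    blocks = response.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


raw = """{
  "type": "message",
  "role": "assistant",
  "stop_reason": "refusal",
  "content": [],
  "usage": {"input_tokens": 412, "output_tokens": 0}
}"""
refused = json.loads(raw)

if is_refusal(refused):
    # Normal control flow: route to a fallback or show a friendly message.
    print("refused, output tokens:", refused["usage"]["output_tokens"])
else:
    print(extract_text(refused))
```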

A refused request can usually be served by another model. Pass the fallbacks parameter and the API retries for you, or use the SDK middleware to retry from the client. Fallback credit refunds the prompt cache cost so you do not pay it twice.

{
  "model": "claude-fable-5",
  "fallbacks": ["claude-opus-4-8"],
  "messages": [ ]
}
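If you would rather retry from the client, the logic is a short loop. This is a sketch of the idea and not the SDK middleware mentioned above: `send` is a hypothetical callable you supply (for example a thin wrapper around your HTTP client) that takes a model id and a message list and returns the parsed response dict.

```python
from typing import Callable

Send = Callable[[str, list], dict]


def create_with_fallbacks(send: Send, models: list[str], messages: list) -> tuple[str, dict]:
    """Try each model in order and return (model, response) for the first non-refusal.

    If every model refuses, the last refusal is returned so the caller can
    surface it to the user.
    """
    if not models:
        raise ValueError("models must contain at least one model id")
    for model in models:
        response = send(model, messages)
        if response.get("stop_reason") != "refusal":
            break
    return model, response


def fake_send(model: str, messages: list) -> dict:
    """Stand-in transport for the demo: the first model refuses, the second answers."""
    if model == "claude-fable-5":
        return {"stop_reason": "refusal", "content": []}
    return {"stop_reason": "end_turn", "content": [{"type": "text", "text": "ok"}]}


served_by, reply = create_with_fallbacks(
    fake_send,
    ["claude-fable-5", "claude-opus-4-8"],
    [{"role": "user", "content": "hello"}],
)
print(served_by, reply["stop_reason"])
```

Anything `send` raises (timeouts, rate limits) propagates unchanged, which keeps network failures on a separate path from refusals.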

Three more behaviors that are specific to Fable 5 and Mythos 5:

Adaptive thinking is always on. It is the only thinking mode. You cannot disable thinking. Use the effort parameter to control depth and spend.
Raw thinking is never returned. Thinking content is omitted by default. Set thinking display to summarized to get readable summaries, and pass thinking blocks back unchanged in multiturn conversations on the same model.
30 day data retention. Both models are Covered Models, so zero data retention is not available. The data is used for safety, not training.
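The point about passing thinking blocks back unchanged is easy to get wrong in chat loops that rebuild the assistant turn from extracted text. Here is a minimal sketch of carrying the turn forward untouched, assuming content blocks are plain dicts and that thinking blocks are marked with `type` set to `thinking`. The field names inside the thinking block are placeholders, and `append_assistant_turn` is a hypothetical helper.

```python
import copy


def append_assistant_turn(history: list, response: dict) -> list:
    """Return a new history with the assistant's content blocks appended verbatim.

    Every block is kept as received, thinking blocks included, instead of
    rebuilding the turn from the visible text alone.
    """
    turn = {"role": "assistant", "content": copy.deepcopy(response["content"])}
    return history + [turn]


# Placeholder response; the fields inside the thinking block are assumptions.
response = {
    "stop_reason": "end_turn",
    "content": [
        {"type": "thinking", "thinking": "(summarized reasoning)"},
        {"type": "text", "text": "Here is the plan."},
    ],
}

history = [{"role": "user", "content": "Plan the migration."}]
history = append_assistant_turn(history, response)
history.append({"role": "user", "content": "Now do step one."})
```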

Pricing and availability

Pricing is $10 per million input tokens and $50 per million output tokens for both models. On the Claude API and consumption based Enterprise plans, Fable 5 is fully available now. On subscription plans the rollout is staged: included on Pro, Max, Team, and seat based Enterprise for a short window, then drawing on usage credits until capacity allows it back into the standard plans. Mythos 5 stays restricted to Glasswing partners and, soon, a small set of biology researchers.

Three things to check before you ship on Fable 5

Handle stop_reason: "refusal" as a normal response, not an exception.
Decide your fallback model now, either with the fallbacks param or SDK middleware.
Drop the thinking: {"type": "disabled"} path. It is not supported here.
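The third check can be enforced at the request boundary. This is a rough sketch, assuming requests are built as dicts and that the `thinking` field takes the `{"type": "disabled"}` shape quoted above. `ADAPTIVE_ONLY` and `strip_disabled_thinking` are hypothetical names.

```python
# Models that, per the notes above, only support adaptive thinking.
ADAPTIVE_ONLY = {"claude-fable-5", "claude-mythos-5"}


def strip_disabled_thinking(request: dict) -> dict:
    """Return a copy of the request without thinking={"type": "disabled"}
    when the target model only supports adaptive thinking."""
    cleaned = dict(request)
    thinking = cleaned.get("thinking")
    if (
        cleaned.get("model") in ADAPTIVE_ONLY
        and isinstance(thinking, dict)
        and thinking.get("type") == "disabled"
    ):
        del cleaned["thinking"]
    return cleaned


legacy = {
    "model": "claude-fable-5",
    "thinking": {"type": "disabled"},
    "messages": [{"role": "user", "content": "hi"}],
}
print(strip_disabled_thinking(legacy))
```

Whether to drop the field silently or raise instead is a design choice. Raising is probably safer during a migration, since it points you at the call sites that still assume thinking can be turned off.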

If you want a deeper look at Claude Fable 5, I cover the features, pricing, and fallback behavior in more detail on my site.

If you want a model like this wired into a real product end to end, that is exactly the kind of work I take on.

Drop a comment if your fallback strategy looks different. Curious what people are routing to Opus versus keeping on Fable.

Top comments (4)

hao yang • Jun 10

Fallbacks are where the real DX lives. Nice table — saving this for my own stack picks.

Mudassir Khan • Jun 11

the trigger logic is what bites in practice. most stacks hardcode 429 as the fallback signal and miss the softer ones: the classifier returning stop_reason: 'refusal' comes back as a clean 200, not an exception. if your error handler only watches for HTTP failures, refusals never get routed to the fallback model and you won't know it. worth wiring the refusal check separately from network error handling. the two failure modes need different logic. what's your current primary model today, or still evaluating?

Superdirector • Jun 11

This is interesting, thank you.

Mudassir Khan • Jun 18

the trigger logic section is where teams usually get caught, so glad that one landed. most hardcode 429 as the only fallback signal and miss the softer ones like a classifier stop_reason or p99 latency spikes. we added a timeout budget per hop and that caught two classes of degraded responses the status code alone never would. what stack are you building on?