Muhammad Awais

Posted on Jun 10 • Originally published at webtoolshub.online

Claude Fable 5 Just Dropped: API Model String, Pricing, Benchmarks & When to Use It

#ai #claude #webdev #programming

Yesterday, June 9, 2026, Anthropic did something it has never done before: it took a model from its locked-down "Mythos" tier and made it publicly available.

The model is called Claude Fable 5. The API string is claude-fable-5. And if you build anything with Claude, the benchmarks are worth your attention.

Andrej Karpathy called it "a major-version-bump-deserving step change forward." Let's look at what that actually means for your stack.

TL;DR

Model ID      : claude-fable-5
Context window: 1M tokens
Max output    : 128k tokens
Input pricing : $10 / million tokens
Output pricing: $50 / million tokens
Batch pricing : $5 / $25 per million tokens
Free on plans : Pro / Max / Team / Enterprise through June 22 only
Data retention: 30 days mandatory (no zero-retention exceptions)

What Is Mythos? Why Was It Locked?

Quick backstory. In April 2026, Anthropic quietly released Claude Mythos Preview through a program called Project Glasswing — restricted to vetted partners like AWS, Microsoft, Apple, and CrowdStrike.

Why locked? Because Mythos is genuinely capable at finding and chaining software vulnerabilities. Too powerful to ship without a safety layer first.

Fable 5 is the safety-wrapped public version of that same model.

Same underlying architecture as Mythos 5. The difference: safety classifiers that detect high-risk queries (cybersecurity exploits, biology, chemistry) and fall back to Claude Opus 4.8 for those. For everything else — coding, analysis, agents, documents — you get full Mythos-class performance.

Fallback fires in under 5% of sessions. For most production apps, it's invisible.

Start Using It in 30 Seconds

Drop-in replacement. Change one string in your existing code:

// Before
const model = "claude-opus-4-8";

// After
const model = "claude-fable-5";

Full example:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const response = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 8096,
  messages: [
    {
      role: "user",
      content: "Refactor this function to handle edge cases...",
    },
  ],
});

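// The response is a list of content blocks; print only the text ones.
for (const block of response.content) {
  if (block.type === "text") console.log(block.text);
}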

On Amazon Bedrock: anthropic.claude-fable-5 or us.anthropic.claude-fable-5 (regional).

The messages API format is identical to previous Claude models, so no code changes are needed beyond the model string.

Benchmark Numbers That Matter for Developers

I'm skipping the full table and giving you only the rows that affect real production decisions.

Agentic Coding

Benchmark	Fable 5	Opus 4.8	GPT-5.5	Gemini 3.1 Pro
SWE-Bench Pro	80.3%	69.2%	58.6%	54.2%
FrontierCode Diamond	29.3%	13.4%	5.7%	—
Terminal-Bench 2.1	88.0%	82.7%	83.4%	—

The SWE-Bench Pro gap (80.3% vs 69.2%) is the most important number here. SWE-Bench Pro tests real GitHub issues end-to-end, not toy problems. An 11-point gap at this performance level is a different tier, not a marginal win.

FrontierCode Diamond is Cognition's benchmark for brutal production-coding tasks. Fable 5 scores more than double Opus 4.8 and 5x GPT-5.5.

Terminal-Bench 2.1 is the exception: the gap is much narrower. GPT-5.5 runs it through Codex CLI and scores 83.4%, ahead of Opus 4.8 (82.7%) and within five points of Fable 5 (88.0%). For terminal-driven agentic workflows specifically, GPT-5.5 + Codex is still worth benchmarking.

Knowledge Work

Benchmark	Fable 5	Opus 4.8	GPT-5.5
GDPval-AA (white-collar tasks)	1932	1890	1769
Tool Use delta	+17.4pts over Opus	baseline	—

The tool use number matters for anyone building with MCP servers or multi-step agents. More on that below.

⚠️ The Asterisk Warning

Some numbers in Anthropic's official table are marked with an asterisk (*). Those belong to Mythos 5, not Fable 5. On cybersecurity, biology, and parts of reasoning, Fable 5 falls back to Opus 4.8 because of its safety classifiers. The starred numbers are the unconstrained model you can't access.

For normal developer work, this never matters. For security tooling or biotech apps — it does.

The June 22 Billing Cliff

This is time-sensitive. Read it.

Period	Pro / Max / Team / Enterprise
Now → June 22, 2026	Included free (counts 2x toward usage limits)
June 23 onwards	Usage credits required until capacity restores
API (always)	Pay-per-token at $10/$50 pricing

What to do before June 22: Run your actual production prompts through Fable 5 on claude.ai. Compare output quality to Opus 4.8 on your specific use cases. This is free benchmarking on real workloads that you can't get after the window closes.

If the quality improvement isn't visible on your tasks — stick with Opus 4.8. It's half the price.

Fable 5 vs Opus 4.8 — The Honest Routing Framework

The benchmark gap is real, but it doesn't mean you should route everything to Fable 5. Here's how I'm thinking about it:

Use Sonnet 4.6 for:

High-volume, cost-sensitive calls (chat, summarization, content)
Tasks where you can evaluate quality yourself in seconds
Anything where latency matters more than maximum capability

Stay on Opus 4.8 for:

Moderate coding tasks where Sonnet falls short
Security-adjacent or bio queries (Fable 5 falls back to Opus here anyway)
Any workload with zero-retention compliance requirements
Everything where $5/$25 is the right price point

Reach for Fable 5 for:

Multi-file repository refactors or long agentic coding sessions
Tasks you'd otherwise delegate to a senior engineer for 2+ hours
Large document analysis using the full 1M context + 128k output
Production AI agents where per-step accuracy compounds across 20-30 steps

That last point is counterintuitive. A 10-point accuracy gain per step doesn't make your agent 10% better. Per-step errors compound across a chain, so end-to-end success scales roughly with per-step accuracy raised to the number of steps, and a modest per-step gain makes a 20-30 step workflow dramatically more reliable. The token premium can net out cheaper when it prevents 3-step recovery sequences.
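To make the compounding concrete, here is a minimal sketch of the arithmetic. It assumes every step succeeds independently with the same probability, which real agents only approximate (they can retry and recover). The 85% and 95% per-step rates are hypothetical values picked for illustration, not measured figures for any model, and `chainSuccessRate` is a name made up for this example.

```typescript
// Probability that all `steps` sequential steps succeed, assuming each step
// succeeds independently with probability `perStep` (a simplifying assumption).
function chainSuccessRate(perStep: number, steps: number): number {
  if (perStep < 0 || perStep > 1) {
    throw new RangeError("perStep must be between 0 and 1");
  }
  if (!Number.isInteger(steps) || steps < 0) {
    throw new RangeError("steps must be a non-negative integer");
  }
  return Math.pow(perStep, steps);
}

// Hypothetical per-step success rates, chosen only to illustrate the effect.
const baselineRun = chainSuccessRate(0.85, 25);
const improvedRun = chainSuccessRate(0.95, 25);

console.log(`85% per step, 25 steps: ${(baselineRun * 100).toFixed(1)}% end-to-end`);
console.log(`95% per step, 25 steps: ${(improvedRun * 100).toFixed(1)}% end-to-end`);
```

Under these assumptions, a 10-point per-step gain moves a 25-step run from roughly 2% to roughly 28% end-to-end success. The exact numbers matter less than the shape: the gain is multiplicative, not additive.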

Cost Math — Before You Commit

Fable 5 is double Opus 4.8's price:

Fable 5:  $10 input / $50 output per million tokens
Opus 4.8: $5  input / $25 output per million tokens

Three ways to control cost:

1. Prompt caching (90% input discount)
For apps with large repeated system prompts — detailed agent instructions, RAG context blocks — cache them. Cached input tokens cost $1/M instead of $10/M. This changes the economics significantly for prompt-heavy apps.

// Enable prompt caching on large system prompts
const response = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 4096,
  system: [
    {
      type: "text",
      text: yourLargeSystemPrompt,
      cache_control: { type: "ephemeral" }, // ← 90% discount on cache hits
    },
  ],
  messages: [{ role: "user", content: userMessage }],
});

2. Batch API for async workloads ($5/$25 pricing)
Document processing, offline analysis, bulk code review — anything that doesn't need a synchronous response. Half the price, same model.

3. Route intelligently
The teams that run Fable 5 cost-effectively don't use it as a default drop-in. They identify the specific task types where quality gains are visible and route only those. Everything else stays on Opus 4.8 or Sonnet 4.6.
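A routing layer along these lines could be sketched as a small pure function. Everything here is illustrative: the `TaskProfile` fields and `pickModel` are hypothetical names, the rules simply restate the framework earlier in this post, and the Sonnet model string is an assumed placeholder (only the Fable 5 and Opus 4.8 strings appear in this article).

```typescript
const SONNET_ID = "claude-sonnet-4-6"; // assumption: verify the real model ID before use
const OPUS_ID = "claude-opus-4-8";
const FABLE_ID = "claude-fable-5";

type TaskKind =
  | "chat"
  | "summarization"
  | "content"
  | "coding"
  | "agent"
  | "document-analysis";

// Hypothetical description of a request, filled in by your own application.
interface TaskProfile {
  kind: TaskKind;
  latencySensitive: boolean;   // latency matters more than peak capability
  needsZeroRetention: boolean; // compliance rules out 30-day retention
  restrictedDomain: boolean;   // security-adjacent or bio queries
  longHorizon: boolean;        // multi-file refactor, 20-30 step agent, 1M-context analysis
}

function pickModel(task: TaskProfile): string {
  // High-volume, cost-sensitive, or latency-bound calls go to the cheapest tier.
  const highVolume =
    task.kind === "chat" || task.kind === "summarization" || task.kind === "content";
  if (highVolume || task.latencySensitive) return SONNET_ID;

  // Fable 5 is ruled out by zero-retention requirements, and restricted
  // domains fall back to Opus 4.8 anyway, so paying the premium is pointless.
  const fableAllowed = !task.needsZeroRetention && !task.restrictedDomain;
  if (task.longHorizon && fableAllowed) return FABLE_ID;

  // Everything else stays on the mid-priced default.
  return OPUS_ID;
}
```

Keeping the decision in one function makes it easy to log which model each request used and to adjust the rules once you have quality data from your own workloads.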

Use the LLM API Cost Calculator on WebToolsHub to plug in your monthly token estimates and see the actual dollar delta before committing.
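For a quick back-of-the-envelope check, the prices quoted in this post can be plugged into a few lines of code. This is a rough sketch under stated assumptions: `MonthlyUsage`, `fableMonthlyCost`, and `opusListCost` are hypothetical names, the cache-hit share is something you would have to measure yourself, batch and cache discounts are treated as mutually exclusive, and any cache-write surcharge is ignored because the post does not quote one.

```typescript
// USD per million tokens, taken from the prices quoted in this post.
const FABLE_RATES = { input: 10, output: 50, cachedInput: 1, batchInput: 5, batchOutput: 25 };
const OPUS_LIST_RATES = { input: 5, output: 25 };

interface MonthlyUsage {
  inputTokens: number;    // total input tokens per month
  outputTokens: number;   // total output tokens per month
  cacheHitShare?: number; // hypothetical: fraction of input tokens read from cache (0 to 1)
  useBatch?: boolean;     // hypothetical: run the whole workload through the Batch API
}

const perMillion = (tokens: number, rate: number): number => (tokens / 1e6) * rate;

function fableMonthlyCost(u: MonthlyUsage): number {
  if (u.useBatch) {
    // Assumption: batch and cache discounts are not stacked in this sketch.
    return (
      perMillion(u.inputTokens, FABLE_RATES.batchInput) +
      perMillion(u.outputTokens, FABLE_RATES.batchOutput)
    );
  }
  // Clamp the cache-hit share into [0, 1] so bad input cannot produce negative costs.
  const hitShare = Math.min(Math.max(u.cacheHitShare ?? 0, 0), 1);
  const cachedTokens = u.inputTokens * hitShare;
  const freshTokens = u.inputTokens - cachedTokens;
  return (
    perMillion(freshTokens, FABLE_RATES.input) +
    perMillion(cachedTokens, FABLE_RATES.cachedInput) +
    perMillion(u.outputTokens, FABLE_RATES.output)
  );
}

function opusListCost(u: MonthlyUsage): number {
  return (
    perMillion(u.inputTokens, OPUS_LIST_RATES.input) +
    perMillion(u.outputTokens, OPUS_LIST_RATES.output)
  );
}

// Hypothetical workload: 200M input tokens and 40M output tokens per month.
const exampleUsage: MonthlyUsage = { inputTokens: 200e6, outputTokens: 40e6 };
console.log("Fable 5 list price:", fableMonthlyCost(exampleUsage).toFixed(2));
console.log("Fable 5, 70% cache hits:", fableMonthlyCost({ ...exampleUsage, cacheHitShare: 0.7 }).toFixed(2));
console.log("Fable 5 batch:", fableMonthlyCost({ ...exampleUsage, useBatch: true }).toFixed(2));
console.log("Opus 4.8 list price:", opusListCost(exampleUsage).toFixed(2));
```

For that hypothetical workload the sketch gives about $4,000 per month on Fable 5 at list price, $2,740 with 70% of input served from cache, and $2,000 via batch, against $2,000 for Opus 4.8 at list price. Output tokens dominate quickly, so caching helps most on prompt-heavy, short-answer workloads.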

What This Means for Claude Code + MCP Users

If you're using Claude Code as your primary agentic coding environment, Fable 5 is a direct upgrade on the SWE-Bench tasks Claude Code handles — multi-file edits, test generation, repository-level refactors.

During the free window (now through June 22), Claude Code on eligible plans will automatically route to Fable 5. From June 23 onward, check the model routing settings in your Claude Code configuration.

For MCP (Model Context Protocol) workflows, the +17.4 point tool use improvement matters. Multi-server agentic pipelines where the model has to choose between tools, chain results, and recover from partial failures will be noticeably more reliable with Fable 5 at the core.

Haven't set up MCP yet? The MCP guide for developers covers the full architecture.

Data Retention — The One Breaking Change

If your enterprise contract includes zero-retention terms, Fable 5 breaks that assumption.

Fable 5 and Mythos 5 carry mandatory 30-day data retention for safety monitoring purposes. Anthropic's own announcement confirms this applies even to organizations with existing zero-retention agreements.

If you handle PII, financial data, or other compliance-sensitive content:

Either keep that data on Opus 4.8 (which can still be zero-retention)
Or contact Anthropic's enterprise team before deploying Fable 5 in production

For most developer tools, side projects, and B2B SaaS without strict data compliance requirements, this doesn't matter. For anything touching regulated industries — it does.

Quick Reference Card

┌──────────────────────────────────────────────────────────┐
│ Claude Fable 5: Quick Reference (June 2026)              │
├──────────────────────────────────────────────────────────┤
│ API string    claude-fable-5                             │
│ Context       1M tokens                                  │
│ Max output    128k tokens                                │
│ Pricing       $10 input / $50 output per 1M tokens       │
│ Batch         $5 / $25 per 1M tokens                     │
│ Cache hit     $1 input per 1M tokens (90% off)           │
│ Free window   Through June 22 on Pro/Max/Team/Enterprise │
│ Retention     30 days mandatory                          │
│ SWE-Bench Pro 80.3% (vs 69.2% Opus 4.8)                  │
│ Fallback      Opus 4.8 on restricted queries             │
└──────────────────────────────────────────────────────────┘

Bottom Line

This is a genuine generational jump, not an incremental update. The SWE-Bench Pro number (80.3%) and the FrontierCode Diamond result (29.3% vs Opus 4.8's 13.4%) represent a meaningful capability step — the kind of gap that shows up in real multi-step agent tasks.

That said: don't route everything to it. Opus 4.8 at half the price is still the right call for 80% of what most teams do. The win condition with Fable 5 is knowing exactly which tasks justify the premium — and running those through it ruthlessly.

The free window through June 22 is your benchmark opportunity. Use it.

Full deployment guide with Next.js integration, agent workflow patterns, and cost analysis on WebToolsHub.
