Anthropic shipped Claude Sonnet 5, and the takes I saw were predictable:
"It replaces Opus."
"It is just another Sonnet refresh."
"The benchmark chart means you can route everything to it now."
Two of those are wrong. One is directionally right, but only if you care about cost per task instead of model prestige.
I spent time going through Anthropic's launch post, the Claude Platform docs, GitHub's Copilot rollout note, and the pricing math. The conclusion I landed on is simple: Sonnet 5 should be the default Claude model for most coding agents, but it should not be your highest-stakes escalation model.
TL;DR
- No, Sonnet 5 does not universally replace Opus 4.8. Anthropic says it can match Opus 4.8 at higher effort on some agentic tasks, not all tasks.
- Yes, the discount is real. Intro pricing is $2 input / $10 output per million tokens through August 31. Opus 4.8 is $5/$25.
- The real number is 60%. During the intro period, Sonnet 5 costs 40% of Opus 4.8, meaning a 60% discount on both input and output.
- After August 31, the math changes but still works. Sonnet 5 moves to $3/$15, still 40% cheaper than Opus 4.8.
- My routing rule: use Sonnet 5 for the first pass, Opus 4.8 for escalation, and Fable 5 only when the task justifies frontier-tier cost.
What actually shipped
Anthropic launched Claude Sonnet 5 on June 30, 2026.
The important part is not just the model. It is the availability.
Sonnet 5 is available across Claude Free, Pro, Max, Team, Enterprise, Claude Code, Claude Cowork, and the Claude Platform API, according to Anthropic's launch post. GitHub also made Sonnet 5 generally available in Copilot on June 30, which means this model landed directly inside developer workflows, not just API dashboards.
That matters because the frontier tier is noisy right now:
| Model / product | Current reality |
|---|---|
| Claude Fable 5 | Back online, but expensive and policy-sensitive |
| Claude Mythos 5 | Narrower access |
| GPT-5.6 | Gated preview, not broadly available |
| Gemini 3.5 Pro | Reported July target, not public API yet |
| Claude Sonnet 5 | Broadly available now |
This is why I care about Sonnet 5 more than the louder frontier-model drama.
It is the model developers can actually use this week.
The pricing table that changed my mind
The pricing is the story.
| Model | Input / 1M | Output / 1M | What it means |
|---|---|---|---|
| Claude Sonnet 5 intro | $2.00 | $10.00 | Through August 31, 2026 |
| Claude Sonnet 5 standard | $3.00 | $15.00 | After August 31 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Same as post-intro Sonnet 5 |
| Claude Opus 4.8 | $5.00 | $25.00 | Higher-end stable route |
| Claude Fable 5 | $10.00 | $50.00 | Frontier-priced route |
During the intro window, Sonnet 5 is not a small discount.
It is 60% cheaper than Opus 4.8.
After August 31, it is still 40% cheaper.
That is enough to change your default route even if you keep Opus for final review.
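Both percentages follow directly from the table. Here is a minimal sketch of the arithmetic, using the prices listed above. The dictionary keys and the function name are illustrative labels of mine, not API identifiers.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "sonnet-5-intro": (2.00, 10.00),
    "sonnet-5-standard": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
}


def discount_vs(model: str, baseline: str = "opus-4.8") -> tuple[float, float]:
    """Return the (input, output) discount of `model` relative to `baseline`, as fractions."""
    m_in, m_out = PRICES[model]
    b_in, b_out = PRICES[baseline]
    return (1 - m_in / b_in, 1 - m_out / b_out)


for name in ("sonnet-5-intro", "sonnet-5-standard"):
    d_in, d_out = discount_vs(name)
    print(f"{name}: {d_in:.0%} cheaper on input, {d_out:.0%} cheaper on output")
```

Because input and output are discounted by the same ratio, the percentage holds for any input/output mix.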
The $300/month example
Take a modest agent workload:
- 50M input tokens per month
- 10M output tokens per month
The bill:
```
Sonnet 5 intro    = 50 * $2 + 10 * $10 = $200
Sonnet 5 standard = 50 * $3 + 10 * $15 = $300
Opus 4.8          = 50 * $5 + 10 * $25 = $500
```
That means:
| Route | Monthly cost | Savings vs Opus |
|---|---|---|
| Sonnet 5 intro | $200 | $300 |
| Sonnet 5 standard | $300 | $200 |
| Opus 4.8 | $500 | $0 |
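To plug in your own volumes, the same calculation can be sketched as a small function. This assumes flat per-token pricing with no caching or batch discounts, and the names are mine.

```python
def monthly_cost(input_mtok: float, output_mtok: float,
                 input_price: float, output_price: float) -> float:
    """Monthly cost in USD. Volumes are in millions of tokens, prices per 1M tokens."""
    return input_mtok * input_price + output_mtok * output_price


# Workload from the example: 50M input and 10M output tokens per month.
routes = {
    "Sonnet 5 intro": (2, 10),
    "Sonnet 5 standard": (3, 15),
    "Opus 4.8": (5, 25),
}
opus = monthly_cost(50, 10, *routes["Opus 4.8"])
for name, (p_in, p_out) in routes.items():
    cost = monthly_cost(50, 10, p_in, p_out)
    print(f"{name}: ${cost:,.0f}/month, saves ${opus - cost:,.0f} vs Opus")
```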
If your team is running agents against repos every day, this is not theoretical.
It is the difference between routing every routine fix to Opus because "it is safer" and using Opus only when the first pass needs escalation.
The output-token trap
Most agent costs hide in output.
A coding agent does not just answer one question. It plans, edits, explains, retries, opens diffs, writes tests, and summarizes.
Suppose each run emits 12K output tokens and you run 5,000 agent tasks per month.
That is:
```
12,000 output tokens * 5,000 runs = 60,000,000 output tokens
```
Output-only cost:
```
Sonnet 5 intro = 60 * $10 = $600
Opus 4.8       = 60 * $25 = $1,500
```
That is a $900/month difference before counting input tokens.
I would rather spend that $900 on extra evals, better logging, or escalation for the tasks that actually need Opus.
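If your runs are chattier or quieter than 12K tokens, the output-only figure is easy to recompute. Here is a rough sketch, with hypothetical function and variable names, that reproduces the numbers above:

```python
def output_cost(tokens_per_run: int, runs_per_month: int, price_per_mtok: float) -> float:
    """Monthly output-only cost in USD, given a price per 1M output tokens."""
    total_tokens = tokens_per_run * runs_per_month
    return total_tokens / 1_000_000 * price_per_mtok


sonnet_intro = output_cost(12_000, 5_000, 10)  # Sonnet 5 intro output price
opus = output_cost(12_000, 5_000, 25)          # Opus 4.8 output price
print(f"Sonnet 5 intro: ${sonnet_intro:,.0f}")
print(f"Opus 4.8: ${opus:,.0f}")
print(f"Difference: ${opus - sonnet_intro:,.0f}")
```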
The benchmark caveat people will skip
Anthropic says Sonnet 5 improves over Sonnet 4.6 and can match Opus 4.8 at higher effort on some agentic tasks.
That sentence has two important words: some tasks.
Anthropic also edited one launch chart after a methodology issue around BrowseComp. I do not read that as a scandal. I read it as a warning: do not build your routing policy from one vendor chart.
My benchmark policy for Sonnet 5 would be:
| Test set | Size | Pass condition |
|---|---|---|
| Bug fixes | 50 tasks | Same or better accepted patch rate |
| Repo Q&A | 50 tasks | Same or better factual accuracy |
| Code review | 50 tasks | Same or better defect catch rate |
| Refactors | 25 tasks | No higher regression rate |
| Long-context tasks | 25 tasks | No worse truncation or drift |
I do not need Sonnet 5 to beat Opus on every task.
I need it to be good enough for the first pass and cheap enough to run more often.
That is a very different requirement.
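One way to turn that table into a go/no-go gate is sketched below. The class, field names, and tolerance parameter are assumptions of mine, and the sample numbers are placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    test_set: str
    candidate: float               # metric for Sonnet 5
    baseline: float                # same metric for the current route
    higher_is_better: bool = True  # False for regression or drift rates


def passes(result: EvalResult, tolerance: float = 0.0) -> bool:
    """Candidate passes if it is no worse than the baseline, within `tolerance`."""
    if result.higher_is_better:
        return result.candidate >= result.baseline - tolerance
    return result.candidate <= result.baseline + tolerance


def ready_to_migrate(results: list[EvalResult]) -> bool:
    """Migrate only if every test set meets its pass condition."""
    return all(passes(r) for r in results)


# Placeholder numbers for illustration only; these are not measured results.
results = [
    EvalResult("bug_fixes", candidate=0.70, baseline=0.70),
    EvalResult("refactors", candidate=0.06, baseline=0.04, higher_is_better=False),
]
print(ready_to_migrate(results))
```

With these placeholder values the gate prints `False`, because the refactor regression rate is higher than the baseline even though the bug-fix rate matches.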
The "should I migrate?" decision tree
Here is the router I would start with.
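```python
def has_approved_fable_access() -> bool:
    # Placeholder: the check is left undefined here. Wire it to your own access policy.
    return False


def pick_claude_model(task: str) -> str:
    """Map a task label to a Claude route, defaulting to the cheaper model."""
    # Routine work goes to Sonnet 5 first.
    if task in (
        "repo_search",
        "unit_test_fix",
        "routine_refactor",
        "doc_summary",
        "first_pass_pr_review",
    ):
        return "claude-sonnet-5"
    # High-stakes work goes straight to Opus 4.8.
    if task in (
        "security_review",
        "legal_reasoning",
        "architecture_decision",
        "final_pr_review",
    ):
        return "claude-opus-4.8"
    # The frontier route requires explicit approval.
    if task == "frontier_research" and has_approved_fable_access():
        return "claude-fable-5"
    # Anything unrecognized falls back to the cheap, capable default.
    return "claude-sonnet-5"
```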
That default is opinionated on purpose.
I do not want a router that starts expensive and occasionally tries cheaper models.
I want a router that starts with the cheap capable model, then escalates only when the task earns it.
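The "start cheap, then escalate" half of that idea can be sketched as a thin wrapper. Everything here is hypothetical scaffolding: `run_model` and `needs_escalation` are callables you would supply, and the model strings are the routing labels used in this post, which may not match the exact API model IDs.

```python
from typing import Callable


def run_with_escalation(
    task: str,
    run_model: Callable[[str, str], dict],
    needs_escalation: Callable[[dict], bool],
    first_pass: str = "claude-sonnet-5",
    escalation: str = "claude-opus-4.8",
) -> dict:
    """Run the cheap route first; rerun on the expensive route only if the check fails."""
    result = run_model(first_pass, task)
    if needs_escalation(result):
        result = run_model(escalation, task)
    return result


def fake_run(model: str, task: str) -> dict:
    # Stand-in for a real API call; pretends the first-pass patch failed its tests.
    return {"model": model, "tests_passed": model != "claude-sonnet-5"}


result = run_with_escalation("unit_test_fix", fake_run, lambda r: not r["tests_passed"])
print(result["model"])
```

In practice the escalation check would be something measurable, such as failing tests or a rejected patch, so that Opus spend tracks real first-pass failures.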
Where I would not use Sonnet 5
Sonnet 5 is not the answer to everything.
| Workload | I would use instead | Why |
|---|---|---|
| Cheap summarization | Haiku or smaller route | Sonnet is overkill |
| Massive batch extraction | Batch + cheaper model | Price still compounds |
| Final high-stakes review | Opus 4.8 | Better escalation baseline |
| Approved frontier cyber work | Fable/Mythos route | Different capability tier |
| Open-weight local coding | GLM or Kimi route | Cost/control may win |
| Unverified benchmark chasing | Wait | Vendor charts are not enough |
This is the trap with every new model release.
People ask, "Is it better?"
The production question is, "Where is it good enough to become cheaper by default?"
For Sonnet 5, that answer is most routine agent work.
What I'd do if I were running a dev team this week
If I owned the model routing layer, I would do five things.
- Move routine Claude agent traffic from Sonnet 4.6 to Sonnet 5.
- Move first-pass Opus traffic to Sonnet 5 where evals pass.
- Keep Opus 4.8 as the escalation route for final review and high-stakes reasoning.
- Track accepted patch rate, retry rate, output tokens, and human review minutes.
- Re-run the cost model before August 31, because the intro price expires.
That last one matters.
The intro price makes migration look extremely obvious. The standard price still looks good, but the savings shrink.
| Date | Input / 1M | Output / 1M | Routing implication |
|---|---|---|---|
| Now through Aug. 31 | $2 | $10 | Aggressively test migration |
| After Aug. 31 | $3 | $15 | Still default, but re-check margins |
Do not let a temporary discount become an unmeasured permanent assumption.
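A small guard against that is to make the price a function of the date in your cost model. Here is a sketch, assuming the August 31, 2026 cutoff from the pricing table and a hypothetical function name:

```python
from datetime import date

INTRO_END = date(2026, 8, 31)  # last day of intro pricing, per the pricing table


def sonnet5_prices(on: date) -> tuple[float, float]:
    """Return (input, output) USD per 1M tokens for Sonnet 5 on a given date."""
    return (2.00, 10.00) if on <= INTRO_END else (3.00, 15.00)


# Same 50M input / 10M output workload, before and after the cutoff.
for day in (date(2026, 8, 31), date(2026, 9, 1)):
    p_in, p_out = sonnet5_prices(day)
    print(day.isoformat(), f"${50 * p_in + 10 * p_out:,.0f}/month")
```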
The bigger picture
Sonnet 5 is part of a pattern I think more teams should notice.
The most important model in production is often not the strongest model. It is the model with the best mix of availability, cost, latency, and enough intelligence for the common path.
That is why Sonnet 5 matters.
Fable 5 is more dramatic. GPT-5.6 is more mysterious. Gemini 3.5 Pro will probably get the launch-week attention when it lands.
But Sonnet 5 is the boring model that can lower a lot of real bills.
And boring models that lower bills tend to win production traffic.
Disclosure
If you want to swap between Claude, OpenAI, Gemini, DeepSeek, Qwen, GLM, and other models through one OpenAI-compatible endpoint, that is roughly what TokenMix does. I work on the research side. The full cited breakdown is in the original article.
Bottom line
Claude Sonnet 5 should be your default Claude agent route, not your prestige model and not your only model.
Use it for first-pass coding, routine refactors, first-pass PR review, repo Q&A, and routine tool use. Keep Opus 4.8 for escalation and final review. Keep Fable 5 for the narrow slice that justifies frontier-tier cost.
The model release is good. The routing discipline is what saves the money.
Would you route routine coding agents to Sonnet 5 by default, or keep paying for Opus until independent evals catch up?