tokenmixai

Posted on Jul 2 • Originally published at tokenmix.ai

I Did the Math on Claude Sonnet 5. The 60% Opus Discount Is Real, But Temporary.

#ai #claude #anthropic #programming

Anthropic shipped Claude Sonnet 5, and the takes I saw were predictable:

"It replaces Opus."

"It is just another Sonnet refresh."

"The benchmark chart means you can route everything to it now."

Two of those are wrong. One is directionally right, but only if you care about cost per task instead of model prestige.

I went through Anthropic's launch post, the Claude Platform docs, GitHub's Copilot rollout note, and the pricing math. My conclusion is simple: Sonnet 5 should be the default Claude model for most coding agents, but it should not be your highest-stakes escalation model.

TL;DR

No, Sonnet 5 does not universally replace Opus 4.8. Anthropic says it can match Opus on some higher-effort tasks, not all tasks.
Yes, the discount is real. Intro pricing is $2 input / $10 output per million tokens through August 31. Opus 4.8 is $5/$25.
The real number is 60%. During the intro period, Sonnet 5 costs 40% of Opus 4.8, meaning a 60% discount on both input and output.
After August 31, the math changes but still works. Sonnet 5 moves to $3/$15, still 40% cheaper than Opus 4.8.
My routing rule: use Sonnet 5 for the first pass, Opus 4.8 for escalation, and Fable 5 only when the task justifies frontier-tier cost.

What actually shipped

Anthropic launched Claude Sonnet 5 on June 30, 2026.

The important part is not just the model. It is the availability.

Sonnet 5 is available across Claude Free, Pro, Max, Team, Enterprise, Claude Code, Claude Cowork, and the Claude Platform API, according to Anthropic's launch post. GitHub also made Sonnet 5 generally available in Copilot on June 30, which means this model landed directly inside developer workflows, not just API dashboards.

That matters because the frontier tier is noisy right now:

Model / product	Current reality
Claude Fable 5	Back online, but expensive and policy-sensitive
Claude Mythos 5	Narrower access
GPT-5.6	Gated preview, not broadly available
Gemini 3.5 Pro	Reported July target, not public API yet
Claude Sonnet 5	Broadly available now

This is why I care about Sonnet 5 more than the louder frontier-model drama.

It is the model developers can actually use this week.

The pricing table that changed my mind

The pricing is the story.

Model	Input / 1M	Output / 1M	What it means
Claude Sonnet 5 intro	$2.00	$10.00	Through August 31, 2026
Claude Sonnet 5 standard	$3.00	$15.00	After August 31
Claude Sonnet 4.6	$3.00	$15.00	Same as post-intro Sonnet 5
Claude Opus 4.8	$5.00	$25.00	Higher-end stable route
Claude Fable 5	$10.00	$50.00	Frontier-priced route

During the intro window, the Sonnet 5 discount is not small.

It is 60% cheaper than Opus 4.8.

After August 31, it is still 40% cheaper.

That is enough to change your default route even if you keep Opus for final review.
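To double-check those percentages, here is a minimal sketch that derives them from the table prices. The dictionary keys and the `discount_vs` helper are names I made up for illustration; only the prices come from the table above.

```python
# Per-million-token prices in USD as (input, output), copied from the pricing table.
PRICES = {
    "sonnet-5-intro": (2.00, 10.00),
    "sonnet-5-standard": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
}


def discount_vs(baseline: str, candidate: str) -> tuple[float, float]:
    """Return the (input, output) discount of candidate vs baseline, in percent."""
    base_in, base_out = PRICES[baseline]
    cand_in, cand_out = PRICES[candidate]
    return (
        round((1 - cand_in / base_in) * 100, 1),
        round((1 - cand_out / base_out) * 100, 1),
    )


intro_discount = discount_vs("opus-4.8", "sonnet-5-intro")
standard_discount = discount_vs("opus-4.8", "sonnet-5-standard")
print(intro_discount, standard_discount)
```

Because input and output prices scale by the same ratio, the discount is identical on both sides, so the input/output mix of a workload does not change the percentage saved.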

The $300/month savings example

Take a modest agent workload:

50M input tokens per month
10M output tokens per month

The bill, with token volumes in millions and prices per million tokens:

Sonnet 5 intro = 50 * $2 + 10 * $10 = $200
Sonnet 5 standard = 50 * $3 + 10 * $15 = $300
Opus 4.8 = 50 * $5 + 10 * $25 = $500

That means:

Route	Monthly cost	Savings vs Opus
Sonnet 5 intro	$200	$300
Sonnet 5 standard	$300	$200
Opus 4.8	$500	$0
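If you want to plug in your own volumes, the arithmetic above fits in a few lines. This is a rough sketch: `monthly_cost` and the `routes` dictionary are illustrative names, and it ignores caching, batching, and any other pricing modifiers.

```python
def monthly_cost(input_mtok, output_mtok, input_price, output_price):
    """Monthly bill in USD; token volumes in millions, prices per million tokens."""
    return input_mtok * input_price + output_mtok * output_price


# Workload from the example: 50M input and 10M output tokens per month.
INPUT_MTOK, OUTPUT_MTOK = 50, 10

# (input price, output price) per million tokens, from the pricing table.
routes = {
    "Sonnet 5 intro": (2, 10),
    "Sonnet 5 standard": (3, 15),
    "Opus 4.8": (5, 25),
}

costs = {
    name: monthly_cost(INPUT_MTOK, OUTPUT_MTOK, *prices)
    for name, prices in routes.items()
}
opus = costs["Opus 4.8"]
for name, cost in costs.items():
    print(f"{name}: ${cost} per month, saves ${opus - cost} vs Opus 4.8")
```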

If your team is running agents against repos every day, this is not theoretical.

It is the difference between routing every routine fix to Opus because "it is safer" and using Opus only when the first pass needs escalation.

The output-token trap

Most agent costs hide in output.

A coding agent does not just answer one question. It plans, edits, explains, retries, opens diffs, writes tests, and summarizes.

Suppose each run emits 12K output tokens and you run 5,000 agent tasks per month.

That is:

12,000 output tokens * 5,000 runs = 60,000,000 output tokens

Output-only cost, for 60M output tokens at the per-million price:

Sonnet 5 intro = 60 * $10 = $600
Opus 4.8 = 60 * $25 = $1,500

That is a $900/month difference before counting input tokens.
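The same calculation as a small sketch, in case you want to swap in your own run counts. The function name and parameters are mine; the 12K tokens per run and 5,000 runs are the assumptions from the example above, not measurements.

```python
def output_cost(tokens_per_run, runs_per_month, price_per_mtok):
    """Monthly output-token cost in USD, given a per-million-token price."""
    total_tokens = tokens_per_run * runs_per_month
    return total_tokens * price_per_mtok / 1_000_000


sonnet_intro = output_cost(12_000, 5_000, 10)  # Sonnet 5 intro output price
opus = output_cost(12_000, 5_000, 25)          # Opus 4.8 output price
difference = opus - sonnet_intro
print(sonnet_intro, opus, difference)
```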

I would rather spend that $900 on extra evals, better logging, or escalation for the tasks that actually need Opus.

The benchmark caveat people will skip

Anthropic says Sonnet 5 improves over Sonnet 4.6 and can match Opus 4.8 at higher effort on some agentic tasks.

That sentence has two important words: some tasks.

Anthropic also edited one launch chart after a methodology issue around BrowseComp. I do not read that as a scandal. I read it as a warning: do not build your routing policy from one vendor chart.

My benchmark policy for Sonnet 5 would be:

Test set	Size	Pass condition
Bug fixes	50 tasks	Same or better accepted patch rate
Repo Q&A	50 tasks	Same or better factual accuracy
Code review	50 tasks	Same or better defect catch rate
Refactors	25 tasks	No higher regression rate
Long-context tasks	25 tasks	No worse truncation or drift

I do not need Sonnet 5 to beat Opus on every task.

I need it to be good enough for the first pass and cheap enough to run more often.

That is a very different requirement.
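One way to turn the pass conditions in that table into a gate is sketched below. Everything here is hypothetical scaffolding: the `SuiteResult` type, the suite names, and especially the numbers, which are placeholders and not results I measured.

```python
from dataclasses import dataclass


@dataclass
class SuiteResult:
    name: str
    baseline: float   # metric for the current route, e.g. Opus 4.8
    candidate: float  # metric for the model under test, e.g. Sonnet 5
    higher_is_better: bool = True


def passes(result: SuiteResult) -> bool:
    """Pass when the candidate is at least as good as the baseline."""
    if result.higher_is_better:
        return result.candidate >= result.baseline
    return result.candidate <= result.baseline


def failing_suites(results: list[SuiteResult]) -> list[str]:
    """Names of the suites where the candidate falls short."""
    return [r.name for r in results if not passes(r)]


# Placeholder numbers for illustration only, not measured results.
example = [
    SuiteResult("bug_fixes", baseline=0.70, candidate=0.72),
    # Regression rate: lower is better, so a higher candidate value fails.
    SuiteResult("refactors", baseline=0.08, candidate=0.12, higher_is_better=False),
]
print(failing_suites(example))
```

A suite that fails the gate does not block migration everywhere. It only means that task type stays on the escalation route until the numbers change.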

The "should I migrate?" decision tree

Here is the router I would start with.

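def has_approved_fable_access():
    # Placeholder: replace with your own access or policy check.
    return False


def pick_claude_model(task):
    # Routine work goes to the cheaper model on the first pass.
    if task in [
        "repo_search",
        "unit_test_fix",
        "routine_refactor",
        "doc_summary",
        "first_pass_pr_review",
    ]:
        return "claude-sonnet-5"

    # High-stakes work goes straight to the escalation model.
    if task in [
        "security_review",
        "legal_reasoning",
        "architecture_decision",
        "final_pr_review",
    ]:
        return "claude-opus-4.8"

    # Frontier tier only for the one task type that justifies it, and only with access.
    if task == "frontier_research" and has_approved_fable_access():
        return "claude-fable-5"

    # Unknown task types default to the cheaper model.
    return "claude-sonnet-5"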

That default is opinionated on purpose.

I do not want a router that starts expensive and occasionally tries cheaper models.

I want a router that starts with the cheap capable model, then escalates only when the task earns it.
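The "escalates only when the task earns it" part can be sketched as a wrapper around whatever client you use. This is an assumption-heavy outline: `call_model` and `accept` are hypothetical callables you would supply (an API call and an acceptance check such as "tests pass"), and the model strings are just the labels used in this post.

```python
from typing import Callable


def run_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],  # hypothetical: (model, prompt) -> output
    accept: Callable[[str], bool],          # hypothetical: is the output good enough?
    first_pass: str = "claude-sonnet-5",
    escalation: str = "claude-opus-4.8",
) -> tuple[str, str]:
    """Try the cheaper model first; call the expensive one only on rejection."""
    result = call_model(first_pass, prompt)
    if accept(result):
        return first_pass, result
    return escalation, call_model(escalation, prompt)


# Stub client for demonstration: records which models were called.
calls = []


def fake_call(model: str, prompt: str) -> str:
    calls.append(model)
    return f"{model} answer"


used_model, output = run_with_escalation(
    "fix the failing test",
    fake_call,
    accept=lambda out: out.startswith("claude-opus"),  # forces an escalation
)
print(used_model, calls)
```

The returned model name is worth logging, because the escalation rate is the number that tells you whether the cheap default is actually holding up.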

Where I would not use Sonnet 5

Sonnet 5 is not the answer to everything.

Workload	I would use instead	Why
Cheap summarization	Haiku or smaller route	Sonnet is overkill
Massive batch extraction	Batch + cheaper model	Price still compounds
Final high-stakes review	Opus 4.8	Better escalation baseline
Approved frontier cyber work	Fable/Mythos route	Different capability tier
Open-weight local coding	GLM or Kimi route	Cost/control may win
Unverified benchmark chasing	Wait	Vendor charts are not enough

This is the trap with every new model release.

People ask, "Is it better?"

The production question is, "Where is it good enough to become cheaper by default?"

For Sonnet 5, that answer is most routine agent work.

What I'd do if I were running a dev team this week

If I owned the model routing layer, I would do five things.

Move routine Claude agent traffic from Sonnet 4.6 to Sonnet 5.
Move first-pass Opus traffic to Sonnet 5 where evals pass.
Keep Opus 4.8 as the escalation route for final review and high-stakes reasoning.
Track accepted patch rate, retry rate, output tokens, and human review minutes.
Re-run the cost model before August 31, because the intro price expires.

That last one matters.

The intro price makes migration look obvious. The standard price still looks good, but the savings versus Opus 4.8 shrink from 60% to 40%.

Date	Input / 1M	Output / 1M	Routing implication
Now through Aug. 31	$2	$10	Aggressively test migration
After Aug. 31	$3	$15	Still default, but re-check margins
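To keep the expiry from being forgotten, the cost model can take the date as an input. A minimal sketch, assuming the intro price applies through August 31, 2026 inclusive; the function names are illustrative, and the workload is the 50M/10M example from earlier.

```python
from datetime import date

INTRO_END = date(2026, 8, 31)  # last day of intro pricing, per the pricing table
OPUS_PRICES = (5.00, 25.00)    # (input, output) USD per million tokens


def sonnet5_prices(on: date) -> tuple[float, float]:
    """Sonnet 5 (input, output) price per million tokens on a given day."""
    if on <= INTRO_END:
        return (2.00, 10.00)
    return (3.00, 15.00)


def monthly_savings_vs_opus(on: date, input_mtok: float, output_mtok: float) -> float:
    """USD saved per month by routing this volume to Sonnet 5 instead of Opus 4.8."""
    s_in, s_out = sonnet5_prices(on)
    o_in, o_out = OPUS_PRICES
    return input_mtok * (o_in - s_in) + output_mtok * (o_out - s_out)


before = monthly_savings_vs_opus(date(2026, 8, 31), 50, 10)
after = monthly_savings_vs_opus(date(2026, 9, 1), 50, 10)
print(before, after)
```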

Do not let a temporary discount become an unmeasured permanent assumption.

The bigger picture

Sonnet 5 is part of a pattern I think more teams should notice.

The most important model in production is often not the strongest model. It is the model with the best mix of availability, cost, latency, and enough intelligence for the common path.

That is why Sonnet 5 matters.

Fable 5 is more dramatic. GPT-5.6 is more mysterious. Gemini 3.5 Pro will probably get the launch-week attention when it lands.

But Sonnet 5 is the boring model that can lower a lot of real bills.

And boring models that lower bills tend to win production traffic.

Disclosure

If you want to swap between Claude, OpenAI, Gemini, DeepSeek, Qwen, GLM, and other models through one OpenAI-compatible endpoint, that is roughly what TokenMix does. I work on the research side there. The full cited breakdown is in the original article.

Bottom line

Claude Sonnet 5 should be your default Claude agent route, not your prestige model and not your only model.

Use it for first-pass coding, refactors, first-pass PR review, repo Q&A, and routine tool use. Keep Opus 4.8 for escalation and final review. Keep Fable 5 for the narrow slice that justifies frontier-tier cost.

The model release is good. The routing discipline is what saves the money.

Would you route routine coding agents to Sonnet 5 by default, or keep paying for Opus until independent evals catch up?
