Claude Sonnet 5 is cheaper per word but can cost more per finished job

#anthropic #claude #modelrelease #agents

Anthropic released Claude Sonnet 5 this week, calling it its most agentic Sonnet yet — a model built less for one-shot chat and more for multi-step tool-using work. Sonnet 5 approaches the far pricier Opus 4.8 on hands-on agent tasks, and it is now the default model for Free and Pro users.

Key facts

What: Anthropic's new mid-tier model is close to its flagship on hard agent work, yet independent testing shows it can spend more per completed task because it takes more steps.
When: 2026-06-30
Primary source: read the source

The headline pricing looks like a bargain. Sonnet 5 costs a fraction of Opus per word of input and output, with an introductory rate that is cheaper still. It keeps the very large memory of its predecessor — a million words of context — and it adds a new extra-high effort setting for when you want it to think harder. On paper, it is a workhorse that got smarter without getting more expensive.

Independent testers found a more complicated picture. The analysis firm Artificial Analysis ran Sonnet 5 through real agent tasks and published a detailed cost breakdown with a surprising conclusion: once the introductory discount ends, Sonnet 5 can cost slightly more to finish a task than Opus 4.8, the supposedly premium model, even though each individual word is far cheaper. A task's bill equals the price per word multiplied by the number of words the model uses, and Sonnet 5 uses a lot more of them. On their tasks it produced roughly forty percent more output and took about three times as many back-and-forth steps to get the job done. Cheaper ingredients, bigger recipe, similar total. This is the crucial difference between cost per word and cost per finished job, and it is easy to get burned if you only read the sticker price.
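The arithmetic behind that conclusion can be sketched in a few lines of Python. The prices and usage figures below are hypothetical placeholders chosen only to show the effect; they are not Anthropic's published rates or Artificial Analysis's measurements, and the function names are illustrative.

```python
def task_cost(price_per_million: float, units_used: int) -> float:
    """Bill for one task: unit price times the billable units consumed."""
    return price_per_million * units_used / 1_000_000


def breakeven_usage_ratio(cheap_price: float, premium_price: float) -> float:
    """How many times more units the cheaper model can use before it costs more."""
    return premium_price / cheap_price


# Hypothetical numbers, for illustration only (not real prices or measurements).
PREMIUM_PRICE = 5.0     # currency per million billable units (assumed)
CHEAP_PRICE = 2.0       # currency per million billable units (assumed)
PREMIUM_UNITS = 100_000  # units the premium model needs for the task (assumed)
CHEAP_UNITS = 260_000    # cheaper model takes more steps, so more units (assumed)

premium_bill = task_cost(PREMIUM_PRICE, PREMIUM_UNITS)
cheap_bill = task_cost(CHEAP_PRICE, CHEAP_UNITS)
ratio = breakeven_usage_ratio(CHEAP_PRICE, PREMIUM_PRICE)

print(f"premium bill per task: {premium_bill:.2f}")
print(f"cheap bill per task:   {cheap_bill:.2f}")
print(f"cheap model loses once it uses more than {ratio:.1f}x the units")
```

With these made-up inputs the cheaper model is less than half the price per unit, yet its bill per task is higher because it consumes more than the break-even multiple of units.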

A second wrinkle compounds this. Sonnet 5 ships with an updated tokenizer — the component that chops your text into the units the model counts and bills by. The new one can turn the same sentence into up to a third more billable units than before, which nudges the effective cost up again. Anthropic set the introductory price partly to cushion that transition, so the true running cost only becomes clear once the promo ends. For an explanation of why the same paragraph can cost different amounts on different models, see our explainer on tokenization.
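A rough way to reason about the tokenizer effect is to scale the listed price by the extra billable units the same text now produces. The sketch below assumes a hypothetical listed price; only the "up to a third more" figure comes from the reporting above, and the function name is illustrative.

```python
def effective_price(nominal_price: float, extra_units_fraction: float) -> float:
    """Price for the same text after the tokenizer emits more billable units.

    extra_units_fraction is the relative increase in units, e.g. 1/3 for
    "a third more billable units".
    """
    if extra_units_fraction < 0:
        raise ValueError("extra_units_fraction must be non-negative")
    return nominal_price * (1.0 + extra_units_fraction)


NOMINAL_PRICE = 3.0  # hypothetical listed price per million units (assumed)

# Worst case described above: the same text becomes a third more units.
worst_case = effective_price(NOMINAL_PRICE, 1 / 3)
print(f"listed price:    {NOMINAL_PRICE:.2f}")
print(f"effective price: {worst_case:.2f} for the same text, worst case")
```

The listed price does not change, but the amount billed for an identical paragraph can rise by the same fraction as the unit count.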

The safety story drew its own debate. Anthropic went out of its way to say Sonnet 5 is worse at cybersecurity tasks than its Opus models: it is harder to coax into helping with an attack, and cyber safeguards are on by default. The company presents this as a deliberate win, on the grounds that your everyday default model should be less able to cause harm. On Hacker News that framing got a mixed reception. Some applauded a lab bragging about a model being safer rather than just stronger. Others were skeptical, half-joking that it is a strange thing to advertise, and a few worried that a model deliberately dialed down on security reasoning might also write less secure code for legitimate developers. The line between "will not help you attack" and "cannot reason about security" is thin.

Sonnet 5 is a solid, incremental update to the model most people will actually use day to day, and being free on the default tier is a real gift to casual users. But it is also a small case study in how model economics have gotten slippery. Cheaper per word does not mean cheaper per result; a longer-running, more agentic model can quietly out-spend a pricier one that finishes in fewer moves. For anyone paying the bill on automated agents, the lesson is to measure cost per completed task, not cost per word, and to re-measure after the introductory pricing lapses. For more on why an agent's step count matters so much, see our lesson on what makes an AI an agent.
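One minimal way to track that metric from your own agent logs is sketched below. The record shape and names are hypothetical, and charging failed runs to the completed ones is an assumption of this sketch (on the reasoning that failed attempts still appear on the bill), not a method taken from the source.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One agent attempt at a task (hypothetical log record)."""
    cost: float      # total billed for this run, all steps included
    completed: bool  # whether the run actually finished the job


def cost_per_completed_task(runs: list[TaskRun]) -> float:
    """Total spend across all runs divided by the number that completed."""
    completed = sum(1 for run in runs if run.completed)
    if completed == 0:
        raise ValueError("no completed tasks to average over")
    total_spend = sum(run.cost for run in runs)
    return total_spend / completed


# Made-up example log: two successes and one failed attempt.
runs = [
    TaskRun(cost=0.50, completed=True),
    TaskRun(cost=0.25, completed=False),
    TaskRun(cost=0.75, completed=True),
]
result = cost_per_completed_task(runs)
print(f"cost per completed task: {result:.2f}")
```

Running the same calculation on logs from before and after a pricing change gives a like-for-like comparison that a per-word rate card cannot.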

Originally published on Ground Truth, where every claim is checked against the primary source.