Opus 4.7 Made Me Take Token Waste Management Seriously
Anthropic shipped Claude Opus 4.7 on April 16, 2026. Same per-token price as 4.6. New tokenizer. The official docs say it quietly: "This new tokenizer may use up to 35% more tokens for the same fixed text."
That line is what finally made me stop hand-waving about AI costs and actually audit where my tokens were going. I run Claude agents all day across my stack, and a silent hike of up to 35% on the exact same prompts meant the ceiling for "good enough" just got a lot lower.
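The billing impact is just arithmetic: same per-token price, more tokens per prompt, so cost scales directly with the inflation. A quick back-of-envelope sketch (the price and monthly volume here are made-up illustration numbers, not figures from the post; only the 35% worst case comes from the docs):

```python
# Cost at the same per-token price when the tokenizer emits up to 35%
# more tokens for identical text.
PRICE_PER_MTOK = 15.00   # hypothetical $ per million tokens (illustrative)
OLD_TOKENS = 1_000_000   # hypothetical monthly volume under the old tokenizer
INFLATION = 0.35         # worst-case inflation stated in the docs

old_cost = OLD_TOKENS / 1e6 * PRICE_PER_MTOK
new_cost = OLD_TOKENS * (1 + INFLATION) / 1e6 * PRICE_PER_MTOK
print(f"old ${old_cost:.2f} -> new ${new_cost:.2f} for the same prompts")
```

Nothing in the workflow changes, yet the bill climbs by the full inflation factor, which is why auditing where tokens go suddenly pays off.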
Here's what's in the full post:
The two ways tokens leak — I separate waste (turns that produced nothing useful) from inefficient usage (turns that worked but cost way more than they should have). Most guides collapse these into one bucket. They shouldn't.
How I measured it across 133,087 turns — I built a token waste sorter over 9,667 sessions and ran model-vs-model comparisons for the judgment calls. I explain the methodology, what I kept, what I threw out, and which clusters dominated the bleeding.
The top waste cluster was not what I expected — It wasn't hallucinations, bad reasoning, or runaway loops. It was infrastructure. Browser/Playwright failures spread thin across hundreds of cheap sessions outweighed every "smart" optimization I'd been chasing.
Three fixes that cost nothing — Shrinking CLAUDE.md, setting tight max_tokens, and auditing WebFetch failures. I share the before/after for each and which one moved the needle most.
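The waste vs. inefficient-usage split from the first bullet can be sketched as a tiny classifier. The field names and the per-turn budget threshold below are illustrative assumptions, not the post's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Turn:
    tokens: int                   # total tokens the turn consumed
    produced_useful_output: bool  # did the turn accomplish anything?

def bucket(turn: Turn, budget: int = 500) -> str:
    """Separate the two leak types: 'waste' produced nothing useful at all;
    'inefficient' worked, but cost far more than a reasonable budget."""
    if not turn.produced_useful_output:
        return "waste"
    if turn.tokens > budget:
        return "inefficient"
    return "ok"

print(bucket(Turn(tokens=80, produced_useful_output=False)))   # waste
print(bucket(Turn(tokens=2400, produced_useful_output=True)))  # inefficient
```

Keeping the two buckets separate matters because the fixes differ: waste calls for eliminating failing turns, while inefficiency calls for trimming ones that already work.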
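The "tight max_tokens" fix amounts to sizing the output cap to the task instead of passing one blanket maximum everywhere. A minimal sketch of that idea; the task names and budget numbers are assumptions for illustration, not values from the post:

```python
# Per-task-type output caps: stop paying for headroom you never use.
TASK_BUDGETS = {
    "classify": 16,     # one-word labels need almost nothing
    "summarize": 300,
    "code_edit": 1500,
}
DEFAULT_BUDGET = 4096   # conservative fallback for unknown task types

def tight_max_tokens(task_type: str) -> int:
    """Return a max_tokens cap sized to the task instead of a blanket max."""
    return TASK_BUDGETS.get(task_type, DEFAULT_BUDGET)

print(tight_max_tokens("classify"))
```

The returned value would then be passed as the max_tokens parameter on each request, so a misbehaving turn can only run away as far as its task type allows.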
The part that surprised me most is in the full post: the meta lesson about where budget actually goes when you measure instead of guess.
Full post: https://thoughts.jock.pl/p/token-waste-management-opus-47-2026
Free weekly: https://thoughts.jock.pl