SingYee
I analyzed 187 Claude Code sessions. $6,744 worth of tokens. Here's where they actually went.

I've been using Claude Code heavily for the past month. Building trading bots, automation tools, side projects.

I knew I was burning through tokens but never looked at the numbers.

So I built a small CLI to parse my local session data. The result: 187 sessions. 3.3 billion tokens. $6,744 equivalent API cost.

I'm on the Max plan, so this is the equivalent API cost, not what I actually paid. But the token patterns are what matter here.
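For anyone who wants to replicate the count, here's a minimal sketch of the parsing. It assumes sessions live as JSONL under ~/.claude/projects/ with per-message usage fields; the field names are my reading of my own local files, not a documented schema, so verify against your data:

```python
import json
from pathlib import Path

def sum_tokens(claude_dir: Path) -> dict:
    """Sum token usage across all JSONL session files under claude_dir.

    Field names (message.usage.*) are assumptions based on local files.
    """
    totals = {"input": 0, "output": 0, "cache_read": 0, "cache_creation": 0}
    for session in claude_dir.glob("projects/**/*.jsonl"):
        for line in session.read_text().splitlines():
            try:
                usage = json.loads(line).get("message", {}).get("usage", {})
            except (json.JSONDecodeError, AttributeError):
                continue  # skip malformed or non-object lines
            totals["input"] += usage.get("input_tokens", 0)
            totals["output"] += usage.get("output_tokens", 0)
            totals["cache_read"] += usage.get("cache_read_input_tokens", 0)
            totals["cache_creation"] += usage.get("cache_creation_input_tokens", 0)
    return totals
```

Everything runs offline; nothing leaves your machine.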

97% of my tokens were something I couldn't control

That was the first surprise. 97% were cache reads. Every turn, Claude re-reads the entire conversation context. Think of it like re-reading an entire book every time you turn a page.

The good news: cache reads are cheap ($1.5/M tokens) and completely normal. The bad news: it means the part you can actually control is tiny.

Only 2.8% of my tokens were controllable. Of that, 92.5% was cache creation (CLAUDE.md, MCP tools, system prompt loading), 6.6% was Claude's actual output, 0.9% was my input.
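The arithmetic behind that split, with round stand-in numbers (not my exact counts):

```python
def breakdown(cache_read, cache_creation, output, user_input):
    """Split total tokens into cache reads vs. the controllable slice,
    then break the controllable slice down by source."""
    total = cache_read + cache_creation + output + user_input
    controllable = cache_creation + output + user_input
    return {
        "cache_read_pct": round(100 * cache_read / total, 1),
        "controllable_pct": round(100 * controllable / total, 1),
        # shares *within* the controllable slice
        "creation_share": round(100 * cache_creation / controllable, 1),
        "output_share": round(100 * output / controllable, 1),
        "input_share": round(100 * user_input / controllable, 1),
    }

# Example with illustrative round numbers in the same ballpark as mine:
example = breakdown(3_200_000_000, 85_000_000, 6_000_000, 800_000)
```

With those stand-ins, cache reads come out around 97%, and cache creation dominates the small controllable remainder, which matches the shape of my real data.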

What I wouldn't have caught from /cost

This was the most useful part:

  • 86 sessions over 30 turns without /compact, each one letting context balloon to 2-3x what it needed to be
  • 840 subagent calls, every single one duplicating the full conversation context just to do a search
  • 35 anomaly sessions burning tokens at 2-3x my normal rate
  • Bash was 40% of all tool calls, pumping long command outputs back into context every time
  • Peak hours (Mon-Fri 5-11am PT) used 1.3x more tokens on average than off-peak
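A stripped-down version of the first and third checks above: flag long sessions that never compacted, and sessions burning tokens well above the typical rate. The session fields here are illustrative, not ccwhy's actual schema:

```python
from statistics import median

def flag_sessions(sessions, turn_limit=30, burn_factor=2.0):
    """Flag long un-compacted sessions and anomalous burn rates.

    `sessions` is a list of dicts with id, tokens, turns, and an
    optional `compacted` flag; a session is anomalous when its
    tokens-per-turn exceeds burn_factor times the median rate.
    """
    rates = [s["tokens"] / s["turns"] for s in sessions if s["turns"]]
    typical = median(rates) if rates else 0
    flags = []
    for s in sessions:
        if s["turns"] > turn_limit and not s.get("compacted"):
            flags.append((s["id"], "long session, no /compact"))
        if s["turns"] and s["tokens"] / s["turns"] > burn_factor * typical:
            flags.append((s["id"], "anomalous burn rate"))
    return flags
```

ccwhy's real heuristics are more involved; this just shows the shape of the analysis.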

What I actually changed

After seeing the data, three things:

  1. I use /compact after ~20 turns now instead of letting sessions run endlessly
  2. I stopped defaulting to Agent for codebase searches and use Grep/Glob directly
  3. I try to keep heavy sessions out of peak hours when possible

Small changes, but the anomaly sessions have mostly stopped showing up.

The tool

I open-sourced it as ccwhy: written in Rust, runs completely offline on your local ~/.claude/ data. No API keys needed.

```shell
brew install SingggggYee/tap/ccwhy
```

Or: `cargo install ccwhy`

Or: grab the binary

It's not a replacement for ccusage. ccusage tells you how much you spent. ccwhy tells you why, and what to change.

GitHub

Curious what other people's breakdowns look like. Is 97% cache reads normal, or is my setup unusually heavy?

Top comments (1)

Delimit.ai
Fascinating breakdown on those 187 Claude Code sessions—it's eye-opening how much token spend goes into iterative building like trading bots and automation tools.

To optimize costs in similar workflows, focus on pre-planning your prompts with clear specs upfront, which can cut down on redundant iterations based on your analysis.

I've seen this approach halve token usage in my own projects without sacrificing output quality.