DEV Community

OpenClaw Labs

I Analyzed My Claude Code Sessions and Found $143 in Hidden Agent Costs — Here's the Breakdown

If you use Claude Code with API billing, you've probably had that moment: you check your Anthropic dashboard and think, "Wait, how did I spend that much on a single session?"

I had that moment last week. A session that felt like 20 minutes of light refactoring had burned through $47 in API calls. So I did what any developer would do — I dug into the logs.

What I found changed how I think about AI agent costs entirely.

The hidden architecture you're paying for

Claude Code doesn't just run one model conversation. Under the hood, it spawns sub-agents using the Task tool. Each sub-agent gets its own context window, its own token budget, and makes its own API calls.

Here's the thing: you can't see this happening in real time. The /cost command (when it works) shows session-level totals. If you're on a Max or Pro plan, you get even less — the cost display is disabled entirely.

Your session logs tell a different story. Every API call is recorded in JSONL files under ~/.claude/projects/*/sessions/*/conversation.jsonl. The data is all there. Nobody was surfacing it.

What my session logs actually showed

I wrote a parser that reads these JSONL files and attributes costs per model, per sub-agent, and per operation type. Here's what a real 9.9MB session looked like:

| Metric | Value |
| --- | --- |
| Total conversation turns | 457 |
| Turns using Opus | 412 (90%) |
| Turns using Sonnet | 45 (10%) |
| Sub-agents spawned | 84 |
| Duplicate file reads | 5 |
| Estimated total cost | ~$180 |
| Cost if optimally routed | ~$37 |
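For reference, here's a minimal sketch of the kind of parser I mean. The `message.model` and `message.usage` field names, and the per-million-token rates, are assumptions — check them against your own logs and Anthropic's current pricing:

```python
import json
from collections import defaultdict

# Assumed per-million-token (input, output) rates; check Anthropic's current pricing.
PRICING = {"opus": (15.00, 75.00), "sonnet": (3.00, 15.00)}

def attribute_costs(jsonl_lines):
    """Sum estimated API cost per model family across a session's turns."""
    totals = defaultdict(float)
    for line in jsonl_lines:
        msg = json.loads(line).get("message", {})
        model, usage = msg.get("model", ""), msg.get("usage", {})
        for family, (in_rate, out_rate) in PRICING.items():
            if family in model:
                totals[family] += (
                    usage.get("input_tokens", 0) * in_rate
                    + usage.get("output_tokens", 0) * out_rate
                ) / 1_000_000
    return dict(totals)

# Toy example with two synthetic turns (field shapes are assumptions):
sample = [
    '{"message": {"model": "claude-opus-4", "usage": {"input_tokens": 1000000, "output_tokens": 0}}}',
    '{"message": {"model": "claude-sonnet-4", "usage": {"input_tokens": 1000000, "output_tokens": 0}}}',
]
print(attribute_costs(sample))  # one million input tokens on each model
```

Point it at a real `conversation.jsonl` (one JSON object per line) and the per-family totals fall out directly.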

The biggest finding: 366 of those 412 Opus turns were doing simple tasks — reading files, running grep, making small edits. Sonnet handles these just as well at a fraction of the cost.

That's an 80% model-downgrade opportunity that's invisible without log analysis.
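A back-of-envelope check on those numbers, assuming Sonnet runs at roughly one-fifth the cost of Opus (which matches the ratio of Anthropic's published rates):

```python
# Rough savings estimate from the session table above.
total_cost, total_turns = 180, 457        # estimated session totals
downgradable = 366                        # Opus turns doing simple work
avg_turn_cost = total_cost / total_turns  # session average; Opus turns dominate spend
sonnet_ratio = 1 / 5                      # assumed Sonnet-to-Opus cost ratio

saved = downgradable * avg_turn_cost * (1 - sonnet_ratio)
print(f"~${saved:.0f} potential savings")
```

That crude average lands in the same ballpark as the table's gap between ~$180 actual and ~$37 optimally routed.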

The sub-agent problem

Sub-agents are the real cost multiplier. Each one starts a fresh context, which means:

  • The system prompt gets re-sent (tokens you pay for again)
  • File contents get re-read (even if the parent agent just read them)
  • There's no shared memory between sub-agents running in parallel

In my session, 84 sub-agents spawned. Many were doing overlapping work — reading the same files, running similar searches. The logs showed clear redundancy, but Claude Code has no built-in way to detect or prevent this.
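You can detect the redundancy yourself from the logs, though. A sketch that counts repeated `Read` tool calls — the `tool_use` / `input.file_path` field shapes here are assumptions based on how Anthropic's API represents tool calls, so verify them against your own JSONL:

```python
import json
from collections import Counter

def duplicate_reads(jsonl_lines):
    """Return file paths read more than once across a session's tool calls."""
    reads = Counter()
    for line in jsonl_lines:
        content = json.loads(line).get("message", {}).get("content", [])
        if not isinstance(content, list):
            continue
        for block in content:
            if block.get("type") == "tool_use" and block.get("name") == "Read":
                reads[block.get("input", {}).get("file_path", "?")] += 1
    return {path: n for path, n in reads.items() if n > 1}

# Toy example: two turns reading the same file, one reading another.
sample = [
    '{"message": {"content": [{"type": "tool_use", "name": "Read", "input": {"file_path": "src/app.py"}}]}}',
    '{"message": {"content": [{"type": "tool_use", "name": "Read", "input": {"file_path": "src/app.py"}}]}}',
    '{"message": {"content": [{"type": "tool_use", "name": "Read", "input": {"file_path": "README.md"}}]}}',
]
print(duplicate_reads(sample))  # only src/app.py shows up, read twice
```

The same pattern works for Grep or Bash calls if you want to spot sub-agents re-running identical searches.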

How to audit your own sessions

Your Claude Code session logs live here:

```shell
# List all session logs, largest first, with sizes (-S sorts by size)
ls -lhS ~/.claude/projects/*/sessions/*/conversation.jsonl
```

Pick a large one (5MB+ sessions are where costs add up) and get a quick turn count (each JSONL line is one conversation turn):

```shell
# Rank sessions by conversation turn count
wc -l ~/.claude/projects/*/sessions/*/conversation.jsonl | sort -rn | head -5
```

For a proper cost breakdown, I turned this analysis into a service. You can see exactly what the output looks like without paying anything:

```shell
# Free sample report — no account, no payment, just curl
curl -s https://api.agentsconsultants.com/forensics/sample | jq .
```

The sample report shows every section: model usage breakdown, sub-agent analysis, downgrade opportunities, redundant operations, and ranked recommendations.

What I changed based on the data

After seeing the breakdown, I made three changes to my workflow:

1. Added model routing hints to CLAUDE.md

```
Use Sonnet for: file reads, grep/search, simple edits, test runs
Use Opus for: architecture decisions, complex debugging, multi-file refactors
```

This alone cut my next session's cost by roughly 60%.

2. Reduced sub-agent spawning

```
Do not spawn a sub-agent for tasks requiring fewer than 3 tool calls.
Prefer sequential execution over parallel when tasks share file context.
```

3. Ran /compact more aggressively

The duplicate file reads were a symptom of context bloat. Running /compact after every major code block kept the context lean and reduced redundant operations.

The bigger picture

There are open issues on the Claude Code repo asking for per-sub-agent cost tracking (#22625), cost visibility improvements (#10388), and programmatic token usage APIs. These are real needs that lots of developers share.

Until those features land natively, parsing your own logs is the best way to understand where your money is going. The data is already on your machine — it just needs to be surfaced.

Full disclosure: I built the analysis service linked above. It's part of an experiment in agent-to-agent micropayments using the x402 protocol. The free sample endpoint is genuinely free — no signup, no payment, just a curl command. I'd love feedback on what metrics are most useful to you.


Have you audited your Claude Code spending? I'm curious what patterns others are seeing. Drop a comment below.
