Like many developers, I fell down the Claude Code rabbit hole over the past few months. Then, a few weeks in, I opened my Anthropic bill and saw $120+ from just a handful of sessions.
The problem? I had zero visibility into what each session actually cost. I'd spin up Claude Code, work for an hour, then realize I'd burned through 50,000 tokens and had no idea which task or feature was responsible.
I wanted to fix that. So I built a CLI tool that automatically tracks Claude Code session costs and budgets them by task, inspired by YNAB's "give every dollar a job" principle. I'm calling it Tokenyst.
Here's what I learned building it.
The Problem: Claude Code is a Black Box
Claude Code is incredible, but it has a cost blindness problem. For developers who care about managing costs (especially indie hackers and bootstrappers), this is frustrating. You want to stay aware of spend the same way you would with any other service.
Traditional solutions don't help:
- Anthropic's API dashboard shows aggregate usage, not per-session
- Claude Code doesn't display token counts
- Third-party tools either don't exist or require cloud accounts
I needed something local, automatic, and honest about what each session cost.
The Solution: Budget Per Task
I borrowed a concept from YNAB: instead of looking at past spending, assign money to tasks before you do the work.
With Tokenyst, you:
- Create tasks with budgets: "Landing page redesign: $20", "Fix failing tests: $5"
- Run Claude Code
- Get automatic tracking: After each session, Tokenyst parses your transcript, calculates cost, and deducts it from your budget
- See the impact: Finish under budget? Great. Over? You know exactly how far over you went
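The loop above can be sketched in a few lines of TypeScript. This is a minimal illustration rather than Tokenyst's actual code: the `Task` shape and the `recordSpend` and `remaining` helpers are hypothetical names chosen for the example.

```typescript
// Hypothetical task shape: a name, a budget assigned up front, and spend so far.
interface Task {
  name: string;
  budgetUsd: number;
  spentUsd: number;
}

// Return a new task with one session's cost added to the running spend.
function recordSpend(task: Task, sessionCostUsd: number): Task {
  return { ...task, spentUsd: task.spentUsd + sessionCostUsd };
}

// Positive means money is left; negative means the task is over budget.
function remaining(task: Task): number {
  return task.budgetUsd - task.spentUsd;
}

const fixTests: Task = { name: "Fix failing tests", budgetUsd: 5, spentUsd: 0 };
const afterTwoSessions = recordSpend(recordSpend(fixTests, 2.5), 3.25);

console.log(remaining(afterTwoSessions)); // -0.75, i.e. $0.75 over budget
```

Returning a new object instead of mutating the task keeps each recorded session easy to audit, though a real tool might just as reasonably update the task in place.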
This shifted my mindset. I wasn't asking "how much did this cost?" after the fact. I was asking "how much can I spend?" before starting.
The Technical Side: How It Works
Here's where it got interesting. To track costs automatically, Tokenyst needs to:
- Capture the session: Let Claude Code run normally
- Find the transcript: Claude Code writes JSONL transcripts to ~/.claude/projects/
- Parse token usage: Extract usage data from the transcript
- Calculate cost: Apply current Claude model pricing (including prompt cache multipliers)
- Update budget: Record the spend against the active task
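As a rough sketch of the "find the transcript" step, one option is to pick the most recently modified .jsonl file from a directory listing. This rests on an assumption: that the newest transcript belongs to the session that just ended. The `FileInfo` shape, the `newestTranscript` helper, and the example paths are all hypothetical, and a real tool would build the listing from the filesystem.

```typescript
// Hypothetical directory-listing entry; a real tool would fill this from the filesystem.
interface FileInfo {
  path: string;
  modifiedMs: number; // last-modified time in milliseconds
}

// Pick the most recently modified .jsonl file, or undefined if there is none.
function newestTranscript(files: FileInfo[]): FileInfo | undefined {
  let newest: FileInfo | undefined;
  for (const file of files) {
    if (!file.path.endsWith(".jsonl")) continue; // ignore non-transcript files
    if (newest === undefined || file.modifiedMs > newest.modifiedMs) {
      newest = file;
    }
  }
  return newest;
}

const candidates: FileInfo[] = [
  { path: "my-app/session-a.jsonl", modifiedMs: 1000 },
  { path: "my-app/notes.txt", modifiedMs: 9000 },
  { path: "my-app/session-b.jsonl", modifiedMs: 5000 },
];

console.log(newestTranscript(candidates)?.path); // my-app/session-b.jsonl
```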
Parsing Claude Code Transcripts
Claude Code stores transcripts as JSONL (JSON Lines). Each line is a complete event:
{"type":"user","content":"Build a React component...","id":"msg-1"}
{"type":"assistant","content":"Here's the component...","id":"msg-2","usage":{"input_tokens":450,"output_tokens":120}}
The challenge: streaming duplicates. While Claude writes the response, it emits multiple chunks for the same message with the same ID. You need to dedupe by message ID while preserving the final token counts.
// Simplified dedup logic; `transcript` is an array of JSONL lines
const messageMap = new Map();
for (const line of transcript) {
  const parsed = JSON.parse(line);
  if (parsed.id) {
    messageMap.set(parsed.id, parsed); // Last write wins
  }
}
// Sum input and output tokens across the deduplicated messages
const tokens = Array.from(messageMap.values())
  .filter(m => m.usage)
  .reduce((sum, m) => sum + m.usage.input_tokens + m.usage.output_tokens, 0);
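To see the last-write-wins behavior on concrete input, here is a self-contained variant with sample lines. It is a sketch under assumptions: the `TranscriptEvent` type mirrors only the simplified sample events above, and skipping blank or half-written lines is a defensive choice made for this example, not something the transcript format is known to require.

```typescript
// Simplified event shape, matching the sample lines shown earlier.
interface TranscriptEvent {
  id?: string;
  usage?: { input_tokens: number; output_tokens: number };
}

// Sum input and output tokens, keeping only the last event seen per message ID.
function totalTokens(lines: string[]): number {
  const latestById = new Map<string, TranscriptEvent>();
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    let event: TranscriptEvent | null;
    try {
      event = JSON.parse(line) as TranscriptEvent | null;
    } catch {
      continue; // skip a partially written or malformed line
    }
    if (event && event.id) latestById.set(event.id, event); // last write wins
  }
  let total = 0;
  latestById.forEach((event) => {
    if (event.usage) total += event.usage.input_tokens + event.usage.output_tokens;
  });
  return total;
}

const sample = [
  '{"type":"user","content":"Build a React component...","id":"msg-1"}',
  '{"type":"assistant","id":"msg-2","usage":{"input_tokens":450,"output_tokens":40}}',
  '{"type":"assistant","id":"msg-2","usage":{"input_tokens":450,"output_tokens":120}}',
  '{"type":"assistant","id":"msg-3","usage":{"input',
];

console.log(totalTokens(sample)); // 570
```

The two msg-2 chunks collapse into one entry, so only the final counts (450 + 120) are summed, and the truncated msg-3 line is ignored.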
Handling Prompt Cache Costs
Claude supports prompt caching, which changes pricing:
- Cache creation tokens cost 1.25× the standard rate
- Cache read tokens cost 0.1× the standard rate
This is crucial for accuracy. A session that heavily uses cache reads looks expensive in raw token count but is actually cheap.
// PRICING_TABLE maps a model name to per-token rates (input and output)
const calculateCost = (model: string, usage: {
  inputTokens: number;
  cacheCreationTokens?: number;
  cacheReadTokens?: number;
  outputTokens: number;
}) => {
  const pricing = PRICING_TABLE[model];
  const inputCost = usage.inputTokens * pricing.input;
  // Cache writes cost 1.25× the input rate; cache reads cost 0.1×
  const cacheCreationCost = (usage.cacheCreationTokens || 0) * pricing.input * 1.25;
  const cacheReadCost = (usage.cacheReadTokens || 0) * pricing.input * 0.1;
  const outputCost = usage.outputTokens * pricing.output;
  return inputCost + cacheCreationCost + cacheReadCost + outputCost;
};
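A worked example makes the cache effect concrete. The sketch below restates the same formula so it can run on its own. The `EXAMPLE_RATES` table, the "example-model" name, and its rates ($3 per million input tokens, $15 per million output tokens) are placeholders invented for illustration, not real pricing. `estimateCost` is likewise a hypothetical name.

```typescript
// Placeholder per-token rates in USD for a made-up model; not real pricing.
interface Rates {
  input: number;
  output: number;
}
const EXAMPLE_RATES: Record<string, Rates> = {
  "example-model": { input: 3 / 1000000, output: 15 / 1000000 },
};

interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationTokens?: number;
  cacheReadTokens?: number;
}

// Same formula as above, plus an explicit error for models missing from the table.
function estimateCost(model: string, usage: Usage): number {
  const rates = EXAMPLE_RATES[model];
  if (rates === undefined) throw new Error(`No pricing for model: ${model}`);
  const uncachedInput = usage.inputTokens * rates.input;
  const cacheWrites = (usage.cacheCreationTokens ?? 0) * rates.input * 1.25;
  const cacheReads = (usage.cacheReadTokens ?? 0) * rates.input * 0.1;
  const output = usage.outputTokens * rates.output;
  return uncachedInput + cacheWrites + cacheReads + output;
}

// The same 102,000 input-side tokens, billed with and without cache reads.
const cached = estimateCost("example-model", {
  inputTokens: 2000,
  cacheReadTokens: 100000,
  outputTokens: 500,
});
const uncached = estimateCost("example-model", { inputTokens: 102000, outputTokens: 500 });

console.log(cached.toFixed(4)); // 0.0435
console.log(uncached.toFixed(4)); // 0.3135
```

Under these placeholder rates, the cache-heavy session costs roughly a seventh of the uncached one despite an identical raw token count.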
Hook Integration
I wanted tracking to be automatic. Manually running a separate command after each session would add friction.
Claude Code supports Stop hooks, which are scripts that run when you exit. Tokenyst installs a hook that:
- Fires automatically when you exit Claude Code
- Reads the transcript path from stdin
- Parses the transcript and records the cost
- Returns a system message (optional feedback to Claude)
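The stdin and reply steps above might look roughly like this. It is a sketch under assumptions: the `transcript_path` and `session_id` input fields and the `systemMessage` output field reflect my understanding of Claude Code's hook payloads and should be checked against its hooks documentation. The helper names are hypothetical and not taken from Tokenyst.

```typescript
// Assumed shape of the JSON payload a hook receives on stdin.
interface HookInput {
  transcript_path?: string;
}

// Extract the transcript path from the raw stdin text, or null if it is missing.
function transcriptPathFromStdin(raw: string): string | null {
  try {
    const payload = JSON.parse(raw) as HookInput | null;
    const path = payload?.transcript_path;
    return typeof path === "string" && path.length > 0 ? path : null;
  } catch {
    return null; // stdin was not valid JSON
  }
}

// Build the optional JSON reply; the systemMessage field name is an assumption.
function hookReply(summary: string): string {
  return JSON.stringify({ systemMessage: summary });
}

const stdinText = '{"session_id":"abc123","transcript_path":"/tmp/example/session.jsonl"}';

console.log(transcriptPathFromStdin(stdinText)); // /tmp/example/session.jsonl
console.log(transcriptPathFromStdin("not json")); // null
console.log(hookReply("Session recorded: $0.042")); // {"systemMessage":"Session recorded: $0.042"}
```

Returning null instead of throwing lets the hook exit quietly on unexpected input, which seems preferable to interrupting a coding session over a tracking failure.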
The hook is registered in ~/.claude/settings.json:
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "/path/to/tokenyst record-turn" }
        ]
      }
    ]
  }
}
When you exit Claude, the hook runs, captures the cost, and you see a summary:
✓ Session recorded
Tokens: 2,450 input | 340 output
Cost: $0.042
Task: Landing page redesign
Budget remaining: $19.96
No extra steps. No switching windows. Just automatic tracking.
What I Learned Building This
1. JSONL is surprisingly reliable
I expected Claude Code's transcript format to be fragile, but it's been stable. Each line is self-contained JSON, so parsing is straightforward. The dedup logic handles streaming correctly.
2. Local-first data matters
I could've built a web dashboard. But I chose local-first (~/.tokenyst/config.json) because:
- No account, no authentication, no privacy concerns
- Works offline
- Portable (copy your config to another machine)
- Zero infrastructure to maintain
Turns out, a lot of developers care about this.
3. Task budgeting changes behavior
The satisfaction of finishing under budget is real. Whenever I checked my budget after completing a task and had come in under, it felt like a win. This is why YNAB works so well, and the same principle applies here.
4. Niche tools can still be valuable
Tokenyst isn't for everyone. It only matters if you:
- Use Claude Code regularly
- Care about costs
- Prefer CLI tools over web dashboards
But for that specific audience, it solves a real problem.
Current State & Limitations
What works:
- Automatic transcript capture and parsing
- Cost calculation with cache multiplier support
- Task budgeting and spend tracking
- Local config file storage
- Hook integration for zero-friction tracking
What's limited:
- Only supports Claude Code
- Pricing table must be manually updated when models change
- No sync across devices (local-only for now)
- No team features
I built this for myself, and it's been genuinely useful during development. But it's niche, and that's okay.
Try It
If you use Claude Code and want visibility into session costs:
GitHub: jher7/tokenyst
npm install -g tokenyst
tkst task # Create and manage budgets
tkst claude # Run Claude Code with tracking
tkst list # See spend and remaining balance
No account, no cloud, no surprises. Just local cost tracking.
Have you hit the "expensive Claude Code session" problem? I'd love to hear if this approach resonates, or if you'd solve it differently. Issues and feedback welcome on GitHub.
Tags: #claude #devtools #cli #productivity #buildingpublic