Josh H

Posted on May 12 • Originally published at github.com

Building a Cost Tracking CLI for Claude Code Sessions

#claude #devtools #cli #buildinpublic

Like many developers, I fell down the Claude Code rabbit hole over the past few months. A few weeks in, I opened my Anthropic bill and saw $120+ from just a handful of sessions.

The problem? I had zero visibility into what each session actually cost. I'd spin up Claude Code, work for an hour, then realize I'd burned through 50,000 tokens and had no idea which task or feature was responsible.

I wanted to fix that. So I built a CLI tool that automatically tracks Claude Code session costs and budgets them by task, inspired by YNAB's "give every dollar a job" principle. I'm calling it Tokenyst.

Here's what I learned building it.

The Problem: Claude Code is a Black Box

Claude Code is incredible, but it has a cost blindness problem. For developers who care about managing costs (especially indie hackers and bootstrappers), this is frustrating. You want to stay aware of spend the same way you would with any other service.

Traditional solutions don't help:

Anthropic's API dashboard shows aggregate usage, not per-session
Claude Code doesn't break token usage down by task
Third-party tools either don't exist or require cloud accounts

I needed something local, automatic, and honest about what each session cost.

The Solution: Budget Per Task

I borrowed a concept from YNAB: instead of looking at past spending, assign money to tasks before you do the work.

With Tokenyst, you:

Create tasks with budgets: "Landing page redesign: $20", "Fix failing tests: $5"
Run Claude Code
Get automatic tracking: After each session, Tokenyst parses your transcript, calculates cost, and deducts it from your budget
See the impact: Finish under budget? Great. Over? You know exactly how much you exceeded

This shifted my mindset. I wasn't asking "how much did this cost?" after the fact. I was asking "how much can I spend?" before starting.
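
The budgeting flow above can be sketched in a few lines. This is a minimal sketch, not Tokenyst's actual code: the `Task` shape and the `applySpend` helper are hypothetical names chosen for illustration.

```typescript
// Hypothetical task shape: a name, an assigned budget, and spend so far (USD).
interface Task {
  name: string;
  budget: number;
  spent: number;
}

interface SpendResult {
  task: Task;        // Updated copy of the task
  remaining: number; // Budget left, floored at 0
  overBy: number;    // Amount over budget, 0 if still within it
}

// Record a session's cost against a task without mutating the input.
function applySpend(task: Task, sessionCost: number): SpendResult {
  if (sessionCost < 0) throw new Error("sessionCost must be non-negative");
  const spent = task.spent + sessionCost;
  const diff = task.budget - spent;
  return {
    task: { ...task, spent },
    remaining: Math.max(diff, 0),
    overBy: Math.max(-diff, 0),
  };
}

const fixTests: Task = { name: "Fix failing tests", budget: 5, spent: 4 };
const result = applySpend(fixTests, 1.5);
console.log(result.remaining, result.overBy); // 0 0.5
```

Returning a new object instead of mutating the task keeps the "before" and "after" states available, which makes it easy to report exactly how far over budget a session went.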

The Technical Side: How It Works

Here's where it got interesting. To track costs automatically, Tokenyst needs to:

Capture the session: Let Claude Code run normally
Find the transcript: Claude Code writes JSONL transcripts to ~/.claude/projects/
Parse token usage: Extract usage data from the transcript
Calculate cost: Apply current Claude model pricing (including prompt cache multipliers)
Update budget: Record the spend against the active task
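
The "find the transcript" step could look roughly like the sketch below. It assumes transcripts are `.jsonl` files somewhere under the projects directory (possibly in nested folders) and that the most recently modified one belongs to the session that just ran. `listTranscripts`, `newestTranscript`, and `defaultProjectsDir` are hypothetical helper names, not part of Tokenyst or Claude Code.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Directory named in the article; the layout underneath it is an assumption.
const defaultProjectsDir = (): string =>
  path.join(os.homedir(), ".claude", "projects");

// Recursively collect every .jsonl file under `dir`.
function listTranscripts(dir: string): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...listTranscripts(full));
    } else if (entry.isFile() && entry.name.endsWith(".jsonl")) {
      found.push(full);
    }
  }
  return found;
}

// Return the most recently modified transcript, or undefined if none exist.
function newestTranscript(dir: string): string | undefined {
  let newest: { file: string; mtimeMs: number } | undefined;
  for (const file of listTranscripts(dir)) {
    const { mtimeMs } = fs.statSync(file);
    if (!newest || mtimeMs > newest.mtimeMs) newest = { file, mtimeMs };
  }
  return newest?.file;
}
```

A call such as `newestTranscript(defaultProjectsDir())` would then yield the file to parse. When a hook hands over the transcript path directly, this lookup is unnecessary.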

Parsing Claude Code Transcripts

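Claude Code stores transcripts as JSONL (JSON Lines). Each line is a complete event. Here is a simplified view (real entries carry more fields, and the exact shape may vary between versions):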

{"type":"user","content":"Build a React component...","id":"msg-1"}
{"type":"assistant","content":"Here's the component...","id":"msg-2","usage":{"input_tokens":450,"output_tokens":120}}

The challenge: streaming duplicates. While Claude streams a response, the transcript can contain several entries for the same message, all sharing one ID. You need to dedupe by message ID and keep the final token counts.

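// Simplified dedup logic. `transcript` is an array of JSONL lines,
// e.g. fs.readFileSync(path, "utf8").split("\n")
const messageMap = new Map();

for (const line of transcript) {
  if (!line.trim()) continue; // Skip blank lines (e.g. after the trailing newline)
  const parsed = JSON.parse(line);
  if (parsed.id) {
    messageMap.set(parsed.id, parsed); // Last write wins
  }
}

// Sum input and output tokens across the deduplicated messages
const tokens = Array.from(messageMap.values())
  .filter(m => m.usage)
  .reduce((sum, m) => sum + m.usage.input_tokens + m.usage.output_tokens, 0);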

Handling Prompt Cache Costs

Claude supports prompt caching, which changes pricing:

Cache creation tokens cost 1.25× the standard input rate (for the default 5-minute cache)
Cache read tokens cost 0.1× the standard input rate

This is crucial for accuracy. A session that heavily uses cache reads looks expensive in raw token count but is actually cheap.

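// USD per token. Illustrative entry only: check Anthropic's pricing page for current rates.
const PRICING_TABLE: Record<string, { input: number; output: number }> = {
  "claude-sonnet-4-20250514": { input: 3 / 1_000_000, output: 15 / 1_000_000 },
};

const calculateCost = (model: string, usage: {
  inputTokens: number;
  cacheCreationTokens?: number;
  cacheReadTokens?: number;
  outputTokens: number;
}) => {
  const pricing = PRICING_TABLE[model];
  if (!pricing) throw new Error(`No pricing for model: ${model}`);

  // Cache multipliers apply to the input rate
  const inputCost = usage.inputTokens * pricing.input;
  const cacheCreationCost = (usage.cacheCreationTokens || 0) * pricing.input * 1.25;
  const cacheReadCost = (usage.cacheReadTokens || 0) * pricing.input * 0.1;
  const outputCost = usage.outputTokens * pricing.output;

  return inputCost + cacheCreationCost + cacheReadCost + outputCost;
};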
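
To feed a cost function like the one above, per-message usage has to be totaled with the cache fields included. Here is a minimal sketch, assuming each transcript entry exposes a `usage` object with the Anthropic API's field names (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`, `output_tokens`). `sumUsage` and the two interfaces are hypothetical names for illustration.

```typescript
// Usage as reported per message (snake_case, cache fields optional).
interface RawUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Totals in the camelCase shape a cost function can consume.
interface UsageTotals {
  inputTokens: number;
  cacheCreationTokens: number;
  cacheReadTokens: number;
  outputTokens: number;
}

function sumUsage(messages: Array<{ usage?: RawUsage }>): UsageTotals {
  const totals: UsageTotals = {
    inputTokens: 0,
    cacheCreationTokens: 0,
    cacheReadTokens: 0,
    outputTokens: 0,
  };
  for (const { usage } of messages) {
    if (!usage) continue; // Entries without usage (e.g. user messages) are skipped
    totals.inputTokens += usage.input_tokens;
    totals.cacheCreationTokens += usage.cache_creation_input_tokens ?? 0;
    totals.cacheReadTokens += usage.cache_read_input_tokens ?? 0;
    totals.outputTokens += usage.output_tokens;
  }
  return totals;
}
```

As a worked example of why the split matters: at an input rate of $3 per million tokens, 100,000 cache-read tokens cost 100,000 × $0.000003 × 0.1 = $0.03, versus $0.30 if they were billed at the full input rate.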

Hook Integration

I wanted tracking to be automatic. Manually running a separate command after each session would add friction.

Claude Code supports Stop hooks: commands that run each time Claude finishes responding. Tokenyst installs a hook that:

Fires automatically at the end of each turn
Reads the hook payload, including the transcript path, from stdin
Parses the transcript and records the cost
Optionally returns a system message as feedback
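
The stdin step can be sketched as a small parser. This assumes the hook payload is a JSON object carrying `transcript_path` and `session_id` fields. Treat those field names as assumptions to verify against the Claude Code hooks documentation. `parseHookInput` and `HookInput` are hypothetical names.

```typescript
// Fields this sketch expects from the JSON payload on stdin (assumed names).
interface HookInput {
  transcriptPath: string;
  sessionId?: string;
}

// Validate the raw stdin text and pull out the transcript path.
function parseHookInput(raw: string): HookInput {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Hook input is not valid JSON");
  }
  if (typeof data !== "object" || data === null) {
    throw new Error("Hook input must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const transcriptPath = obj["transcript_path"];
  if (typeof transcriptPath !== "string" || transcriptPath.length === 0) {
    throw new Error("Hook input is missing transcript_path");
  }
  const sessionId = obj["session_id"];
  return {
    transcriptPath,
    sessionId: typeof sessionId === "string" ? sessionId : undefined,
  };
}
```

Failing loudly on a malformed payload is a deliberate choice here: silently recording nothing would make the budget look healthier than it is.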

The hook is registered in ~/.claude/settings.json:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "/path/to/tokenyst record-turn" }
        ]
      }
    ]
  }
}

When the hook fires, it captures the cost and prints a summary:

✓ Session recorded
  Tokens: 2,450 input | 340 output
  Cost: $0.042
  Task: Landing page redesign
  Budget remaining: $19.96

No extra steps. No switching windows. Just automatic tracking.

What I Learned Building This

1. JSONL is surprisingly reliable

I expected Claude Code's transcript format to be fragile, but it's been stable. Each line is self-contained JSON, so parsing is straightforward. The dedup logic handles streaming correctly.

2. Local-first data matters

I could've built a web dashboard. But I chose local-first (~/.tokenyst/config.json) because:

No account, no authentication, no privacy concerns
Works offline
Portable (copy your config to another machine)
Zero infrastructure to maintain

Turns out, a lot of developers care about this.
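
For a sense of how little code local-first storage needs, here is a minimal sketch of loading and saving a JSON config. The `Config` shape is hypothetical (the real file's schema may differ), as are the `loadConfig`, `saveConfig`, and `defaultConfigPath` names.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical config shape; the real file's schema may differ.
interface Config {
  activeTask?: string;
  tasks: Array<{ name: string; budget: number; spent: number }>;
}

// Location mentioned in the article.
const defaultConfigPath = (): string =>
  path.join(os.homedir(), ".tokenyst", "config.json");

// Load the config, falling back to an empty one if the file does not exist yet.
function loadConfig(file: string): Config {
  if (!fs.existsSync(file)) return { tasks: [] };
  return JSON.parse(fs.readFileSync(file, "utf8")) as Config;
}

// Write to a temp file first, then rename it into place, so a crash
// mid-write is less likely to leave a half-written config behind.
function saveConfig(file: string, config: Config): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(config, null, 2) + "\n");
  fs.renameSync(tmp, file);
}
```

Because the whole state is one readable JSON file, moving to another machine is a file copy, and a bad entry can be fixed in a text editor.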

3. Task budgeting changes behavior

The emotional arc of "finishing under budget" is real. When I checked my budget after completing a task and had come in under, it felt like a win. This is why YNAB works so well, and the same principle applies here.

4. Niche tools can still be valuable

Tokenyst isn't for everyone. It only matters if you:

Use Claude Code regularly
Care about costs
Prefer CLI tools over web dashboards

But for that specific audience, it solves a real problem.

Current State & Limitations

What works:

Automatic transcript capture and parsing
Cost calculation with cache multiplier support
Task budgeting and spend tracking
Local config file storage
Hook integration for zero-friction tracking

What's limited:

Only supports Claude Code
Pricing table must be manually updated when models change
No sync across devices (local-only for now)
No team features

I built this for myself, and it's been genuinely useful during development. But it's niche, and that's okay.

Try It

If you use Claude Code and want visibility into session costs:

GitHub: jher7/tokenyst

npm install -g tokenyst
tkst task    # Create and manage budgets
tkst claude  # Run Claude Code with tracking
tkst list    # See spend and remaining balance

No account, no cloud, no surprises. Just local cost tracking.

Have you hit the "expensive Claude Code session" problem? I'd love to hear if this approach resonates, or if you'd solve it differently. Issues and feedback welcome on GitHub.

