Parsing AI token usage at 2 GiB/s with a ccusage-style TUI (Rust + simd-json)

#ai #rust #opensource #cli

The Problem

I use Claude Code, Codex CLI, and Gemini CLI daily. One day I checked my API bill — it was way higher than expected. But I had no idea where the tokens were going.

Existing tracking tools were too slow. Scanning my 3 GB of session files (9,000+ files across three CLIs) took over 40 seconds. I wanted something instant.

So I built toktrack — a terminal-native token usage tracker that parses everything locally at 2 GiB/s.

The Data

Each AI CLI stores session data differently:

CLI	Location	Format
Claude Code	`~/.claude/projects/*/.jsonl`	JSONL, per-message usage
Codex CLI	`~/.codex/sessions/*/.jsonl`	JSONL, cumulative counters
Gemini CLI	`~/.gemini/tmp//chats/.json`	JSON, includes thinking_tokens

A single line in a Claude Code session file can look like this:

{"timestamp":"2026-01-15T10:00:00Z","message":{"model":"claude-sonnet-4-20250514","usage":{"input_tokens":12000,"output_tokens":3500,"cache_read_input_tokens":8000,"cache_creation_input_tokens":2000}},"costUSD":0.042}

Multiply this by thousands of sessions over months, and you're looking at gigabytes of JSONL to parse.

Why simd-json

Standard serde_json is good. But when you're parsing 3 GB of line-delimited JSON, every microsecond per line adds up.

simd-json is a Rust port of simdjson that uses SIMD instructions (AVX2, SSE4.2, NEON) to parse JSON significantly faster. The key trick: in-place parsing with mutable buffers.

use serde::Deserialize;

#[derive(Deserialize)]
struct ClaudeJsonLine<'a> {
    timestamp: &'a str,              // borrowed, zero-copy
    #[serde(rename = "requestId")]
    request_id: Option<&'a str>,     // borrowed, zero-copy
    #[serde(borrow)]                 // nested struct borrows from the same buffer
    message: Option<ClaudeMessage<'a>>,
    #[serde(rename = "costUSD")]
    cost_usd: Option<f64>,
}

By using &'a str instead of String, we avoid heap allocations for every field. simd-json parses the JSON in-place on a mutable byte buffer, and our structs just borrow slices from that buffer.

The one gotcha: simd-json's from_slice requires &mut [u8], so you need to own a mutable copy of each line:

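use std::fs::File;
use std::io::{BufRead, BufReader};

let reader = BufReader::new(File::open(path)?);
for line in reader.lines() {
    let line = line?;
    let mut bytes = line.into_bytes();  // owned, mutable buffer for in-place parsing
    if let Ok(parsed) = simd_json::from_slice::<ClaudeJsonLine>(&mut bytes) {
        // Copy out what we need here: `parsed` borrows from `bytes`,
        // which is dropped at the end of this iteration.
    }
    // Lines that fail to parse are skipped silently.
}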

This gave a 17-25% throughput improvement over standard serde_json on my dataset.

Adding Parallelism with rayon

A single-threaded parser hit ~1 GiB/s. But with 9,000+ files, we can parallelize at the file level trivially using rayon:

use rayon::prelude::*;

let entries: Vec<UsageEntry> = files
    .par_iter()
    .flat_map(|f| parser.parse_file(f).unwrap_or_default())
    .collect();

That's it. rayon's par_iter() distributes files across threads automatically. Combined with simd-json, this pushed throughput to ~2 GiB/s: about 2x the single-threaded simd-json rate and roughly 2.5x the serde_json baseline.

Stage	Throughput
serde_json (baseline)	~800 MiB/s
simd-json (zero-copy)	~1.0 GiB/s
simd-json + rayon	~2.0 GiB/s
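
To make the file-level split concrete, here is a minimal sketch of the same idea using only the standard library's scoped threads instead of rayon. It is not what toktrack does: `parse_one` is a hypothetical stand-in that just counts non-empty lines, and the static chunking is a simplification (rayon balances work dynamically with work stealing).

```rust
use std::thread;

/// Hypothetical stand-in for a per-file parser: counts non-empty lines.
fn parse_one(contents: &str) -> usize {
    contents.lines().filter(|l| !l.trim().is_empty()).count()
}

/// Split `files` into roughly equal chunks, one per worker, and sum the results.
fn parse_all(files: &[String], workers: usize) -> usize {
    let workers = workers.max(1);
    // Ceiling division, with a floor of 1 so `chunks` never receives 0.
    let chunk_size = ((files.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn every worker first; joining inside the same iterator chain
        // would run the chunks one after another.
        let handles: Vec<_> = files
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|f| parse_one(f)).sum::<usize>())
            })
            .collect();

        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<usize>()
    })
}
```

With thousands of files of uneven size, fixed chunks can leave some threads idle while others are still busy, which is one reason to reach for rayon rather than hand-rolled threads.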

The Hard Part: Each CLI is Different

The real complexity wasn't parsing speed — it was handling three completely different data formats behind a single trait:

pub trait CLIParser: Send + Sync {
    fn name(&self) -> &str;
    fn data_dir(&self) -> PathBuf;
    fn file_pattern(&self) -> &str;
    fn parse_file(&self, path: &Path) -> Result<Vec<UsageEntry>>;
}

Claude Code is straightforward — each JSONL line with a message.usage field is one API call.

Codex CLI was tricky. Token counts are cumulative — each token_count event reports the running total, not a delta. And the model name is in a separate turn_context line. So parsing is stateful:

line 1: session_meta  → extract session_id
line 2: turn_context  → extract model name
line 3: event_msg     → token_count (cumulative total)
line 4: event_msg     → token_count (larger cumulative total)

Because the counters are cumulative, summing them would overcount. You need to keep only the last token_count per session.
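
The stateful pass can be sketched roughly as follows. The `CodexLine` enum and its field names are assumptions made for illustration, standing in for lines that have already been decoded from JSON; the real Codex schema has more fields.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical stand-ins for the Codex line types described above.
#[derive(Debug, Clone, PartialEq)]
enum CodexLine {
    SessionMeta { session_id: String },
    TurnContext { model: String },
    TokenCount { total_tokens: u64 },
}

#[derive(Debug, Clone, PartialEq)]
struct SessionUsage {
    model: Option<String>,
    total_tokens: u64,
}

/// Walk lines in file order, carrying the current session and model as state.
/// Counters are cumulative, so a later token_count overwrites the earlier one.
fn fold_codex_lines(lines: &[CodexLine]) -> HashMap<String, SessionUsage> {
    let mut sessions: HashMap<String, SessionUsage> = HashMap::new();
    let mut current_session: Option<String> = None;
    let mut current_model: Option<String> = None;

    for line in lines {
        match line {
            CodexLine::SessionMeta { session_id } => {
                current_session = Some(session_id.clone());
                current_model = None; // a new session starts without a known model
            }
            CodexLine::TurnContext { model } => {
                current_model = Some(model.clone());
            }
            CodexLine::TokenCount { total_tokens } => {
                // Counts that appear before any session_meta line are skipped.
                if let Some(id) = &current_session {
                    sessions.insert(
                        id.clone(),
                        SessionUsage {
                            model: current_model.clone(),
                            total_tokens: *total_tokens,
                        },
                    );
                }
            }
        }
    }
    sessions
}
```

Feeding it the four-line sequence above yields a single entry whose total is the value from line 4, not the sum of lines 3 and 4.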

Gemini CLI uses regular JSON files (not JSONL) and includes a thinking_tokens field that the other two CLIs don't report.

TUI with ratatui

For the dashboard, I used ratatui to build 4 views:

Overview — Total tokens/cost with a GitHub-style 52-week heatmap
Models — Per-model breakdown with percentage bars
Daily — Scrollable table with sparkline charts
Stats — Key metrics in a card grid

The heatmap uses 2x2 Unicode block characters to fit 52 weeks of data in a compact space, with percentile-based color intensity.
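
One way to get percentile-based intensity is to bucket each day by quartiles of the active (non-zero) days. This is a sketch under assumptions: the five levels, the quartile cut-offs, and the nearest-rank percentile are illustrative choices, not necessarily what toktrack uses.

```rust
/// Nearest-rank percentile over an already sorted, non-empty slice.
fn percentile(sorted: &[u64], p: f64) -> u64 {
    let rank = (p / 100.0 * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Map each day's token count to an intensity level 0..=4.
/// 0 means no usage; 1..=4 are quartile buckets of the non-zero days.
fn intensity_levels(daily_tokens: &[u64]) -> Vec<u8> {
    let mut active: Vec<u64> = daily_tokens.iter().copied().filter(|&t| t > 0).collect();
    if active.is_empty() {
        return vec![0; daily_tokens.len()];
    }
    active.sort_unstable();

    let p25 = percentile(&active, 25.0);
    let p50 = percentile(&active, 50.0);
    let p75 = percentile(&active, 75.0);

    daily_tokens
        .iter()
        .map(|&t| match t {
            0 => 0,
            t if t <= p25 => 1,
            t if t <= p50 => 2,
            t if t <= p75 => 3,
            _ => 4,
        })
        .collect()
}
```

The point of using percentiles rather than a linear scale is that one unusually heavy day does not flatten every other day into the lowest color.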

Results

On my machine (Apple Silicon, 9,000+ files, 3.4 GB total):

Scenario	Time
Cold start (no cache)	~1.2s
Warm start (cached)	~0.05s

The caching layer stores daily summaries in ~/.toktrack/cache/. Past dates are immutable — only today is recomputed. This means even when Claude Code deletes session files after 30 days, your cost history survives.
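
The "past dates are immutable" rule can be sketched as a merge between cached and freshly parsed summaries. The `DailySummary` type, the `merge_cache` function, and the ISO date keys are assumptions for illustration; the actual on-disk cache format is not shown here.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
struct DailySummary {
    total_tokens: u64,
    cost_usd: f64,
}

/// Merge freshly parsed summaries into the cached ones.
/// Keys are ISO dates ("YYYY-MM-DD"), so string order is chronological order.
fn merge_cache(
    cached: &BTreeMap<String, DailySummary>,
    fresh: &BTreeMap<String, DailySummary>,
    today: &str,
) -> BTreeMap<String, DailySummary> {
    let mut merged = cached.clone();
    for (date, summary) in fresh {
        let is_past = date.as_str() < today;
        if is_past && merged.contains_key(date) {
            // Past day already cached: keep it, even if the session files
            // behind it have since been deleted and the fresh value is partial.
            continue;
        }
        // Today is always recomputed; uncached past days are filled in once.
        merged.insert(date.clone(), *summary);
    }
    merged
}
```

Days that exist only in the cache pass through untouched, which is what lets the history outlive the session files.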