TL;DR: I built Toki — a Rust daemon that replaces
ccusage for Claude Code / Codex token tracking. ~1,742× faster on warm queries, near-zero overhead. Two optional layers sit on top: toki-monitor (a macOS menu bar app) and toki-sync (a self-hostable sync server). FSL-1.1-Apache-2.0, solo project.
The problem: ccusage was making my laptop crawl
I'm a Korean dev who burns a lot of tokens in Claude Code and Codex. At my workplace we'd been keeping a half-joking ccusage leaderboard of who was burning the most that week. It was fun — until we wired it into a hook to auto-run on every session, and our laptops started visibly slowing down.
Looking at ccusage's design, the cause was clear. Every invocation re-walks ~/.claude and ~/.codex and re-parses every JSONL session file from scratch, single-threaded. The multi-threaded Zig fork (zzusage) that tries to fix this still re-reads every byte and scales memory linearly with input.
OpenTelemetry would be cleaner architecturally, but it needs a central collector, can't see anything from before you set it up, and is overkill for "I just want to see my own tokens."
Toki — the daemon that replaces ccusage
The name "Toki" is short for Token Inspector, and it's also the Korean word for rabbit (토끼) — hence the project mascot.
Toki is a direct, drop-in replacement for ccusage. It's a Rust daemon that indexes your sessions incrementally and serves reports from an embedded LSM store (fjall, used as a TSDB). Docker-style architecture: one daemon ingests, multiple clients read.
```shell
toki daemon start   # ingests via FSEvents/inotify
toki report         # instant TSDB query
toki trace          # live event stream (stdout / UDS / HTTP)
toki query 'sum by (model)(toki_tokens_total[1d])'   # PromQL-ish
```
Toki is the only piece that replaces ccusage. The two other repos in this post (toki-monitor, toki-sync) are optional layers built on top of Toki, not separate competitors to anything.
Benchmarks
On a 2 GB session dataset (M1 MacBook Air, sudo purge between runs):
| Metric | Toki | ccusage | zzusage |
|---|---|---|---|
| Cold-start | 1.54 s | 21.5 s | 1.41 s |
| Report query (warm) | ~12 ms | 21.5 s | — |
| Idle RAM | 5 MB | (no idle state) | (no idle state) |
| Idle CPU | 0% | — | — |
| DB on disk | ~3% of source | — | — |
Translation: the daemon stays out of the way. Your laptop fan won't spin from this; you only feel it when you actually run a query, and queries come back fast enough that you barely notice the round trip.
The report query is constant-time regardless of dataset size — it hits the TSDB index, not the source files. So 1,742× on 2 GB of sessions, and the speedup keeps growing as your history grows.
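To illustrate why the warm query cost tracks the window size rather than the dataset size, here's a minimal std-only sketch. A BTreeMap stands in for the fjall LSM keyspace, and the key layout and metric name are assumptions for illustration, not Toki's actual schema (that's in DESIGN.md):

```rust
use std::collections::BTreeMap;

// Hypothetical key layout: (metric name, day bucket) -> pre-aggregated count.
// A BTreeMap stands in for the fjall LSM keyspace in this sketch.
fn demo_index() -> BTreeMap<(String, u32), u64> {
    let mut index = BTreeMap::new();
    index.insert(("toki_tokens_total".to_string(), 20260101), 1_200);
    index.insert(("toki_tokens_total".to_string(), 20260102), 3_400);
    index.insert(("toki_tokens_total".to_string(), 20260103), 900);
    index
}

// A "report" is a range scan over the requested window: cost tracks the
// number of buckets touched, not the gigabytes of source JSONL behind them.
fn windowed_total(index: &BTreeMap<(String, u32), u64>, from: u32, to: u32) -> u64 {
    index
        .range(("toki_tokens_total".to_string(), from)..=("toki_tokens_total".to_string(), to))
        .map(|(_, v)| v)
        .sum()
}

fn main() {
    let index = demo_index();
    assert_eq!(windowed_total(&index, 20260102, 20260103), 4_300);
    println!("two-day total: {}", windowed_total(&index, 20260102, 20260103));
}
```

The point of the shape: once aggregates are keyed by time bucket, adding another year of session history changes the scan range you ask for, not the cost of answering it.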
How it works
The JSONL pipeline is where most of the engineering went.
1. Cold-start: parallel mmap'd parsing
Cold-start uses mmap + Rayon to walk session files in parallel. Lines without token-usage fields get skipped before serde ever touches them — most JSONL lines (thinking blocks, plain text content) don't carry usage, so this single filter is a meaningful win.
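The pre-serde filter can be sketched like this (std-only, with mmap and Rayon omitted; the substring check is an assumed shape of the filter, not Toki's exact code):

```rust
// A line is only handed to the full decoder if it can possibly carry usage
// data. Most JSONL lines (thinking blocks, plain text content) never mention
// "usage", so this check rejects them before any allocation or parsing.
fn maybe_has_usage(line: &str) -> bool {
    line.contains("\"usage\"")
}

fn main() {
    let session = [
        r#"{"type":"text","content":"...long response..."}"#,
        r#"{"type":"message","usage":{"input_tokens":12,"output_tokens":34}}"#,
        r#"{"type":"thinking","content":"..."}"#,
    ];
    let hits = session.iter().filter(|l| maybe_has_usage(l)).count();
    assert_eq!(hits, 1); // only one line survives to the decoder
}
```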
2. Selective serde on token-bearing lines
For the lines that do contain usage, the serde decoder is configured to skip every string field except the token counts. Prompts, responses, and thinking blocks are never even allocated.
Privacy comes from the parser, not from policy — and it's a real speed win on big files.
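As a dependency-free stand-in for that selective decode, the sketch below pulls token counts out of a usage line without ever materializing prompt or response strings. Toki does this with a serde decoder configured to skip string fields; the field names here are assumptions based on the Claude Code JSONL shape:

```rust
// Extract the integer following a JSON key, touching only the digits.
// Content fields like "content" are never copied or allocated.
fn extract_count(line: &str, key: &str) -> Option<u64> {
    let start = line.find(key)? + key.len();
    let digits: String = line[start..]
        .chars()
        .skip_while(|c| *c == ':' || c.is_whitespace())
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

fn main() {
    let line = r#"{"usage":{"input_tokens":120,"output_tokens":45},"content":"never read"}"#;
    assert_eq!(extract_count(line, "\"input_tokens\""), Some(120));
    assert_eq!(extract_count(line, "\"output_tokens\""), Some(45));
}
```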
3. xxHash3 fingerprint + reverse-from-EOF resume
For incremental resume, the daemon stores an xxHash3 fingerprint of the last line it read. On the next run it scans backward from EOF and matches by hash, so even if a session file got compacted from the front, the right resume point is found with hash compares proportional to the new tail, not the whole file. No re-scanning everything just to figure out where we left off.
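A minimal sketch of that resume logic, with std's DefaultHasher standing in for xxHash3 to keep it dependency-free:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in fingerprint; Toki uses xxHash3, std's DefaultHasher keeps this
// sketch free of external crates.
fn fingerprint(line: &str) -> u64 {
    let mut h = DefaultHasher::new();
    line.hash(&mut h);
    h.finish()
}

// Scan backward from EOF and return the index just after the last line we
// already ingested, identified by its stored fingerprint. This still works
// if the file was compacted from the front, shifting every byte offset.
fn resume_point(lines: &[&str], last_seen: u64) -> usize {
    for (i, line) in lines.iter().enumerate().rev() {
        if fingerprint(line) == last_seen {
            return i + 1; // resume on the next line
        }
    }
    0 // fingerprint not found: re-ingest from the start
}

fn main() {
    let before = ["a", "b", "c", "d"];
    let saved = fingerprint(before[2]); // daemon stopped after reading "c"
    // File gets compacted: "a" dropped from the front, "e" appended.
    let after = ["b", "c", "d", "e"];
    assert_eq!(resume_point(&after, saved), 2); // resume at "d"
}
```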
Architecture details — writer thread topology, trace fan-out, query parser — are documented in DESIGN.md. The short version: one writer thread per provider, separate UDS read path, so reports never block ingestion.
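The writer-per-provider idea can be sketched with a plain mpsc channel per provider (an illustrative shape, not Toki's actual wiring): each provider's watcher feeds its own queue, its dedicated writer drains it, and a stall on one source never blocks the other.

```rust
use std::sync::mpsc;
use std::thread;

// One dedicated writer thread per provider: it drains its own queue and
// (in the real daemon) would append to that provider's TSDB partition.
fn run_provider(events: &[u64]) -> u64 {
    let (tx, rx) = mpsc::channel::<u64>();
    let writer = thread::spawn(move || rx.iter().sum::<u64>());
    for &tokens in events {
        tx.send(tokens).unwrap(); // simulated file-watcher events
    }
    drop(tx); // close the channel so the writer finishes
    writer.join().unwrap()
}

fn main() {
    // Claude and Codex each get an independent ingestion path.
    for provider in ["claude", "codex"] {
        let total = run_provider(&[100, 250, 50]);
        assert_eq!(total, 400);
        println!("{provider}: ingested {total} tokens");
    }
}
```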
On top of Toki: toki-monitor (macOS menu bar GUI)
The CLI isn't for everyone, so I built toki-monitor — a macOS menu bar app that runs on top of the Toki daemon.
The starting idea was something like RunCat but for token usage: a pixel rabbit in your menu bar that runs faster the more tokens you burn. As I built it, I kept finding things that make sense for AI tools but wouldn't for RunCat.
The rabbit sleeps when you stop using AI for a while (zZ animation), because "0 tokens" is a meaningful state for an AI tool while "0 CPU" never really is.
An HP bar above the character shows how much of your Claude or Codex window is left, and depletes green → red as your weekly window burns down.
Optional warning system: a hit animation when your $/min crosses a threshold you set, and a poison effect when usage spikes anomalously vs your 24h baseline.
You can render Claude and Codex as two separate characters in different colors, or merge them into one. Numeric and sparkline modes per provider if the mascot isn't your thing.
There's also a dashboard panel system (still beta) with PromQL-style queries, time-series charts, project breakdowns, and Liquid Glass styling for macOS Tahoe.
The dashboard is functional but rough at the edges, and the menu bar animations aren't as polished as I'd like — solo dev, sprite art is genuinely not my strength. Contributions on the visual side would be amazing — new menu bar characters, animation refinements, dashboard polish, anything visual. The theme system is modular, so adding a new character is mostly drag-and-drop PNGs plus a small theme.json.
On top of Toki: toki-sync (self-hostable sync server, beta)
There's also toki-sync for multi-device users, team analytics, or running a "who's burning the most tokens this week" leaderboard with friends.
It speaks a custom binary TCP protocol (bincode + zstd + ACK flow control + delta-sync on reconnect), runs on AWS free-tier EC2 with fjall, and can use a ClickHouse backend for bigger deployments. Built-in auth, RBAC, and a REST API sit on top, so you can build your own dashboard. Still beta, and the README is rough.
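For a feel of what a binary framing layer like this looks like, here's a std-only length-prefix sketch. The u32 prefix and big-endian choice are illustrative assumptions; toki-sync's actual wire format is bincode + zstd with ACK flow control:

```rust
// Each frame: a 4-byte big-endian length prefix, then the payload bytes.
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut frame = (payload.len() as u32).to_be_bytes().to_vec();
    frame.extend_from_slice(payload);
    frame
}

// Returns (payload, bytes consumed), or None if the buffer is incomplete —
// which is how a TCP reader knows to wait for more data.
fn decode_frame(buf: &[u8]) -> Option<(Vec<u8>, usize)> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if buf.len() < 4 + len {
        return None;
    }
    Some((buf[4..4 + len].to_vec(), 4 + len))
}

fn main() {
    // Two delta-sync messages back to back on the wire.
    let mut wire = encode_frame(b"delta-1");
    wire.extend(encode_frame(b"delta-2"));

    let (first, used) = decode_frame(&wire).unwrap();
    assert_eq!(first, b"delta-1");
    let (second, _) = decode_frame(&wire[used..]).unwrap();
    assert_eq!(second, b"delta-2");
}
```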
Try it
```shell
brew tap korjwl1/tap
brew install toki                 # CLI daemon (the ccusage replacement)
brew install --cask toki-monitor  # optional macOS GUI on top
toki daemon start && toki report
```
Linux + macOS for the daemon (toki-monitor is macOS-only). Claude Code + Codex CLI supported today, Gemini CLI is next.
Status & contributions
Solo project, FSL-1.1-Apache-2.0 — full source, free for personal and commercial use, converts to Apache 2.0 after two years.
Contributions of any kind very welcome:
- Code — issues / PRs across all three repos
- Sprite art — new menu bar characters for toki-monitor (theme system is documented and modular)
- Animation — refinements and polish
- Translations — currently EN / KO, more languages welcome
If you try it and run into anything weird, please file an issue. Happy to answer questions in the comments 🐇
Repos:
- korjwl1/toki — the daemon (Toki proper, replaces ccusage)
- korjwl1/toki-monitor — macOS menu bar GUI, on top of Toki
- korjwl1/toki-sync — sync server, on top of Toki (beta)