Real data. One developer. One open-source Rust binary. The numbers don't lie.
Let me show you something.
A developer ran rtk gain after using a tool called RTK for a while. Here's what their dashboard showed:
Total commands: 15,720
Input tokens: 146.3M
Output tokens: 16.3M
Tokens saved: 130.0M (88.9%)
Total exec time: 788 minutes
Efficiency meter: ████████████░░ 88.9%
130 million tokens saved. 88.9% efficiency.
Not a benchmark. Not a simulation. An actual developer's real session history.
That's the receipt. Now let me explain what's happening — and why you should care.
The Leak Nobody Warned You About
You set up Claude Code, Cursor, or Copilot. You start building. You feel productive. The AI is running commands, reading files, making changes. It feels like the future.
What you don't see is the token leak happening on every single shell command.
Your AI runs git push. Git responds with:
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.23 KiB | 1.23 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:yourname/yourproject.git
abc1234..def5678 main -> main
The AI reads every word of that. Burns ~200 tokens on it. Just to learn one thing: the push worked.
All the AI actually needed to see was two words: ok main
This is happening dozens of times per session. Every ls, every cat, every grep, every test run — raw command output flooding into the LLM's context window, most of it noise.
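To see how little signal that push transcript carries, here is a crude one-liner (an illustration, not RTK's actual parser) that reduces the whole wall of output above to the same two-word summary:

```shell
# Crude sketch, not RTK's parser: keep only the ref-update line
# from git push's output and shrink it to "ok <branch>".
printf '%s\n' \
  'Enumerating objects: 5, done.' \
  'To github.com:yourname/yourproject.git' \
  '   abc1234..def5678  main -> main' \
  | awk '/->/ { print "ok", $NF }'
```

Everything else in the transcript is bookkeeping the model never needed.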
The Real Numbers from That Dashboard
Look at the breakdown from that rtk gain screenshot more carefully:
| Command | Runs | Tokens Saved | Compression |
|---|---|---|---|
| rtk gh pr diff | 72 | 18.1M | 72% |
| rtk curl | 1 | 16.3M | 100% |
| rtk read | 1,164 | 12.2M | 22.7% |
| rtk curl (another) | — | 10.5M | 100% |
| rtk lint eslint | 16 | 9.1M | 99.7% |
| rtk grep | 1,093 | 8.0M | 39.1% |
| rtk eslint | 51 | 5.7M | 94.6% |
| rtk git diff | 88 | 4.1M | 66.9% |
rtk read was called 1,164 times. rtk grep — 1,093 times. These aren't exotic commands. These are the bread-and-butter operations of any coding session, running constantly in the background while your AI agent works.
And every single one of those calls was leaking tokens — until RTK started intercepting them.
What RTK Actually Does
RTK (Rust Token Killer) is a CLI proxy that sits between your AI agent and the shell. It doesn't change what commands run — it changes what output the AI sees.
The hook is transparent. After a one-time setup, RTK silently rewrites commands before they execute:
- git status → rtk git status → compressed output
- cargo test → rtk test cargo test → failures only
- grep "pattern" → rtk grep "pattern" → grouped results
- cat file.rs → rtk read file.rs → smart extraction
The AI never sees the rewrite. It just gets back clean signal instead of raw noise.
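The rewrite step can be sketched as a tiny shell function. This is only an illustration of the idea — RTK's real hook integrates with the agent's tooling, and the command list here is hypothetical — so the function just echoes the rewritten command line instead of executing it:

```shell
# Hypothetical sketch of the rewrite idea: prefix known-noisy commands
# with rtk, pass everything else through untouched.
rewrite() {
  case "$1" in
    git|grep|cat|ls) echo "rtk $*" ;;   # noisy command: route through rtk
    *)               echo "$*"    ;;   # anything else: unchanged
  esac
}

rewrite git status    # -> rtk git status
rewrite make build    # -> make build
```

The point is that the agent keeps issuing the commands it already knows; only the execution path changes.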
Four things happen to every output:
- Smart filtering — removes comments, whitespace, boilerplate
- Grouping — aggregates similar items (errors by type, files by directory)
- Truncation — keeps relevant context, cuts repetition
- Deduplication — collapses repeated log lines into counts
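The deduplication step behaves much like classic coreutils. A rough conceptual equivalent (RTK's real pass is presumably more structure-aware than this):

```shell
# Collapse repeated log lines into counts -- the deduplication idea,
# sketched with sort | uniq -c rather than RTK's implementation.
printf 'WARN retry\nWARN retry\nWARN retry\nERROR timeout\n' | sort | uniq -c
```

Three identical warnings become one line with a count of 3, which is all the model needs to reason about the failure.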
The result: git push goes from 15 lines to ok main. cargo test with 2 failures goes from 200+ lines to 20. ls becomes a clean directory tree instead of a wall of permissions and timestamps.
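The failures-only test view can likewise be approximated with a filter. A deliberately naive sketch (RTK's real cargo-test handling also keeps failure details and the summary line):

```shell
# Failures-only sketch: drop passing tests, keep FAILED lines
# (an illustrative grep, not RTK's test-output parser).
printf 'test parse_ok ... ok\ntest render_html ... FAILED\ntest sum_small ... ok\n' \
  | grep 'FAILED'
```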
This Isn't Just About Cost
The token savings are real and they matter — especially if you're paying API costs in dollars while building in Kenya or anywhere else in Africa where every dollar of runway counts.
But there's something more interesting going on here.
Context quality affects reasoning quality.
Your AI agent isn't just storing what it reads — it's reasoning over it. A context window full of Git compression stats, verbose test output, and redundant file metadata is a noisier substrate for reasoning than a context window full of clean, relevant signal.
RTK doesn't just save money. It gives your AI agent a cleaner working memory, which means fewer hallucinations, better focus, and more accurate responses over long sessions.
The 88.9% efficiency in that dashboard? That's not just 88.9% fewer tokens spent. It's 88.9% less noise in the model's context on every one of those 15,720 commands.
Why It's Built in Rust (And Why That Matters)
This is a tool that intercepts every single shell command your AI agent runs. Performance is not optional.
RTK is a single Rust binary with zero runtime dependencies. No Python startup overhead. No npm install. No Docker. The proxy adds less than 10ms per command — genuinely imperceptible.
It's available for Linux (x86 and ARM64), macOS (Intel and Apple Silicon), and Windows. It's on Homebrew. It's at v0.35.0 with 96 releases, 19.5k GitHub stars, and 1.1k forks.
This is not an experiment. It's production-grade infrastructure that happens to be free and open source.
For the IDXhub Community Specifically
A lot of the RTK conversation online is focused on mainstream software stacks — TypeScript, Rust, Go. But the savings are entirely stack-agnostic.
If you're building IoT firmware, robotics systems, or embedded projects with AI assistance, you're running git, grep, cat, make, docker, kubectl constantly. Every one of those is a token leak RTK can close.
And given that we're operating in an environment where AI API costs hit harder relative to local incomes, cutting token spend by 88.9% — which stretches every dollar of token budget roughly ninefold — isn't a luxury. It's just smart engineering.
The tool is free. The repo is public. The install takes 60 seconds.
Install It Right Now
Linux / macOS:
curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh
Add to PATH:
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
Hook into Claude Code:
rtk init -g
Restart Claude Code. Done. Check your savings anytime:
rtk gain
For Cursor: rtk init -g --agent cursor
For Gemini CLI: rtk init -g --gemini
For Cline/Roo: rtk init --agent cline
The Question Worth Asking
Someone ran 15,720 commands through their AI coding agent. Without RTK, that would have been 146.3M tokens consumed. With RTK, it was 16.3M.
The difference is 130 million tokens. Real money. Real context saved. Real sessions that ran cleaner and longer.
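The arithmetic is easy to check against the dashboard figures themselves:

```shell
# Verify the dashboard's numbers: saved = 146.3M - 16.3M,
# efficiency = saved / 146.3M.
awk 'BEGIN {
  input = 146.3; output = 16.3
  saved = input - output
  printf "saved=%.1fM efficiency=%.1f%%\n", saved, saved / input * 100
}'
```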
The tool that did that is free, open source, and installs in one command.
What are you waiting for?
IDXhub is Kenya's hardware and tech developer community — 312+ builders working across IoT, robotics, and embedded systems. Follow us @IDXHUB and find us at idxhub.space.
Share this if you use Claude Code, Cursor, or any AI coding agent. This one's worth passing on.