Bravin kulei


Someone Saved 130 Million Tokens Using This Free Tool. Here's What It Does

Real data. One developer. One open-source Rust binary. The numbers don't lie.


Let me show you something.

A developer ran rtk gain after using a tool called RTK for a while. Here's what their dashboard showed:

Total commands:     15,720
Input tokens:       146.3M
Output tokens:      16.3M
Tokens saved:       130.0M  (88.9%)
Total exec time:    788 minutes
Efficiency meter:   ████████████░░  88.9%

130 million tokens saved. 88.9% efficiency.

Not a benchmark. Not a simulation. An actual developer's real session history.

That's the receipt. Now let me explain what's happening — and why you should care.


The Leak Nobody Warned You About

You set up Claude Code, Cursor, or Copilot. You start building. You feel productive. The AI is running commands, reading files, making changes. It feels like the future.

What you don't see is the token leak happening on every single shell command.

Your AI runs git push. Git responds with:

Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.23 KiB | 1.23 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:yourname/yourproject.git
   abc1234..def5678  main -> main

The AI reads every word of that. Burns ~200 tokens on it. Just to learn one thing: the push worked.

All the AI actually needed was two words: ok main

This is happening dozens of times per session. Every ls, every cat, every grep, every test run — raw command output flooding into the LLM's context window, most of it noise.
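To make the leak concrete, here's a toy sketch of the idea (not RTK's actual implementation, which is a Rust binary): recognize a successful `git push` transcript and collapse it to its one-line signal.

```python
import re

# The verbose transcript from the example above.
RAW_PUSH_OUTPUT = """\
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.23 KiB | 1.23 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:yourname/yourproject.git
   abc1234..def5678  main -> main
"""

def summarize_push(output: str) -> str:
    """Collapse a successful `git push` transcript to `ok <branch>`."""
    # A pushed-ref line looks like: "   abc1234..def5678  main -> main"
    match = re.search(r"^\s+\S+\.\.\S+\s+\S+ -> (\S+)$", output, re.MULTILINE)
    if match:
        return f"ok {match.group(1)}"
    # If we can't recognize success, pass the raw output through untouched,
    # so the agent never loses information it might actually need.
    return output

print(summarize_push(RAW_PUSH_OUTPUT))  # → ok main
```

Ten lines of transcript in, two words out. That's the whole economic argument in miniature.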


The Real Numbers from That Dashboard

Look at the breakdown from that rtk gain screenshot more carefully:

| Command | Runs | Tokens Saved | Compression |
|---|---|---|---|
| rtk gh pr diff | 72 | 18.1M | 72% |
| rtk curl | 1 | 16.3M | 100% |
| rtk read | 1,164 | 12.2M | 22.7% |
| rtk curl (another) | | 10.5M | 100% |
| rtk lint eslint | 16 | 9.1M | 99.7% |
| rtk grep | 1,093 | 8.0M | 39.1% |
| rtk eslint | 51 | 5.7M | 94.6% |
| rtk git diff | 88 | 4.1M | 66.9% |

rtk read was called 1,164 times. rtk grep, 1,093 times. These aren't exotic commands. These are the bread-and-butter operations of any coding session, running constantly in the background while your AI agent works.

And every single one of those calls was leaking tokens — until RTK started intercepting them.


What RTK Actually Does

RTK (Rust Token Killer) is a CLI proxy that sits between your AI agent and the shell. It doesn't change what commands run — it changes what output the AI sees.

The hook is transparent. After a one-time setup, RTK silently rewrites commands before they execute:

  • git status → rtk git status → compressed output
  • cargo test → rtk test cargo test → failures only
  • grep "pattern" → rtk grep "pattern" → grouped results
  • cat file.rs → rtk read file.rs → smart extraction

The AI never sees the rewrite. It just gets back clean signal instead of raw noise.
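Conceptually, the rewrite layer is just a longest-prefix lookup table. Here's a minimal sketch of that idea; the command pairs mirror the examples above, but the matching logic is my illustration, not RTK's actual code:

```python
# Hypothetical rewrite rules, modeled on the examples in this post.
REWRITES = {
    "git status": "rtk git status",
    "cargo test": "rtk test cargo test",
    "grep": "rtk grep",
    "cat": "rtk read",
}

def rewrite(command: str) -> str:
    """Return the proxied form of a shell command, longest prefix first."""
    for prefix in sorted(REWRITES, key=len, reverse=True):
        if command == prefix or command.startswith(prefix + " "):
            return REWRITES[prefix] + command[len(prefix):]
    return command  # commands without a rule run unchanged

print(rewrite('grep "pattern" src/'))  # → rtk grep "pattern" src/
print(rewrite("cat file.rs"))          # → rtk read file.rs
```

The pass-through default matters: anything without a rule runs exactly as the agent wrote it, which is why the hook can be transparent.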

Four things happen to every output:

  1. Smart filtering — removes comments, whitespace, boilerplate
  2. Grouping — aggregates similar items (errors by type, files by directory)
  3. Truncation — keeps relevant context, cuts repetition
  4. Deduplication — collapses repeated log lines into counts

The result: git push goes from 15 lines to ok main. cargo test with 2 failures goes from 200+ lines to 20. ls becomes a clean directory tree instead of a wall of permissions and timestamps.
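Step 4 (deduplication) is the easiest of the four to picture. Here's a toy version that collapses runs of identical log lines into counts; this is my reading of the technique, not RTK's implementation:

```python
from itertools import groupby

def dedup_lines(text: str) -> str:
    """Collapse consecutive identical lines into `line (xN)` annotations."""
    out = []
    for line, run in groupby(text.splitlines()):
        n = len(list(run))
        out.append(line if n == 1 else f"{line} (x{n})")
    return "\n".join(out)

noisy = "\n".join(["connecting..."] * 40 + ["connected", "warn: slow link", "warn: slow link"])
print(dedup_lines(noisy))
# connecting... (x40)
# connected
# warn: slow link (x2)
```

Forty-three lines become three, and no information is lost: the model still knows the retry happened and how many times.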


This Isn't Just About Cost

The token savings are real and they matter — especially if you're paying API costs in dollars while building in Kenya or anywhere else in Africa where every dollar of runway counts.

But there's something more interesting going on here.

Context quality affects reasoning quality.

Your AI agent isn't just storing what it reads — it's reasoning over it. A context window full of Git compression stats, verbose test output, and redundant file metadata is a noisier substrate for reasoning than a context window full of clean, relevant signal.

RTK doesn't just save money. It gives your AI agent a cleaner working memory, which means fewer hallucinations, better focus, and more accurate responses over long sessions.

The 88.9% efficiency in that dashboard? That's not just 88.9% fewer tokens spent. It's 88.9% less noise in the model's context on every one of those 15,720 commands.


Why It's Built in Rust (And Why That Matters)

This is a tool that intercepts every single shell command your AI agent runs. Performance is not optional.

RTK is a single Rust binary with zero runtime dependencies. No Python startup overhead. No npm install. No Docker. The proxy adds less than 10ms per command — genuinely imperceptible.

It's available for Linux (x86 and ARM64), macOS (Intel and Apple Silicon), and Windows. It's on Homebrew. It's at v0.35.0 with 96 releases, 19.5k GitHub stars, and 1.1k forks.

This is not an experiment. It's production-grade infrastructure that happens to be free and open source.


For the IDXhub Community Specifically

A lot of the RTK conversation online is focused on web dev stacks — TypeScript, Rust, Go. But the savings are entirely tool-agnostic.

If you're building IoT firmware, robotics systems, or embedded projects with AI assistance, you're running git, grep, cat, make, docker, kubectl constantly. Every one of those is a token leak RTK can close.

And given that we're operating in an environment where AI API costs hit harder relative to local incomes, cutting token spend by 88.9% isn't a luxury. It's just smart engineering.

The tool is free. The repo is public. The install takes 60 seconds.


Install It Right Now

Linux / macOS:

curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh

Add to PATH:

echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc

Hook into Claude Code:

rtk init -g

Restart Claude Code. Done. Check your savings anytime:

rtk gain

For Cursor: rtk init -g --agent cursor

For Gemini CLI: rtk init -g --gemini

For Cline/Roo: rtk init --agent cline


The Question Worth Asking

Someone ran 15,720 commands through their AI coding agent. Without RTK, that would have been 146.3M tokens consumed. With RTK, it was 16.3M.

The difference is 130 million tokens. Real money. Real context saved. Real sessions that ran cleaner and longer.

The tool that did that is free, open source, and installs in one command.

What are you waiting for?

github.com/rtk-ai/rtk


IDXhub is Kenya's hardware and tech developer community — 312+ builders working across IoT, robotics, and embedded systems. Follow us @IDXHUB and find us at idxhub.space.


Share this if you use Claude Code, Cursor, or any AI coding agent. This one's worth passing on.
