Jangwook Kim

Posted on May 22 • Originally published at jangwook.net

I Tried RTK (Rust Token Killer) — A CLI Proxy That Cuts LLM Token Costs 60–90%

A month into using Claude Code, I opened the bill and flinched. The real culprit wasn't the API calls themselves — it was Bash command output packing the context window. find . -name "*.ts" spitting out hundreds of lines, cargo test dumping thousands — all tokens, all billable.

RTK (Rust Token Killer) directly targets that problem. It sits between your AI coding agent and the shell, compressing command output before it ever reaches the LLM. The claim: 60–90% token savings. I ran actual tests to see if that holds up.

What RTK Actually Does

One sentence: command output filter. It takes the results of git status, find, ls -la and strips everything the LLM doesn't need to understand the state of your project.

Four strategies do the work:

Smart Filtering: removes irrelevant lines (progress bars, timestamps, repeated headers)
Grouping: condenses similar items (28 .ts files instead of 28 individual lines)
Truncation: cuts output above a threshold, marks it ...(truncated)
Deduplication: removes repeated output patterns

Integration with Claude Code happens via a PreToolUse hook. If you understand how Claude Code hooks work, the design clicks immediately — rtk init -g registers it automatically under ~/.claude/hooks/. After that, when Claude Code runs git status, the hook intercepts it and rewrites it as rtk git status. Claude Code never knows.

Supported agents: Claude Code, Cursor, Windsurf, Cline, GitHub Copilot CLI, Gemini CLI, Antigravity, Hermes. Single Rust binary, zero runtime dependencies.

I Installed It and Measured

cargo install --git https://github.com/rtk-ai/rtk

Requires Rust toolchain. On my M3 Mac it took about 1 minute 40 seconds to compile. Verify:

rtk --version
# rtk 0.40.0

Before integrating with Claude Code, I ran manual benchmarks against my blog repository (Astro-based, 258 files).

Test 1: find command

find src/content/blog/ko/ -name "*.md" -type f

Original: 15,360 chars
RTK: 2,070 chars
Savings: 86.5% (99.9% in tokens)

Here's what RTK's find output looks like:

28F 1D:

./ claude-agent-sdk-subagents-orchestration-tutorial-2026.md
claude-api-prompt-caching-cost-optimization-guide.md
claude-code-agentic-workflow-patterns-5-types.md
...

It summarizes the count (28F 1D: 28 files, 1 directory) and compresses path prefixes. The dramatic savings come from the fact that find output is mostly repeated path segments.

Test 2: ls -la (large directory)

ls -la src/content/blog/ko/

Original: 23,848 chars (full permissions, owner, timestamps)
RTK: 12,069 chars
Savings: 49.4%

Drops the permissions string (drwxr-xr-x), owner (jangwook staff), and exact timestamps. Leaves filename and size. Reasonable — that's usually what the LLM actually needs.

Test 3: git status (small output)

Original: 274 chars
RTK: Actually grew larger (expanded tracked file list)
Savings: None (counterproductive)

This is important. Small outputs like git status get inflated by RTK rather than compressed.

Test 4: git log --oneline -20

Savings: 0% (passthrough)

Short, already-structured output passes through untouched.

Full stats (rtk gain):

RTK Token Savings (Global Scope)
════════════════════════════════════════════════════════════

Total commands:    6
Input tokens:      9.1K
Output tokens:     3.8K
Tokens saved:      5.5K (60.6%)
Total exec time:   153ms (avg 25ms)
Efficiency meter: ███████████████░░░░░░░░░ 60.6%

By Command
  rtk ls:    49.4% saved
  rtk find:  99.9% saved (in tokens)
  rtk git:   0% saved

60.6% average across 6 commands. But this test was skewed toward large-output commands (find, ls), so the headline number looks better than a real mixed workload would.

Where It Works and Where It Doesn't

Honestly, the advertised "60–90%" doesn't show up for every command. The effect splits sharply by command type.

Commands where RTK helps a lot:

Command	Why
`find`	Mostly repeated path prefixes → dramatic compression on summary
`ls -la` (large dirs)	Permissions and owner removal adds up
`cargo test`	Strips success cases, shows only failures
`npm test` / `jest`	Test summary compression
`docker ps`	Long container IDs and port info compressed
`grep -r` (large)	Context line removal

Commands where RTK barely helps:

Command	Why
`git status` (small changes)	Already short → nothing to compress
`git log --oneline`	Already condensed format
`cat` (single file)	Content is the point — can't compress
`echo`, `pwd`, simple commands	Passthrough

Project shape matters. TypeScript monorepo with frequent tsc --noEmit runs? RTK probably helps noticeably. Mostly API calls and small file edits? Savings will be minimal.

How to Integrate with Claude Code

Two steps.

Step 1: Install

cargo install --git https://github.com/rtk-ai/rtk

Step 2: Register the Claude Code hook

rtk init -g

-g means global (~/.claude/). Drop -g if you only want per-project application. It asks:

Whether to auto-patch ~/.claude/settings.json (say y)
Whether to create ~/.claude/RTK.md (Claude's awareness file for rtk usage)

Restart Claude Code. After that, all Bash tool calls are automatically rewritten through RTK. Run git status through Claude Code and check rtk gain — if it shows a new record, the integration worked.

One caveat: the hook only fires on Bash tool calls. Claude Code's built-in Read, Grep, and Glob tools bypass the hook entirely. Actual savings depend on how often your agent uses Bash for file exploration vs. native tools.

Another thing: there's a different Rust tool also named rtk — reachingforthejack/rtk (Rust Type Kit). If rtk gain doesn't work after installation, check for a name collision.

Putting RTK in Context

Three layers where you can cut LLM agent costs:

Model selection: switch to cheaper models (Haiku, Flash)
API layer: prompt caching, batch API, MCP schema compression
Shell layer: RTK (command output compression)

RTK is layer 3. If you haven't touched this layer yet, there's likely real savings here. If you've already optimized model choice and caching, RTK's marginal contribution gets smaller.

What I genuinely like: it's transparent. rtk init -g once, done. No changing how you write prompts. No manually prepending rtk to every command. Just use Claude Code the same way you always have. That kind of zero-behavior-change optimization has very low psychological friction.

Honest Limitations

A few things frustrated me after a few days of real use.

First: RTK's compression can strip important information. I had a failed test where the error stack trace was partially truncated. You can access raw output with rtk err <command> or rtk proxy <command>, but Claude Code can't automatically decide when it needs the full output.

Second: Installation requires Rust toolchain. No official binary releases as of v0.40.0. For teams without Rust in their stack, onboarding friction is real.

Third: Don't let the benchmark numbers mislead you. My actual Claude Code costs are dominated by Read, Grep, and code generation — areas RTK doesn't touch. The "60–90%" figure applies to find/ls-heavy tests. In a mixed real-world workload, 10–30% is the more honest estimate.

I wouldn't call RTK overhyped. But "cuts your Claude Code bill in half" is too much to expect. For projects with heavy file traversal or frequent test runs, it's genuinely useful. For everything else, single-digit savings is the likely outcome.

Finding Missed Savings: `rtk discover`

One underrated feature is discover. It analyzes your existing Claude Code session history and back-calculates how much you would have saved if those commands had run through RTK.

rtk discover

In my case, the biggest token wasters in my history were find, npm install (verbose logs), and docker logs. Of those, only find typically ran through Claude Code's Bash tool — the others I'd been running directly in terminal, outside the hook's reach.

You can also check per-session stats:

rtk session

This shows token savings and hook rewrite rates per Claude Code session. Oddly useful for understanding your own Bash-heavy vs. native-tool usage patterns.

RTK for Teams

Solo use is easy — install, rtk init -g, done. Team adoption needs more thought.

You'll want an onboarding script: guiding everyone from Rust install → RTK install → Claude Code hook registration. Put it in your Makefile or scripts/setup.sh:

#!/bin/bash
# scripts/setup-rtk.sh
if ! command -v cargo &> /dev/null; then
    echo "Rust required: https://rustup.rs"
    exit 1
fi

cargo install --git https://github.com/rtk-ai/rtk
rtk init -g --auto-patch
echo "RTK installed. Restart Claude Code."

Document it in CLAUDE.md: Noting RTK setup in the team's CLAUDE.md helps new members understand what's happening when commands get auto-rewritten. RTK's absence on any individual machine doesn't break Claude Code — it just means that person isn't saving tokens.

CI/CD: don't bother: RTK only matters for local dev environments where a developer is working with an AI agent. Don't add it to your CI pipeline.

The honest blocker for team adoption is Rust. "Install Rust and wait 1:40 for a compile" is a real ask for Python or Node.js teams. Official binary releases would fix this — but they don't exist as of v0.40.0.

How RTK Compares to Other Token Optimization Approaches

Where does RTK fit in the cost optimization stack?

Method	Layer	Applies To	Complexity	Expected Savings
Model downgrade (Haiku/Flash)	Pricing	All API calls	Low	5–20x (price diff)
Prompt caching	API	Repeated system prompts	Medium	40–70%
MCP schema compression (mcp2cli)	API	MCP tool injection	Medium	96–99%
RTK	Shell	Bash command output	Low	0–90% (varies)

Model downgrades have the biggest absolute impact but come with quality trade-offs. Prompt caching is powerful for repetitive workflows. MCP schema compression delivers dramatic savings if you're running many MCP tools. RTK has the lowest implementation friction and zero workflow change requirement.

As covered in the real cost of AI agents in production, agent costs accumulate across multiple factors. RTK addresses one slice of that — shell command output — and does it transparently. The right framing is "another layer in the cost stack," not a silver bullet.

Should You Install It?

My take: try it, but set realistic expectations. A 1 minute 40 second cargo install and one rtk init -g command. Check rtk gain after a week — if you're saving 10%+ keep it running, if not, rtk init -g --uninstall and move on.

RTK is a well-built tool. 50+ supported commands, clean agent integration, solid tracking. The expectation adjustment is the key thing: this is a "nice optimization layer," not a "cut your bill in half" move.

MIT license, free, open source. Nothing to lose.

GitHub: rtk-ai/rtk

Official site: rtk-ai.app

DEV Community

I Tried RTK (Rust Token Killer) — A CLI Proxy That Cuts LLM Token Costs 60–90%

What RTK Actually Does

I Installed It and Measured

Where It Works and Where It Doesn't

How to Integrate with Claude Code

Putting RTK in Context

Honest Limitations

Finding Missed Savings: `rtk discover`

RTK for Teams

How RTK Compares to Other Token Optimization Approaches

Should You Install It?

Top comments (0)

What RTK Actually Does

I Installed It and Measured

Where It Works and Where It Doesn't

How to Integrate with Claude Code

Putting RTK in Context

Honest Limitations

Finding Missed Savings: rtk discover

RTK for Teams

How RTK Compares to Other Token Optimization Approaches

Should You Install It?

Finding Missed Savings: `rtk discover`