I just open-sourced something I built out of frustration, and it's saving me real money every day: https://github.com/borhen68/TokenTamer
If you use AI coding agents (Cursor, Claude Code, Aider…), you've probably noticed how fast your API bills stack up. The problem? Your agent keeps re-sending the same files, full source code included, every single turn. You're paying for tokens you already paid for.
So I built TokenTamer, a drop-in proxy that sits between your agent and the LLM API and quietly cuts token usage by 50–80%. No code changes, and no config beyond one setting: just point your API base URL at it and go.
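To make "point your API base URL at it" concrete, here is a minimal sketch of launching an agent with its base URLs overridden. The proxy address, port, and the `/v1` path are assumptions for illustration (check the TokenTamer README for the real values), and `proxied_env` is a hypothetical helper, not part of the project. `OPENAI_BASE_URL` and `ANTHROPIC_BASE_URL` are the environment variables the official OpenAI and Anthropic SDKs read.

```python
import os

# Assumed address: replace with whatever host/port the proxy actually listens on.
PROXY_URL = "http://localhost:8080"


def proxied_env(proxy_url: str = PROXY_URL) -> dict[str, str]:
    """Return a copy of the current environment with API base URLs pointed at the proxy."""
    env = dict(os.environ)
    # Assumption: the proxy mirrors the upstream path layout (/v1 for OpenAI-style clients).
    env["OPENAI_BASE_URL"] = f"{proxy_url}/v1"
    env["ANTHROPIC_BASE_URL"] = proxy_url
    return env
```

You would then start the agent with something like `subprocess.run(["aider"], env=proxied_env())`, so only that process is routed through the proxy.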
Here's what it does under the hood:
🧠 AST-based compression: strips function bodies from background files and keeps only the signatures. The LLM knows what exists without reading every line.
🔧 Tool-aware compression: skeletonizes stale file reads and keeps the latest one intact.
💾 Prompt cache hijacking: injects Anthropic cache breakpoints so long Claude Code sessions hit the cache instead of paying full price (~73% off on long runs).
💰 Real-time dashboard: watch tokens saved and dollars saved live in your terminal.
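The AST-based compression idea can be sketched in a few lines with Python's standard `ast` module. This is only an illustration of the technique for Python source, not TokenTamer's actual implementation; `skeletonize` is a hypothetical name, and `ast.unparse` needs Python 3.9 or newer.

```python
import ast


def skeletonize(source: str) -> str:
    """Replace every function body with `...`, keeping names and signatures."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Swap the whole body for a single Ellipsis expression.
            node.body = [ast.Expr(value=ast.Constant(value=...))]
    return ast.unparse(tree)


sample = '''
def total_price(items, tax_rate):
    subtotal = sum(item.price for item in items)
    return subtotal * (1 + tax_rate)
'''

print(skeletonize(sample))
```

The printed skeleton keeps `def total_price(items, tax_rate):` but drops the two body lines, which is where the token savings come from on large background files. Note that this simple version also discards docstrings and comments.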
It's MIT licensed, works with the OpenAI and Anthropic APIs, and takes 5 minutes to set up.
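On the Anthropic side, the cache breakpoint injection mentioned above might look roughly like this. It is a sketch under assumptions: `add_cache_breakpoint` is a hypothetical helper, and it only marks the system prompt, whereas a real proxy would likely place breakpoints elsewhere in the conversation too. The `cache_control` field with type `ephemeral` is the marker the Anthropic Messages API uses for prompt caching.

```python
import copy


def add_cache_breakpoint(request: dict) -> dict:
    """Return a copy of a Messages API request with the system prompt marked cacheable."""
    patched = copy.deepcopy(request)
    system = patched.get("system")
    # The API accepts the system prompt as a plain string or a list of text blocks;
    # cache_control can only be attached to a block, so normalize to a list.
    if isinstance(system, str):
        system = [{"type": "text", "text": system}]
    if system:
        # Everything up to and including this block becomes a cacheable prefix.
        system[-1]["cache_control"] = {"type": "ephemeral"}
        patched["system"] = system
    return patched
```

Because the function works on a copy, the proxy can forward the patched request while leaving the agent's original payload untouched.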
Star it, try it, break it, and tell me what you think.