AI coding tools have become very useful, but I keep running into one problem: expensive frontier models are often used for everything, including small, file-scoped implementation patches.
That feels wasteful.
For many coding tasks, I want the strong model to stay in charge of planning and judgment, but I do not necessarily need it to write every narrow diff.
So I built TokenPatch.
GitHub: https://github.com/Leoyen1/tokenpatch
Website: https://tokenpatch.com
What it does
TokenPatch lets you keep using your current AI coding tool, such as Codex, Claude Code, Cursor, or MCP-capable coding agents.
The strong model still decides what should change.
TokenPatch then routes bounded implementation work to a cheaper executor, checks the patch locally, and reports what the useful change actually cost.
The core metric is:
cost per applied patch
That is the cost of a change that was actually applied, not just the cost of individual API requests.
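One way to read this metric, as a minimal sketch: add up everything spent on a task, including attempts that were rejected, and divide by the number of patches that were actually applied. The `Attempt` record and `cost_per_applied_patch` function are hypothetical names used for illustration, and the figures are made up; how TokenPatch computes the number internally may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Attempt:
    """Hypothetical record of one patch attempt (not a TokenPatch type)."""
    cost_usd: float  # model spend attributed to this attempt
    applied: bool    # True only if the patch was accepted and applied


def cost_per_applied_patch(attempts: Iterable[Attempt]) -> Optional[float]:
    """Total spend across all attempts divided by the count of applied patches."""
    attempts = list(attempts)
    total = sum(a.cost_usd for a in attempts)
    applied = sum(1 for a in attempts if a.applied)
    if applied == 0:
        return None  # nothing useful was produced, so the metric is undefined
    return total / applied


# Illustrative figures: one rejected attempt, then one applied patch.
history = [Attempt(cost_usd=0.03, applied=False), Attempt(cost_usd=0.05, applied=True)]
print(cost_per_applied_patch(history))
```

Counting the rejected attempt in the numerator is the point of the metric: a cheap request that never lands still costs money.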
Example
A task might look like this:
tp: change the page title. Only modify index.html.
A report can show:
Task: change page title, only modify index.html
All-strong estimate: $0.42
TokenPatch actual: $0.08
Saved: 81%
Patch applied: yes
Tests: passed
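The "Saved" figure in that report is consistent with a simple relative difference between the all-strong estimate and the actual spend. A small sketch of the arithmetic, where `savings_percent` is a hypothetical helper name and the inputs are the two dollar amounts from the report above:

```python
def savings_percent(all_strong_estimate: float, actual: float) -> float:
    """Relative saving versus the all-strong estimate, as a percentage."""
    if all_strong_estimate <= 0:
        raise ValueError("all_strong_estimate must be positive")
    return (all_strong_estimate - actual) / all_strong_estimate * 100


# (0.42 - 0.08) / 0.42 is about 0.8095, which rounds to 81%.
print(f"Saved: {savings_percent(0.42, 0.08):.0f}%")  # prints "Saved: 81%"
```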
Why I built it
Most LLM cost tools focus on API requests.
But when coding with agents, I care more about task-level economics:
- Did the patch actually apply?
- Did it stay inside allowed files?
- Did it pass validation?
- How much did the accepted change cost?
- Would this have been more expensive if everything used the strong model?
That is the layer I wanted to explore.
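To make the "did it stay inside allowed files?" question concrete, here is a minimal sketch of a local boundary check over a unified diff. It is an illustration under stated assumptions, not TokenPatch's actual implementation: `touched_files` and `out_of_bounds` are hypothetical names, the diff is assumed to use git-style `a/` and `b/` path prefixes, and paths containing spaces are not handled.

```python
import re

# Matches unified-diff file headers such as "--- a/index.html" or "+++ b/index.html".
_FILE_HEADER = re.compile(r"^(?:---|\+\+\+) (?:[ab]/)?(\S+)")


def touched_files(diff_text: str) -> set:
    """Collect every file path named in the diff's ---/+++ headers."""
    files = set()
    for line in diff_text.splitlines():
        match = _FILE_HEADER.match(line)
        if match and match.group(1) != "/dev/null":  # /dev/null marks add/delete
            files.add(match.group(1))
    return files


def out_of_bounds(diff_text: str, allowed: set) -> list:
    """Return the touched files that are not in the allowed set, sorted."""
    return sorted(touched_files(diff_text) - set(allowed))


sample_diff = """\
--- a/index.html
+++ b/index.html
@@ -1 +1 @@
-<title>Old</title>
+<title>New</title>
--- a/style.css
+++ b/style.css
@@ -1 +1 @@
-body { margin: 0 }
+body { margin: 1px }
"""

violations = out_of_bounds(sample_diff, allowed={"index.html"})
print(violations)  # prints ['style.css'], so this patch would be rejected
```

Checking the diff before applying it means a patch that strays outside the task's scope can be rejected without touching the working tree.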
Current status
TokenPatch is open source and BYOK-first (bring your own key).
You supply your own executor API key (currently DeepSeek-compatible), and TokenPatch runs locally.
Install from GitHub:
pip install git+https://github.com/Leoyen1/tokenpatch.git
Then run:
tokenpatch bootstrap
Then use it from your coding app:
tp: implement a small change. Only modify <file>.
What I am looking for
This is still early.
I am looking for feedback from developers who use AI coding tools regularly:
- Is “cost per applied patch” a useful metric?
- Is the setup too hard?
- Would you trust a cheaper executor if file boundaries are enforced?
- What coding-agent workflows should this support next?
If you try it, I would really appreciate feedback or issues on GitHub.
I am especially interested in feedback from Codex, Claude Code, and Cursor users.