AI coding agents are best when they feel boring.
You ask for one change, the agent reads the right files, makes a small patch, runs the check, and stops.
The expensive sessions usually feel different. They feel busy. Claude Code, Codex, Cursor, or another agent keeps reading, planning, editing, and retrying. A lot is happening, but the actual diff is not getting cleaner.
That is when I try to use a stop loss.
I do not mean a perfect budget system. I mean a simple rule for when to interrupt the session before it turns into a foggy, expensive debugging spiral.
My basic rule
If token usage is climbing and I cannot explain the next patch in one sentence, I stop the agent.
That sounds obvious, but it catches a surprising amount of waste.
A good session has a short summary:
- fixed the settings crash by guarding the nil value
- added one failing test for the parser edge case
- replaced the slow query in the dashboard view
- cleaned up the retry path for failed uploads
A bad session has a vague summary:
- changed some state stuff
- tried a few approaches around auth
- refactored a bunch of helper code
- maybe fixed the test but also touched unrelated files
The second type of session is not always wrong, but it needs human attention again.
Why live usage matters
Most AI tool usage views are receipts.
They tell you what happened after the session is over.
That is useful for billing, but it does not help much while the agent is still making decisions.
The signal I want is live and boring:
- is this session getting weirdly expensive for the size of the change?
- is the model rereading the same files again?
- did the context balloon after a simple request?
- did we switch from fixing the bug to exploring the whole subsystem?
Those are workflow signals, not accounting signals.
Seeing them early changes what I do next.
The reset pattern
When a session crosses my stop loss, I usually do one of these:
- Ask the agent to summarize what it believes is true before it edits anything else
- Start a new session with only the relevant files and the failing command
- Change the instruction from "fix this" to "inspect and report only"
- Manually make the smallest patch, then ask the agent to write tests
- Split the task into one smaller target
The point is not to use fewer tokens at all costs.
The point is to spend tokens on work that still has shape.
Sometimes a big session is worth it. A migration, a gnarly bug, or a deep refactor can need a lot of context.
But if the token curve is growing faster than the clarity of the task, I treat that as a warning.
Why I put this in the menu bar
I built TokenBar because I wanted this signal while I was working, not after I was annoyed.
It is a small macOS menu bar app for watching AI usage from tools like Claude Code, Codex, Cursor, and other LLM workflows.
The useful part is not just seeing a big number.
The useful part is noticing the moment when a session stops being sharp.
For me, that moment is usually where the save happens.
Not because I avoid the next model call, but because I change the shape of the work before the agent keeps going in the wrong direction.
If you use AI coding tools on a Mac and want a live usage signal, TokenBar is here:
It is free to try. Pro is $15 lifetime.
The bigger lesson is product-agnostic though: every AI coding workflow needs some kind of stop loss.
Not to make the model cheaper.
To keep the work legible.
Top comments (0)