Before I blame the model, I check the token trail
One thing I have learned from building with LLMs every day is that a bad session usually does not start with a bad answer.
It starts with drift.
A little too much old context.
A little too much tool output carried forward.
A little too much reluctance to restart a chat that is already getting muddy.
When that happens, I usually feel the workflow get worse before I notice the cost.
The model feels slower.
The answers get less sharp.
I start rewriting prompts that were not really the problem.
For a while I treated this as a prompting problem.
Then I realized I was missing a live signal.
Most token dashboards are useful after the fact. They tell you what happened once the session is over.
That is helpful for reporting, but not for changing behavior.
I wanted something visible while I was actually working.
So I built TokenBar, a macOS menu bar app that shows live token usage during LLM sessions.
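I won't walk through TokenBar's internals here, but the core idea is easy to sketch. Below is a minimal AppKit status-item counter: the `tokensUsed` variable and the timer that bumps it are hypothetical stand-ins for a real usage feed (for example, the usage metadata your LLM client reports per response). This is a sketch of the pattern, not TokenBar's actual code.

```swift
import AppKit

// Minimal sketch of a live menu bar token counter.
// `tokensUsed` and the timer-driven updates are stand-ins for a real
// usage feed; this is not TokenBar's actual implementation.
final class TokenCounterApp: NSObject, NSApplicationDelegate {
    private var statusItem: NSStatusItem?
    private var tokensUsed = 0

    func applicationDidFinishLaunching(_ notification: Notification) {
        // Claim a slot in the system menu bar and show an initial count.
        let item = NSStatusBar.system.statusItem(withLength: NSStatusItem.variableLength)
        item.button?.title = "0 tok"
        statusItem = item

        // Hypothetical feed: bump the count once a second. A real app
        // would update from its LLM client's usage callbacks instead.
        Timer.scheduledTimer(withTimeInterval: 1.0, repeats: true) { [weak self] _ in
            guard let self, let button = self.statusItem?.button else { return }
            self.tokensUsed += 120 // pretend a response just streamed in
            button.title = "\(self.tokensUsed) tok"
        }
    }
}

// Run as a single-file script: top-level code stands in for @main.
let app = NSApplication.shared
let delegate = TokenCounterApp()
app.delegate = delegate
app.setActivationPolicy(.accessory) // menu bar presence only, no Dock icon
app.run()
```

Save it as `main.swift` and run `swift main.swift`; a count appears in the menu bar and ticks upward. The whole trick is just that the number is always on screen.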
Having that number always visible made a few habits much clearer for me:
- restart chats sooner when context starts dragging
- summarize instead of carrying full tool traces forward
- stay on smaller models longer when the task does not need more
- notice when a workflow is getting sloppy before the bill shows up
It is not a magic optimizer.
It just makes token usage hard to ignore in the moment.
That has been more useful for me than any postmortem chart.
Because by the time I am looking at a chart, the messy workflow has already happened.
If you are building with AI all day, I think live visibility changes behavior faster than after-the-fact spend reports do.
TokenBar is here if you want to try it: https://tokenbar.site/