Most AI cost talk still focuses on the prompt.
That is only part of the bill.
What kept burning tokens for me was everything around the prompt:
- old context I should have trimmed
- tool output I kept dragging forward
- retrieval chunks that stopped being useful 20 minutes ago
- switching to a bigger model before the task actually needed it
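The bloat above is easy to spot if you estimate context size before each call. This is a minimal sketch, not TokenBar's implementation: it uses the rough ~4-characters-per-token heuristic for English text, and the function names and the budget value are my own illustrative choices.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real tokenizers (e.g. tiktoken) will differ, but this is
    # close enough to notice a thread getting bloated.
    return max(1, len(text) // 4)


def trim_oldest(messages: list[str], budget: int) -> list[str]:
    # Drop the oldest messages until the estimated total
    # fits the token budget. Crude, but it mirrors the
    # "restart a bloated thread" instinct.
    msgs = list(messages)
    while msgs and sum(estimate_tokens(m) for m in msgs) > budget:
        msgs.pop(0)
    return msgs
```

Even a crude number like this, checked per call, surfaces the drift long before the monthly bill does.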
The annoying part is that none of this feels expensive in the moment.
A session just gets a little messier.
A little slower.
A little harder to reason about.
Then the token count quietly runs up.
That is why I built TokenBar.
It sits in the macOS menu bar and shows live token usage while I work with LLMs. Not after the session. During it.
That changes behavior faster than a dashboard ever did for me.
When I can see token usage climbing in real time, I am more likely to:
- cut dead context
- restart a bloated thread
- stay on a smaller model longer
- stop carrying tool traces that are no longer helping
For me, the first win was not even cost.
It was cleaner AI workflows.
If you are building with LLMs all day, that live feedback loop matters more than another after-the-fact report.
TokenBar: https://tokenbar.site/