DEV Community

John

Before I blame the model, I check the token trail

One thing I have learned from building with LLMs every day is that a bad session usually does not start with a bad answer.

It starts with drift.

A little too much old context.
A little too much tool output carried forward.
A little too much reluctance to restart a chat that is already getting muddy.

When that happens, I usually feel the workflow get worse before I notice the cost.

The model feels slower.
The answers get less sharp.
I start rewriting prompts that were not really the problem.

For a while I treated this as a prompting problem.
Then I realized I was missing a live signal.

Most token dashboards are useful after the fact. They tell you what happened once the session is already over.
That is helpful for reporting, but not for behavior.

I wanted something visible while I was actually working.
So I built TokenBar, a macOS menu bar app that shows live token usage during LLM sessions.

That one change made a few habits much clearer for me:

  • restart chats sooner when context starts dragging
  • summarize instead of carrying full tool traces forward
  • stay on smaller models longer when the task does not need more
  • notice when a workflow is getting sloppy before the bill shows up

It is not a magic optimizer.
It just makes token usage hard to ignore in the moment.
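To make the idea concrete, here is a minimal sketch of that kind of running tally. This is not TokenBar's implementation: it uses the common rough heuristic of about 4 characters per token instead of a real tokenizer, and the `warn_at` threshold is a made-up number for illustration.

```python
# Rough, stdlib-only sketch of a running token tally for a session.
# Assumption: ~4 characters per token (a heuristic, not a real tokenizer).

class SessionMeter:
    CHARS_PER_TOKEN = 4  # crude estimate; real tools use exact tokenizers

    def __init__(self, warn_at: int = 8000):
        self.total = 0
        self.warn_at = warn_at  # hypothetical "context is dragging" threshold

    def add(self, text: str) -> int:
        """Record a prompt or response; return the running estimate."""
        self.total += max(1, len(text) // self.CHARS_PER_TOKEN)
        return self.total

    def dragging(self) -> bool:
        """True once the session likely carries too much context."""
        return self.total >= self.warn_at


meter = SessionMeter(warn_at=100)
meter.add("short prompt")
meter.add("a much longer tool trace " * 20)
print(meter.total, meter.dragging())
```

The point is not the arithmetic; it is that the number is visible on every turn, so "restart the chat" becomes a decision you make early instead of a regret you have later.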

That has been more useful for me than any postmortem chart.
Because by the time I am looking at a chart, the messy workflow already happened.

If you are building with AI all day, I think live visibility changes behavior faster than after-the-fact spend reports do.

TokenBar is here if you want to try it: https://tokenbar.site/
