Last week I was debugging an AI agent chain. The agent was supposed to make 3-4 tool calls per request. Instead, it was looping — retrying the same failed call over and over.
I didn't notice for about 90 seconds. In that time, it burned through roughly $30 in Claude API tokens.
The problem? I had no real-time visibility into what was happening. My provider dashboard updates with a delay. My logging was async and I wasn't watching the terminal output closely enough.
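Observability aside, the bug itself was an unbounded retry. A minimal sketch of the kind of guard that would have stopped it (`call_tool` and `ToolError` here are hypothetical stand-ins, not a real agent framework API):

```python
MAX_RETRIES = 3

class ToolError(Exception):
    """Stand-in for whatever your tool layer raises on failure."""
    pass

def call_tool(tool_name, args):
    # Stand-in for the real tool call; here it always fails,
    # simulating the broken tool that caused the loop.
    raise ToolError("upstream timeout")

def call_tool_with_cap(tool_name, args):
    """Retry a failing tool call at most MAX_RETRIES times,
    then raise instead of looping (and billing) forever."""
    last_error = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return call_tool(tool_name, args)
        except ToolError as e:
            last_error = e
    raise RuntimeError(
        f"{tool_name} failed after {MAX_RETRIES} attempts"
    ) from last_error
```

A cap like this bounds the worst case, but it doesn't replace visibility: you still want to see the spend as it happens.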
What I changed
I started using TokenBar — a native macOS menu bar app I built that shows live token usage as API calls happen.
Now my workflow looks like this:
- Start a dev session
- Glance at the menu bar — token counter is ticking
- If usage spikes unexpectedly, I see it immediately
- I stop the process, fix the bug, and save money
That $30 bug? With TokenBar running, I would have caught it in under 10 seconds — the counter would have spiked visibly in the menu bar.
Why this matters more with agents
Traditional LLM usage (single prompt → single response) is relatively predictable. But AI agents introduce:
- Retry loops — failed tool calls that trigger repeated attempts
- Context accumulation — each step adds to the conversation, inflating token counts
- Recursive chains — agents calling sub-agents, each with their own token budget
- Unpredictable branching — the agent decides what to do next, and sometimes it decides wrong
All of these can cause token usage to spike in ways you don't expect. Having real-time visibility isn't a nice-to-have anymore — it's essential.
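Alongside watching the counter, a hard per-session budget is cheap to add in the agent loop itself. A sketch, assuming your client exposes input/output token counts per response (the `TokenBudget` class and its method names are illustrative, not from any library):

```python
class TokenBudget:
    """Cumulative token cap for one agent session: every retry,
    accumulated context turn, and sub-agent call records into it."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.used = 0

    def record(self, input_tokens, output_tokens):
        """Add one API call's usage; raise once the cap is blown."""
        self.used += input_tokens + output_tokens
        if self.used > self.max_tokens:
            raise RuntimeError(
                f"token budget exceeded: {self.used} > {self.max_tokens}"
            )

budget = TokenBudget(max_tokens=50_000)
# After each agent step, something like:
# budget.record(resp.usage.input_tokens, resp.usage.output_tokens)
```

Because retries, context growth, and sub-agent chains all funnel through the same counter, any of the failure modes above trips the cap instead of running open-ended.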
How TokenBar works
- Sits in your macOS menu bar
- Shows a live token counter that updates as API calls run
- Supports OpenAI, Claude, Gemini, Cursor, OpenRouter, Copilot, and more
- Runs locally — no cloud data collection
- $5 one-time purchase, no subscription
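The raw data behind a counter like this is available on every response: most LLM APIs return a usage block alongside the completion. A bare-bones DIY version of the same idea (field names below follow the Anthropic Messages API; other providers differ, e.g. OpenAI uses `prompt_tokens` / `completion_tokens`):

```python
# Running tally of token usage across a dev session.
running_total = 0

def track(usage):
    """Add one response's usage dict to the running total and return it."""
    global running_total
    running_total += usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    return running_total

# Calling track(response["usage"]) after each API call keeps the counter live;
# a menu bar app just renders that number somewhere you'll actually look.
```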
Try it
If you're building AI agents or working with LLM APIs on Mac, this is the cheapest insurance against runaway token costs.