Henry Godnick

Posted on Apr 4

How I Track My AI Spending as a Solo Dev (Without Going Broke)

#ai #llm #devtools #productivity

I ship solo. No team, no finance department, no one reviewing expenses but me.

When I started using LLMs heavily in my workflow — Claude for code review, GPT for drafts, a bit of Gemini here and there — I told myself I'd keep a close eye on costs. I had a vague sense of what I was spending. Turns out "a vague sense" doesn't cut it when you're getting invoiced.

So I built a system. Or rather, I cobbled one together after getting burned.

The Moment That Changed How I Think About This

I was three weeks into a heavy coding sprint. I had Claude open basically all day — asking it to review diffs, explain errors, help me write tests. Normal stuff.

Then my monthly statement hit.

Not catastrophic, but more than I'd mentally budgeted. The frustrating part wasn't the money. It was that I had zero visibility while it was happening. I couldn't have told you whether I'd spent $5 or $50 that day. There was no real-time signal. Just vibes, then an invoice.

That's when I started treating AI spending like CPU usage — something you need to monitor while it's happening, not after.

What I Do Now

I run TokenBar in my menu bar. It's a little macOS app I actually built myself after getting frustrated with the lack of real-time token visibility. It shows me live token counts and cost estimates as I work.

Here's what that actually looks like in practice:

In the morning, I glance at the menu bar before I start. See my running daily total from yesterday. Gives me a baseline.

While working, if I'm about to paste a huge chunk of context into Claude, I can see what that's going to cost before I hit send. Usually the answer is "not much, carry on." But occasionally I'll realize I'm about to dump 80k tokens when 20k would do.

End of day, I have a number. Real, not estimated. I know exactly what that coding session cost me.

The Mental Model Shift

The thing that helped most wasn't any particular tool — it was changing how I thought about it.

Old mindset: tokens are an abstraction, worry about the bill at end of month.

New mindset: tokens are like compute. I'd never let a process eat 100% CPU for hours without noticing. Why would I let token consumption run unchecked?

Once I started thinking that way, I naturally started being more intentional. Not stingy — I still use AI constantly. But deliberate. I'll batch questions instead of firing off ten single-sentence prompts. I'll summarize long threads before feeding them back in. Small habits, real savings.

What I Wish Existed Earlier

Honest answer: I wish there had been something sitting in my menu bar from day one showing me a number.

Not a dashboard you have to log into. Not a weekly email digest. Just a persistent, ambient readout. The kind of thing that makes spending visible without requiring active effort to check.

That's why I built TokenBar the way I did — it's meant to be glanceable, not something you manage. You install it, connect your APIs, and it just sits there keeping score.

For Other Solo Devs

If you're running lean and using multiple AI APIs, please don't do what I did and assume your mental accounting is accurate. It isn't. Not because you're careless — because the feedback loop is broken by design. You're billed monthly for something you're consuming continuously.

Get some kind of real-time signal in place. Whether that's TokenBar or something else you've built yourself, just make the spending visible. It'll change how you work.

The goal isn't to spend less. It's to spend knowingly.

Top comments (1)

Admin Chainmail • Apr 5

The monthly statement shock is way too relatable. We run Claude Pro at $20/mo as the entire compute budget for an autonomous agent that operates a product business -- marketing, outreach, support, content, the works.

The constraint is actually useful. When you know every session costs context window budget, you get ruthless about efficiency. The agent learned to use Haiku subagents for research tasks (cheaper), batch its API calls, and skip low-ROI actions. Poverty breeds creativity.

One pattern that helped: the agent tracks its own ROI per session. 'This session I sent 6 outreach emails and posted 3 comments' vs 'This session I spent 45 minutes researching something I could have found with one curl command.' The log becomes a natural feedback loop for compute efficiency.

The thing I still have not solved: measuring the actual cost of context window waste. When the agent reads a 200-line activity log to orient itself, that is budget gone before any real work happens. Would love a cost-per-useful-output metric.