DEV Community

Jamie

I Tracked Every AI Token I Spent for 30 Days — Here's What I Found

Last month I decided to track every single AI token I spent across Claude, OpenAI, and Google's APIs. Not just the monthly total — every individual request, in real time.

The results completely changed how I use AI coding tools.

The Setup

I'm a solo dev building macOS apps. I use Claude Code daily, occasionally hit the OpenAI API for specific tasks, and use Gemini for quick lookups. My monthly AI spend was hovering around $250-300 and I had no idea where it was going.

So I built a menu bar tool that tracks token costs per request in real time. Every time I send a prompt, I see the cost immediately — right there in my Mac's menu bar.

What I Found After 30 Days

Finding #1: 60% of my prompts didn't need expensive models

This was the biggest eye-opener. I was defaulting to Claude Opus for everything — simple file renames, grep operations, basic refactors, even asking "what does this error mean?"

These tasks produced identical output quality on Sonnet. But Opus costs roughly 15x more per token.

When I started routing simple tasks to cheaper models, my costs dropped immediately.
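The routing itself doesn't need to be clever. Here's a minimal sketch of the idea; the model names, keyword list, and the notion that keyword matching is "good enough" are all my assumptions, not anyone's official API:

```python
# Minimal sketch: route obviously simple tasks to a cheaper model.
# Model names and keywords below are illustrative placeholders.

CHEAP_MODEL = "claude-sonnet"      # placeholder model name
EXPENSIVE_MODEL = "claude-opus"    # placeholder model name

# Task types that, in my experience, got identical results on the cheap model.
SIMPLE_KEYWORDS = ("rename", "grep", "what does this error mean", "format")

def pick_model(prompt: str) -> str:
    """Return the cheap model for simple tasks, the expensive one otherwise."""
    lowered = prompt.lower()
    if any(keyword in lowered for keyword in SIMPLE_KEYWORDS):
        return CHEAP_MODEL
    return EXPENSIVE_MODEL
```

A keyword heuristic is crude, but even a crude default-to-cheap rule beats defaulting to the most expensive model for everything.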

Finding #2: Context window loading was my biggest cost driver

I had Claude Code running in a directory with 3 projects. Every prompt was loading context from all of them. A "simple" question was actually sending 50K+ tokens of irrelevant context.

Fix: Separate terminals per project. This alone cut my per-session cost roughly in half.
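To see why this matters, it's worth doing the back-of-envelope math on wasted context. The per-million-token price and prompts-per-day figures below are assumptions for illustration, not real pricing:

```python
# Back-of-envelope: monthly cost of sending extra, irrelevant context
# tokens with every prompt. The input price is an assumed figure.

INPUT_PRICE_PER_MTOK = 15.00  # assumed $ per million input tokens

def context_cost(extra_tokens: int, prompts_per_day: int, days: int = 30) -> float:
    """Monthly dollar cost of `extra_tokens` of waste sent with every prompt."""
    total_tokens = extra_tokens * prompts_per_day * days
    return total_tokens * INPUT_PRICE_PER_MTOK / 1_000_000

# e.g. 50K wasted tokens x 20 prompts/day:
# context_cost(50_000, 20) -> $450/month at the assumed rate
```

Even if your real rate is a fraction of that, 50K of irrelevant context on every prompt adds up fast.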

Finding #3: Some 30-second prompts cost more than an hour of focused work

I had one prompt — "refactor this entire module to use the new API" — that cost $4.80 in a single request. Meanwhile, an entire hour of iterative debugging cost about $1.20.

The expensive prompts weren't the ones that took time. They were the lazy, vague ones that sent massive context.

Finding #4: Rate limits aren't random — they correlate with spend

Once I could see per-request costs, I noticed that rate limits hit right after clusters of expensive operations. The "random" throttling wasn't random at all — I was just burning through allocation on a few heavy prompts.
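You can spot this pattern in your own logs with a trivial rolling-window check. The window size and dollar threshold here are illustrative assumptions, tuned to nothing in particular:

```python
# Toy detector for "clusters of expensive operations": flag when any
# `window` consecutive requests sum past `threshold` dollars. Window and
# threshold values are illustrative assumptions.

from collections import deque

def spend_alert(costs, window: int = 5, threshold: float = 5.0) -> bool:
    """Return True if any `window` consecutive request costs exceed `threshold`."""
    recent = deque(maxlen=window)
    for cost in costs:
        recent.append(cost)
        if len(recent) == window and sum(recent) > threshold:
            return True
    return False
```

In my logs, the requests right before a throttle almost always tripped a check like this.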

The Numbers

Before tracking (monthly average):

  • Total: ~$280/month
  • No idea which tasks cost what
  • Hit rate limits "randomly" 2-3 times per week

After 30 days of tracking:

  • Total: ~$165/month (41% reduction)
  • Routed 60% of tasks to cheaper models
  • Hit rate limits once in the entire month

Same output quality. Same productivity. Just smarter model selection.

How to Start Tracking

If you want to try this yourself, here's what I'd recommend:

Option 1: DIY with API logs

Most providers report token counts with every API response — OpenAI and Anthropic both return a `usage` object in the response body, and some providers expose rate-limit info in response headers too. You can log these counts and calculate costs with a simple script. It's manual but works.
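A bare-bones version of that script might look like this. It assumes the provider hands you a `usage` dict with `input_tokens` and `output_tokens` counts; the model names and per-million-token prices are placeholders, not real pricing:

```python
# Sketch of a DIY per-request cost logger. Assumes the API response
# includes a usage dict with input/output token counts. Model names and
# prices are illustrative placeholders.

import csv
import datetime

PRICES = {  # assumed $ per million tokens: (input, output)
    "model-cheap": (3.00, 15.00),
    "model-expensive": (15.00, 75.00),
}

def log_request(model: str, usage: dict, path: str = "token_log.csv") -> float:
    """Append one request's cost to a CSV log and return the dollar cost."""
    in_price, out_price = PRICES[model]
    cost = (usage["input_tokens"] * in_price
            + usage["output_tokens"] * out_price) / 1_000_000
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.datetime.now().isoformat(), model,
            usage["input_tokens"], usage["output_tokens"], f"{cost:.4f}",
        ])
    return cost
```

A month of rows in that CSV is enough to answer "where is my money going?" — which is exactly the question monthly totals can't.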

Option 2: Use a dedicated tracker

I built my tracking tool into a Mac app called TokenBar that sits in the menu bar and shows real-time cost per request. It works with Claude, OpenAI, Google, and other providers.

The key insight is that you need per-request visibility, not just monthly totals. Monthly totals are like checking your bank balance once a month — you know you spent money, but you have no idea where.

The Takeaway

AI tools have worse cost visibility than cloud platforms did 10 years ago. You're expected to just trust that your $200/month subscription or API spend is reasonable, with almost no per-request breakdown.

Once you can see what each prompt actually costs, you naturally start making better decisions:

  • Use the cheapest model that handles each task
  • Keep context windows lean
  • Write specific prompts instead of vague ones
  • Separate projects into different sessions

The 41% cost reduction wasn't from using AI less. It was from using it smarter.


Have you tracked your AI token costs? What surprised you? Drop your numbers in the comments.
