DEV Community

Henry Godnick
I Thought I Had an AI Bill Problem. I Actually Had 7 Workflow Leaks.

If your AI spend feels random, this is for you.

I’m a solo builder and I kept saying: “models are too expensive now.”
Reality: my workflow was leaking money.

Same with focus. I kept saying: “I just need more discipline.”
Reality: my feeds were stealing deep-work windows.

Here are the 7 leaks I fixed.

1) The “open 20 files and pray” loop

I used to dump huge context into Claude/Cursor/Codex, then rerun the same prompt whenever the output came back vague.
That burned tokens fast, because every rerun pays for the whole context again.

Fix:

  • scope each request to one small task
  • send only the files needed for that task
  • no reruns unless I've changed the context
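The "no reruns unless the context changed" rule can be enforced mechanically. Here is a minimal sketch, assuming the prompt and the file contents are available as strings before the request goes out. The names `context_fingerprint` and `RerunGuard` are hypothetical and not part of any tool mentioned in this post.

```python
import hashlib
from typing import Mapping, Set


def context_fingerprint(prompt: str, files: Mapping[str, str]) -> str:
    """Hash the prompt plus the path and contents of every attached file."""
    h = hashlib.sha256()
    h.update(prompt.encode("utf-8"))
    # Sort paths so the same set of files always hashes the same way.
    for path in sorted(files):
        h.update(b"\0" + path.encode("utf-8"))
        h.update(b"\0" + files[path].encode("utf-8"))
    return h.hexdigest()


class RerunGuard:
    """Remembers which prompt+file combinations were already sent."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def should_send(self, prompt: str, files: Mapping[str, str]) -> bool:
        fingerprint = context_fingerprint(prompt, files)
        if fingerprint in self._seen:
            return False  # identical context: a rerun would pay twice
        self._seen.add(fingerprint)
        return True
```

Call `should_send` right before each request. If it returns `False`, nothing changed since the last attempt, so the cheaper move is to edit the prompt or the file set first.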

2) No session-level budget

I had monthly spend in mind, but no per-session guardrail.

Fix:

  • set a hard session budget before starting
  • track tokens and cost in real time (I use TokenBar in my Mac menu bar)
  • stop when the budget is hit and rewrite the prompt instead of brute-forcing
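For anyone calling a model API from their own scripts, a hard session budget can be sketched in a few lines. This is an illustration under assumptions, not how TokenBar works: the `SessionBudget` class is hypothetical, and the per-million-token prices are placeholders you would replace with your provider's actual rates.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when the session has spent its whole budget."""


@dataclass
class SessionBudget:
    limit_usd: float
    input_price_per_mtok: float   # placeholder: USD per 1M input tokens
    output_price_per_mtok: float  # placeholder: USD per 1M output tokens
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one request's cost and return the remaining budget in USD."""
        cost = (
            input_tokens * self.input_price_per_mtok
            + output_tokens * self.output_price_per_mtok
        ) / 1_000_000
        self.spent_usd += cost
        if self.spent_usd >= self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f}"
            )
        return self.limit_usd - self.spent_usd
```

Call `record` after each response with the token counts the API reports. Treat `BudgetExceeded` as the signal to stop and rewrite the prompt, not as something to catch and ignore.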

3) Treating retries like progress

Retries felt productive. They weren’t.

Fix:

  • if the output is off twice, stop and rewrite the constraints and acceptance criteria
  • “what should this code do?” beats “try again”
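The "off twice, then rewrite" rule is easy to encode as a retry cap. A minimal sketch, assuming you pass in two callables of your own: `generate`, which sends a prompt and returns the output, and `accept`, which checks the output against your acceptance criteria. Both are hypothetical stand-ins for whatever client and checks you actually use.

```python
from typing import Callable, Optional


def run_with_retry_cap(
    generate: Callable[[str], str],
    accept: Callable[[str], bool],
    prompt: str,
    max_attempts: int = 2,
) -> Optional[str]:
    """Try the prompt at most max_attempts times.

    Returns the first accepted output, or None when every attempt failed,
    which is the cue to rewrite the prompt rather than retry again.
    """
    for _ in range(max_attempts):
        output = generate(prompt)
        if accept(output):
            return output
    return None
```

A `None` result means the prompt is the problem. Go back to the constraints and acceptance criteria before spending more tokens.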

4) Hidden doomscroll tax between coding blocks

I’d check one feed “for 2 minutes.” 25 minutes gone, brain context reset.

Fix:

  • block algorithmic feeds during work windows (I use Monk Mode)
  • keep comms tools available, kill infinite feeds

5) Prompting before thinking

I prompted first, clarified later.

Fix:

  • write a 5-line spec first
  • include an explicit done condition
  • ask the model for a plan first, then for the execution

6) Letting novelty drive model choice

I chased whatever model was trending that day.

Fix:

  • pick one default model for the week
  • switch only after a new model benchmarks better on my own task
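"Benchmarked on my own task" does not need to be elaborate. Here is a rough sketch, assuming each model is wrapped in a callable that takes a prompt and returns text, and each test case pairs a prompt with a pass/fail check. The `pass_rates` function and the case format are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Case = Tuple[str, Callable[[str], bool]]  # (prompt, check on the output)


def pass_rates(
    models: Dict[str, Callable[[str], str]],
    cases: List[Case],
) -> Dict[str, float]:
    """Run every case through every model and return the pass fraction."""
    if not cases:
        raise ValueError("need at least one test case")
    rates: Dict[str, float] = {}
    for name, call in models.items():
        passed = sum(1 for prompt, check in cases if check(call(prompt)))
        rates[name] = passed / len(cases)
    return rates
```

A handful of cases taken from real work from last week is usually enough to tell whether a trending model is actually better for you.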

7) No post-mortem after expensive days

I’d notice a big bill and move on.

Fix:

  • 3-minute daily review:
    • what consumed most tokens?
    • which prompts needed retries?
    • what can be templatized tomorrow?
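The three review questions can be answered from a simple request log. A minimal sketch, assuming you log one record per request with `task`, `tokens`, and `retries` fields. That log format is hypothetical, and "a task that shows up more than once" is only a rough proxy for "can be templatized."

```python
from collections import Counter, defaultdict
from typing import Dict, List


def daily_review(records: List[Dict]) -> Dict:
    """Summarize a day's log: token hogs, retried tasks, repeat tasks."""
    tokens_by_task = defaultdict(int)
    retries_by_task = defaultdict(int)
    runs_by_task = Counter()
    for r in records:
        tokens_by_task[r["task"]] += r["tokens"]
        retries_by_task[r["task"]] += r["retries"]
        runs_by_task[r["task"]] += 1
    return {
        # Tasks sorted by total tokens, biggest first.
        "top_token_consumers": sorted(
            tokens_by_task.items(), key=lambda kv: kv[1], reverse=True
        ),
        # Tasks where at least one prompt needed a retry.
        "tasks_with_retries": sorted(
            t for t, n in retries_by_task.items() if n > 0
        ),
        # Tasks that recurred today: rough candidates for a template.
        "template_candidates": sorted(
            t for t, n in runs_by_task.items() if n > 1
        ),
    }
```

Running this at the end of the day keeps the review inside the 3-minute window, since the log does the remembering.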

My current stack (simple, not fancy)

  • TokenBar: real-time token/cost visibility in the menu bar so spend is never invisible
  • Monk Mode: feed-level blocking so deep work survives

Both are tiny tools, but together they removed the two biggest drains: invisible cost and invisible distraction.

If your AI bill is up and output is flat, don’t start with “new model.”
Start with leak hunting.
