Henry Godnick

Posted on Mar 13

I thought I had an AI model problem. I actually had a workflow problem.

#programming

For a while, I kept blaming model pricing.

“Claude is too expensive.”
“Codex burns too fast.”
“Cursor is cheaper but quality dips.”

Some of that is true. But it wasn’t the main leak.

The main leak was my workflow.

I’m a solo Mac dev building two products:

TokenBar ($5) — a tiny menu bar app that shows real-time token and cost usage while I code: https://tokenbar.site
Monk Mode ($15) — a feed-level distraction blocker for Mac, so I can keep useful app features but cut infinite-scroll traps: https://mac.monk-mode.lifestyle

After tracking my sessions for two weeks, I noticed a pattern:

Not by 10%.
By 2–3x.

Where the money was actually leaking

It wasn’t giant prompts. It was tiny behavior loops:

Each loop looked harmless.
Stacked together, they drained both my budget and my momentum.

Before each run, I now write:

This alone sharply cut wasteful reruns.

I don’t block entire apps. I block feed surfaces.
I can still use messages, docs, and search intentionally.
I just remove the infinite slot machine.

Seeing spend in real time changed my behavior immediately.
I stopped “vibe rerunning” expensive tasks and got better at model routing.
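
To make the idea concrete, here is a minimal sketch of a running spend tally. It is not how TokenBar works internally; the model names, the per-million-token prices, and the `SessionTally` class are all hypothetical, made up purely for illustration. It only shows the arithmetic: cost per run from token counts, plus a count of how often the same task was rerun.

```python
from dataclasses import dataclass, field

# Hypothetical prices in USD per million tokens (assumed, not real vendor pricing).
PRICES = {
    "small-model": {"input": 0.50, "output": 1.50},
    "large-model": {"input": 5.00, "output": 15.00},
}


@dataclass
class SessionTally:
    """Running total of spend and run counts for one coding session."""

    total_usd: float = 0.0
    runs_per_task: dict = field(default_factory=dict)

    def record(self, task: str, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add one run to the tally and return what that run cost in USD."""
        price = PRICES[model]
        cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
        self.total_usd += cost
        self.runs_per_task[task] = self.runs_per_task.get(task, 0) + 1
        return cost

    def reruns(self, task: str) -> int:
        """Number of runs beyond the first for a given task."""
        return max(self.runs_per_task.get(task, 0) - 1, 0)


if __name__ == "__main__":
    tally = SessionTally()
    # Same task run twice on the expensive model, then once on the cheap one.
    tally.record("fix-login", "large-model", 100_000, 20_000)
    tally.record("fix-login", "large-model", 100_000, 20_000)
    tally.record("rename-vars", "small-model", 100_000, 20_000)
    print(f"session total: ${tally.total_usd:.2f}")
    print(f"reruns of fix-login: {tally.reruns('fix-login')}")
```

Even with made-up numbers, putting the rerun count next to the running total is the point: a repeated run on the expensive model shows up right away, which is the nudge toward routing routine tasks to a cheaper one.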

I leave a 4-line note for future me:

This prevents next-session context thrash.

When attention got cleaner, prompt quality improved.
When prompt quality improved, retries dropped.
When retries dropped, cost dropped.

So “focus” and “AI cost control” ended up being the same system.

Try this for 7 days:

You don’t need perfect discipline.
You need fewer opportunities to drift.

That’s what finally made my nights productive instead of just “busy.”