Henry Godnick
How I cut my AI token bill and doomscroll time in the same week (solo Mac dev playbook)

If you’re a solo builder using Claude/Codex/Cursor all day, you probably know this feeling:

  • you shipped “a lot”
  • your AI bill still hurt
  • your screen-time report looked embarrassing

That was me.

I kept treating AI cost and focus as separate problems. They’re not.
They feed each other.

When I’m distracted, I write sloppier prompts.
Sloppier prompts create more retries.
More retries = token burn + time burn.

Here’s the exact system I now run.

1) Track spend at session level (not month-end panic)

I used to check billing dashboards too late.
Now I track spend while I work, session by session.

I built TokenBar (tokenbar.site) for this:

  • live token + cost visibility in the Mac menu bar
  • immediate feedback when a prompt pattern is wasteful
  • no end-of-month surprise

The behavior change was instant:
If a loop starts burning, I catch it in minutes, not days.
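To make the idea concrete, here is a minimal sketch of session-level cost tracking. This is not TokenBar's actual code, and the per-1K-token prices are placeholder assumptions; check your provider's real pricing.

```python
# Rough sketch of session-level spend tracking (not TokenBar internals).
# Per-1K-token prices below are made-up placeholders.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}  # hypothetical $/1K tokens

class SessionTracker:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self.calls = 0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one API call's cost; return running session spend."""
        cost = (input_tokens / 1000) * PRICE_PER_1K["input"] \
             + (output_tokens / 1000) * PRICE_PER_1K["output"]
        self.spent_usd += cost
        self.calls += 1
        return self.spent_usd

    def over_budget(self) -> bool:
        return self.spent_usd >= self.budget_usd

tracker = SessionTracker(budget_usd=2.00)
tracker.record(input_tokens=4000, output_tokens=1200)
if tracker.over_budget():
    print("Stop: session budget hit -- rethink the prompt before retrying.")
```

The point isn't the arithmetic; it's that the number updates per call, so a runaway loop is visible on call three, not on the monthly invoice.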

2) Run short “prompt quality sprints”

Before opening an IDE, I do one 10-minute pass:

  • define success output in 3 bullet points
  • define constraints (latency, format, edge cases)
  • define stop condition (what counts as done)

This cut retries more than any model switch I tested.

3) Block feeds, not the whole internet

I don’t need a prison laptop. I need fewer trapdoors.

I built Monk Mode (mac.monk-mode.lifestyle) for this:

  • feed-level blocking on Mac (YouTube feed, X timeline, etc.)
  • still lets me use tools I actually need
  • no context-switch vortex when I hit a hard bug

My rule: if I’m stuck, I can research docs, but no algorithmic feeds.

4) Use a fallback protocol for outages

When Claude or Codex has an outage, I used to doomscroll while “waiting.”
Now I switch to:

  • spec cleanup
  • test writing
  • docs and commit hygiene

This keeps momentum even when model access is flaky.

5) Cap retries per task

Hard rule: max 3 retries before changing approach.

If retry #3 fails, I must do one of:

  • simplify the task
  • reduce context
  • change model/tool
  • park and return later

No more infinite looping because “one more try might work.”
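The cap is simple enough to express as code. A sketch, where `attempt` and the fallback list are placeholders for whatever your task actually is:

```python
# Sketch of the 3-retry cap. `attempt` and `fallbacks` are placeholders.
MAX_RETRIES = 3

def run_with_cap(attempt, fallbacks, max_retries=MAX_RETRIES):
    """Try `attempt` up to max_retries times, then force a change of approach."""
    for _ in range(max_retries):
        result = attempt()
        if result is not None:   # success check is task-specific
            return result
    # Retry budget exhausted: change something instead of looping again.
    for change in fallbacks:     # e.g. simplify, shrink context, swap model, park
        print(f"Retry cap hit -- next move: {change}")
        break
    return None

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    return None  # always "fails" in this demo

run_with_cap(flaky, ["simplify the task"])
```

The mechanical part matters: the fallback step runs automatically, so "one more try" stops being the default.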

6) Measure cost per shipped outcome

Not “cost per day.”
Not “tokens consumed.”

Metric that matters:
Cost per useful shipped task.

This prevents over-optimizing raw token numbers while under-delivering product.
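The arithmetic is trivial, but writing it out shows why raw token counts mislead (the numbers below are made up):

```python
# "Cost per useful shipped task" -- illustrative numbers only.
def cost_per_shipped(total_spend_usd: float, shipped_tasks: int) -> float:
    if shipped_tasks == 0:
        return float("inf")  # spent money, shipped nothing: the worst week
    return total_spend_usd / shipped_tasks

# Week A: cheaper total spend, little shipped.
# Week B: higher spend, but far more delivered.
week_a = cost_per_shipped(total_spend_usd=12.0, shipped_tasks=2)   # $6.00/task
week_b = cost_per_shipped(total_spend_usd=30.0, shipped_tasks=10)  # $3.00/task
```

Week B "burned" 2.5x the tokens and was still the better week.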

7) Timebox deep work with no-feed windows

I run 90-minute blocks.
During block:

  • feed-level blocks ON
  • cost visibility ON
  • one task only

It’s boring. It works.

8) Weekly review (20 mins)

Every week I review:

  • top 3 expensive sessions
  • what caused waste (retry loops, vague prompts, giant context)
  • one rule update for next week

Small rule changes compound quickly.
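The review itself is one sort. The session record shape below is my assumption; export whatever fields your own tracker gives you:

```python
# Weekly review query: top 3 sessions by cost, each with a cause note.
# Session shape is a made-up example -- adapt to your tracker's export.
sessions = [
    {"id": "mon-am", "cost": 1.40, "note": "retry loop on flaky test"},
    {"id": "tue-pm", "cost": 0.35, "note": "clean feature work"},
    {"id": "wed-pm", "cost": 2.10, "note": "giant context pasted twice"},
    {"id": "fri-am", "cost": 0.90, "note": "vague prompt, 4 rewrites"},
]

top3 = sorted(sessions, key=lambda s: s["cost"], reverse=True)[:3]
for s in top3:
    print(f"{s['id']}: ${s['cost']:.2f} -- {s['note']}")
```

The "note" column is where next week's rule update comes from: here, "never paste the whole repo as context twice" is the obvious candidate.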


The key lesson

If your attention is leaking, your AI budget leaks too.

Treat focus and spend as one system.
That shift alone gave me cleaner shipping days, lower token burn, and way less late-night doomscroll regret.

If you’re building with AI all day, fix both together.
