Last month I had two different leaks in my workflow:
- my AI spend kept creeping up from "just one more run"
- my focus kept getting wrecked by infinite feeds
I kept treating them as separate problems.
They weren’t.
Both were attention leaks.
Both were small defaults compounding.
Here are 9 tiny systems that helped me stabilize both.
1) Set a hard spend cap before every serious coding session
Before I start, I set a cap for the session (example: "$8 max today").
When I don’t do this, I always rationalize one extra call, one extra retry, one extra context blast.
A cap makes the decision easier: refine the prompt instead of brute-forcing more tokens.
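The cap rule above can be sketched as a tiny guard object. This is a hypothetical illustration, not a real billing API: the class, the token counts, and the per-token price are all mine.

```python
# Hypothetical session spend cap. Names and prices are illustrative,
# not taken from any real provider's billing API.
class SessionBudget:
    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, tokens: int, usd_per_1k_tokens: float) -> None:
        """Record a call's cost; refuse it once the session cap would be hit."""
        cost = tokens / 1000 * usd_per_1k_tokens
        if self.spent_usd + cost > self.cap_usd:
            raise RuntimeError(
                f"Cap hit: ${self.spent_usd:.2f} of ${self.cap_usd:.2f} spent "
                "-- refine the prompt instead of retrying."
            )
        self.spent_usd += cost

budget = SessionBudget(cap_usd=8.00)
budget.charge(tokens=12_000, usd_per_1k_tokens=0.01)  # adds $0.12
```

The point isn't precision accounting; it's that one extra call now has a visible cost and a hard ceiling.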
2) Track cost in real time (not after the damage)
End-of-day invoices are too late.
I keep a live token/cost meter visible in the menu bar so I can see spikes while I’m working.
I built this into TokenBar because I needed it myself: tokenbar.site ($5 one-time).
The biggest win is behavioral: when I see burn rate, I naturally tighten prompts.
3) Use a 3-line brief for every agent task
My brief format:
- Done means: exact output shape
- Constraints: what not to do
- Stop rule: when to halt instead of looping
This dramatically cut my "almost right, run it again" loop.
4) Add a retry budget
I use a simple rule:
- first miss: tighten instruction
- second miss: reduce scope
- third miss: stop and redesign
Without this, it's easy to keep paying for emotionally driven retries.
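The three-miss rule is simple enough to write down as a function. A minimal sketch; the action strings are my own labels for the steps above.

```python
# Minimal sketch of the retry budget: map how many times a task has
# missed to the next move. The labels mirror the rule above.
def retry_action(miss_count: int) -> str:
    if miss_count <= 0:
        return "run as planned"
    if miss_count == 1:
        return "tighten instruction"
    if miss_count == 2:
        return "reduce scope"
    return "stop and redesign"  # third miss or beyond: no more paid retries
```

Writing it down like this makes the third miss a hard stop, not a negotiation.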
5) Block feeds at the source during deep work
Timers alone weren’t enough for me.
If a feed opens, my brain takes the bait.
So I started blocking feed surfaces during build windows with Monk Mode (mac.monk-mode.lifestyle, $15 one-time).
Not all social apps, just the parts that trigger scroll spirals.
6) Pair block windows with shipping windows
I run 45–90 minute “ship blocks.”
During these blocks:
- feed surfaces blocked
- notifications minimized
- one repo, one outcome
The key is pairing restriction with a clear output target (PR, feature chunk, doc update).
7) Keep a “context payload” file ready
When tools crash, reset, or rate-limit, I used to rebuild context manually and waste time + tokens.
Now I keep one fresh context file with:
- current objective
- architecture notes
- known constraints
- open questions
Recovery gets much cheaper.
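For reference, here's the rough layout I mean. The filename and placeholder text are illustrative; the four sections are the ones listed above.

```markdown
# context-payload.md (filename is illustrative)

## Current objective
<one sentence: what ships this session>

## Architecture notes
<the 3-5 facts an agent needs to not break things>

## Known constraints
<what not to touch, budgets, style rules>

## Open questions
<anything unresolved, so the agent doesn't guess>
```

After a crash or rate-limit, pasting this one file replaces a dozen exploratory prompts.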
8) Track failure patterns, not just success output
I log expensive failures in one note:
- what prompted the detour
- what signal I ignored
- what I’ll do differently next run
Most of my cost improvements came from avoiding repeated mistakes, not better model settings.
9) Measure “focus debt” the same way as cloud debt
If I lose 40 minutes to doomscrolling, I count it like an actual bill.
That framing changed everything.
Time lost to feeds is often more expensive than token spend.
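The arithmetic behind that claim is back-of-envelope. The $50/hr rate below is my assumption for illustration, not a number from this post; swap in your own.

```python
# Price "focus debt" like a cloud bill. The hourly rate is an
# assumed figure for illustration only.
HOURLY_RATE_USD = 50.0

def focus_debt(minutes_lost: float) -> float:
    """Convert lost minutes into dollars at the assumed rate."""
    return minutes_lost / 60 * HOURLY_RATE_USD

print(f"${focus_debt(40):.2f}")  # 40 lost minutes ≈ $33.33 at $50/hr
```

At that rate, one doomscrolling lapse already exceeds the $8 daily token cap from tip 1.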
The simple stack I use now
- Real-time AI spend visibility: TokenBar
- Feed-level focus protection: Monk Mode
- Tiny operating rules: caps, retry budget, ship blocks
Nothing here is fancy.
But together, these defaults stopped both leaks enough that I can ship more with less stress.
If you’re solo-building with AI all day, I’d start with just two changes this week:
1) a hard daily spend cap
2) feed blocking during one 60-minute build block
Do that for 7 days and your numbers (and sanity) should look different.