Why your bulk DELETE on Supabase is wrong before you run it (count drift in 30 min)

#supabase #postgres #claudecode #webdev

The count that doesn't survive a coffee

May 16th, Supabase console open on a batch of orphan rows. Catherine flagged three duplicate attendance entries the day before with her usual line, "hum, it bugs, but it's quickly fixed." A SQL probe later, it isn't three cases — it's the whole class. I run SELECT COUNT(*) FROM seances WHERE cours_id NOT IN (SELECT id FROM cours) at 10:47. The raw output says 351. I draft the DELETE, re-read the WHERE, walk to the coffee machine.

Ten minutes later, hand on the keyboard, a residual reflex makes me re-run the COUNT. It now says 381. Thirty rows appeared between my two probes. A cron job ran once while I was queuing for coffee and re-injected its rows. The DELETE I was about to launch would have wiped a perimeter that wasn't the one I had scoped ten minutes earlier. No alarm, no error. Just a thirty-row drift nothing would have shown me if habit hadn't made me re-probe.

Two probes, two distinct functions

The natural reflex is to call count the act of returning an integer. That lexical confusion is the trap. The first probe scopes the work — "here's the order of magnitude, here's the incident class." It informs a human framing decision. Half an hour can pass between this probe and the final call, and that's healthy: a rushed audit gets the class wrong.

The second probe has a structurally different function. It authorizes. It says "the perimeter you're about to touch is the one you scoped." Its invalidation window isn't measured in human reasoning time but in cron cycles. On a production Postgres with crons running every five minutes, async webhooks, an overnight sync, two or three assistants writing in parallel, the perimeter has the lifespan of an interval between two cron runs — minutes, not hours.

Every dev knows this intellectually. None of us operate it. A best practice without material enforcement evaporates exactly when it would be useful — I hit the same drift three times in ten days, and I'm not a beginner.

R7 amended, and the hook that closes the gap

Counterpart Toolkit v0.7, published May 20th, adds one paragraph to R7:

For bulk DELETE/UPDATE on a live system: re-run the count query immediately before the mutation. Abort if delta > 5% from the initial probe — counts older than ~30 minutes are stale in active systems. The first probe scopes the work; the second probe is the gate.

Thirty minutes is the empirical floor below which a probe almost certainly crosses one cron cycle on our stack. Five percent is the threshold above which the unexpected surface dwarfs the cost of re-scoping.

R7 lives in CLAUDE.md. Read at session start, forgotten when it matters. The fix is a PreToolUse hook that scans every mcp__supabase__execute_sql payload for unkeyed DELETE and UPDATE patterns. To pass, the SQL must carry an explicit marker:

-- count-fresh:20260516-1057
DELETE FROM seances
WHERE cours_id NOT IN (SELECT id FROM cours);

The hook parses the timestamp and blocks if it exceeds 30 minutes. No marker, no execution. The bypass isn't a backdoor; it's a nominative declaration that the operator re-ran the probe. Discipline isn't in memory — it's in the friction between operator and mutation.

Three times in ten days I almost destroyed a perimeter I thought I knew. The fourth time, the hook refused execution — my marker was 47 minutes old. I re-probed: twelve rows had moved. Twenty seconds of cost, a redo I wouldn't have seen coming until the next morning.

Counterpart Toolkit v0.7, R7 amended on N=2 incidents. Hook ~/.claude/hooks/pre-bulk-mutation-count-staleness.sh. CC-BY-4.0: github.com/michelfaure/doctrine-counterpart