Discussion on: I Let My AI Agent Run Overnight. It Cost $437.

$437 overnight is the canonical horror story, and it almost always has the same root cause: no hard ceiling. An agent left running with retries and a growing context will happily loop on a problem it can't solve, re-sending the whole transcript every step, and you wake up to a bill that's mostly wasted spinning. The scary part isn't the dollar figure; it's that nothing stopped it. It would've kept climbing toward $4,370 if you'd slept in.
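To put rough numbers on the "re-sending the whole transcript every step" part, here's a back-of-the-envelope sketch. It assumes the transcript grows by a fixed number of tokens per step and that nothing is cached or truncated; the function name and the 2,000-token figure are made up for illustration, not taken from the post.

```python
def cumulative_input_tokens(steps: int, tokens_per_step: int) -> int:
    """Total input tokens sent when every step re-sends the full transcript so far."""
    # Step i sends i * tokens_per_step tokens, so the total is the
    # arithmetic series tokens_per_step * (1 + 2 + ... + steps).
    return tokens_per_step * steps * (steps + 1) // 2


# Hypothetical loop adding 2,000 tokens of transcript per step.
short_run = cumulative_input_tokens(10, 2_000)    # 110,000 tokens
long_run = cumulative_input_tokens(100, 2_000)    # 10,100,000 tokens
print(short_run, long_run, round(long_run / short_run, 1))
```

Under those assumptions, 10x the steps costs roughly 92x the input tokens, which is why a stuck loop gets expensive much faster than the step count suggests.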

The fix is boring but absolute: a hard per-run token/cost ceiling that aborts (not alerts), a max-steps cap, and detection for "no progress in N steps -> kill." Alerts tell you it happened; only a hard abort prevents it. This is exactly why Moonshift (a multi-agent pipeline that takes a prompt to a deployed SaaS) runs with per-agent + per-run hard caps and scoped context - a full build lands at ~$3 flat because a runaway literally can't run away; it gets killed.

Painful but genuinely useful post. Did you have any ceiling at all, or was it alerts-only? And was the $437 one stuck loop, or death-by-a-thousand-calls? The shape of the waste usually points straight at the missing guardrail.
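For anyone wiring this up, the three guardrails (cost ceiling, step cap, no-progress kill) could be sketched roughly like this. It's a minimal illustration, not how Moonshift or any particular framework does it: `RunGuard`, `RunAborted`, `run_guarded`, and the idea of comparing a per-step `state` value to detect progress are all hypothetical names and assumptions of mine.

```python
from dataclasses import dataclass, field

_MISSING = object()  # sentinel so the first step never counts as "no progress"


class RunAborted(Exception):
    """Raised to stop the run outright; callers should not retry."""


@dataclass
class RunGuard:
    max_cost_usd: float        # hard per-run ceiling
    max_steps: int             # hard cap on agent steps
    max_stalled_steps: int     # consecutive steps with unchanged state
    spent_usd: float = 0.0
    steps: int = 0
    stalled: int = 0
    _last_state: object = field(default=_MISSING, repr=False)

    def record(self, step_cost_usd: float, state: object) -> None:
        """Account for one finished step and abort if any limit is hit."""
        self.steps += 1
        self.spent_usd += step_cost_usd
        # Assumption: "progress" means the observable state changed.
        self.stalled = self.stalled + 1 if state == self._last_state else 0
        self._last_state = state
        if self.spent_usd >= self.max_cost_usd:
            raise RunAborted(f"cost ceiling hit: ${self.spent_usd:.2f}")
        if self.steps >= self.max_steps:
            raise RunAborted(f"step cap hit: {self.steps} steps")
        if self.stalled >= self.max_stalled_steps:
            raise RunAborted(f"no progress in {self.stalled} steps")


def run_guarded(step_fn, guard: RunGuard) -> str:
    """Call step_fn until it reports done or the guard aborts; return why it stopped."""
    try:
        while True:
            cost, state, done = step_fn()  # hypothetical step: (cost, state, done)
            guard.record(cost, state)
            if done:
                return "done"
    except RunAborted as exc:
        return str(exc)


# A stuck agent: same state every step, $0.50 per step.
guard = RunGuard(max_cost_usd=5.0, max_steps=50, max_stalled_steps=3)
print(run_guarded(lambda: (0.5, "same", False), guard), guard.spent_usd)
```

Note the checks run after each step, so the overshoot is bounded by the cost of one step. If a single step can be expensive, you'd also want to estimate its cost before sending it and refuse up front.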