Originally published at devtoolpicks.com
A developer posted yesterday: "I accidentally burned ~$6,000 of Claude usage overnight with one command." Almost everyone in the thread had a similar story.
This is becoming a pattern. In March, a Claude Max subscriber asked an Anthropic agent how to schedule overnight Claude Code runs and got hit with $1,800 in API charges in two days. Earlier this year, a LangChain agent got stuck in a loop and ran 14,000 redundant tool calls before hitting a $437 charge. A developer with a "mathematically proven convergence system" paid 100 EUR for what turned into 124 copies of "You've hit your limit" and 2 actual test fixes.
If you're building with AI agents in 2026, your bill is one bug away from a five-figure mistake. Here's how it happens, and how to stop it before your card gets hit.
Why This Keeps Happening
The marketing pitch for AI agents is "set it and forget it." Run an agent overnight, wake up to fixed bugs, merged PRs, and shipped features. The reality is that when an agent gets stuck, it doesn't crash. It loops. And every loop iteration costs money.
Three patterns drive almost all runaway billing incidents.
Hook recursion. Claude Code lets you wrap tool calls with hook scripts. If a hook script triggers another tool call, and that tool call triggers the same hook, you have an infinite loop. Anthropic's own post-mortem from April 28 documented this exact bug: "hook chain recursed without timeout or depth limit, agent hung indefinitely past wall-clock budget." There's no built-in timeout enforcement and no recursion-depth tracking. The agent just keeps spending until something else stops it.
Retry storms. When Claude Code hits a rate limit, it returns "You've hit your limit" as normal output with exit code 0. To an automation script watching for failures, that looks like a failure response. So the script retries. Each retry costs tokens. One developer reported 96% of their paid attempts were rate limit errors, not actual work.
Subagent fan-out. Claude Code's Agent Teams feature lets one task spawn parallel sub-agents. Each sub-agent runs its own context window. A 3-agent session for an hour can consume what a single-agent session burns in a full day. If your orchestrator agent spawns sub-agents that spawn more sub-agents, the cost compounds exponentially.
The common thread: AI agents fail expensively rather than loudly. They keep working, billing accumulates silently, and you don't notice until the invoice arrives.
What Actually Caused the $1,800 Incident
The GitHub issue is worth reading in full. Here's what happened:
A developer on the Claude Max 20x plan ($200/month) asked the Anthropic-built claude-code-guide agent how to schedule overnight Claude Code runs during a 2x usage promotion. The agent recommended using claude -p with an ANTHROPIC_API_KEY environment variable.
The developer set up cron jobs running claude -p --dangerously-skip-permissions overnight. What they didn't realize: claude -p with an API key bypasses your subscription entirely and bills directly to your API account at standard token rates. Their scripts ran Opus 4.6 in agentic loops, roughly 47K input tokens per request, dozens of requests per hour.
After two nights, the bill was over $1,800. The Max subscription they were trying to leverage hadn't covered any of it.
The developer asked Anthropic to fix this in their own tooling. Specifically: warn when claude -p is run with an API key set, and update the guide agent to flag this risk to Max subscribers. As of writing, the issue is still open.
How to Set Hard Spending Caps
Anthropic's API doesn't have a true "stop spending at $X" hard cap. It has soft alerts and budget thresholds, but it does not auto-disable your account. You need to set caps yourself.
API workspace spend limits. If you're on direct API billing, log into the Anthropic Console and set a workspace-level spend limit. Go to Settings → Billing → Workspace Limits and set both a daily and monthly cap. If you exceed it, your API key returns an error instead of running. This is the single most important thing to do today.
Disable auto-reload. Auto-reload is Anthropic's feature that adds credits when your balance runs low. It's also how a $50 surprise becomes a $5,000 surprise. Disable it: Console → Billing → Auto-reload → Off. Better to hit a hard wall than a soft one.
Block API credit fallback in Claude Code. When Claude Code hits your subscription limit, it can prompt you to use API credits (billed at standard rates). To prevent this entirely, log out and log back in using only your subscription credentials:
claude logout
claude login
Authenticate using your Pro or Max plan credentials only. Don't add Claude Console credentials. This ensures Claude Code never falls back to pay-per-token billing.
How to Cap Agent Loops
For Claude Code agents specifically, you can configure both turn limits and budget limits in your settings.
Add this to ~/.claude/settings.json:
{
"agent": {
"max_turns": 50,
"max_budget_usd": 5.00
},
"hook_timeout_seconds": 30
}
max_turns caps the number of tool-use iterations. 50 is generous for most tasks. If your agent hits this, it stops and returns what it has.
max_budget_usd is a per-session spending cap. The agent stops when it crosses the threshold. Set this to a number you'd be comfortable losing if something goes wrong. $5 is a reasonable default for non-trivial work.
hook_timeout_seconds prevents hook recursion from running forever. 30 seconds is enough for any reasonable hook script.
For agents you build with the Claude SDK directly, set max_turns and max_budget_usd on every agent loop:
from anthropic import Anthropic
agent = Anthropic().messages.create(
model="claude-opus-4-7",
max_turns=20,
max_budget_usd=2.00,
# ... rest of your config
)
This is non-negotiable for any agent running in CI/CD, on a cron job, or in any unattended context. An uncapped headless agent triggered on every commit is an open tab on your credit card.
How to Monitor Spend in Real Time
If you can see what's happening, you can stop it before it gets bad.
Use /cost in Claude Code sessions. Run /cost in any active Claude Code session to see token count and estimated spend for that conversation. Run it whenever the session feels slow or repetitive. If the number is climbing without progress, something's wrong.
Install ccusage. This is a community tool that parses Claude Code's local logs and shows daily and monthly token consumption. It's read-only and runs locally, so there's no privacy concern. Install with npm install -g ccusage.
Watch for the danger signs. Claude Code stalling at 95% completion. Repeated "let me write the document" without actual writes. Tool calls returning the same output multiple times. These are loop signatures. Hit Escape and start fresh. The few minutes you save by waiting it out cost more than the few minutes you spend restarting.
What to Do If It Already Happened
If you're reading this after a bill shock, here's the practical path forward.
File a credit request. Anthropic does grant credits for clear billing incidents, especially when there's a documented bug. Go to support.anthropic.com and file a ticket. Include the date range, the exact dollar amount, and a description of what went wrong. If you have logs showing the agent was looping, attach them.
Reference similar incidents. The GitHub issues for Claude Code billing bugs are public. If your situation matches a documented bug, link to it. Anthropic has more leverage to refund when there's a clear product issue.
Cancel auto-reload immediately. Even if you're not getting a refund, stop the bleeding. Cancel auto-reload, set workspace caps, and rotate your API key if you suspect it's being used in a script you forgot about.
Switch to subscription if you can. For most indie hackers, Claude Max at $200/month is dramatically cheaper than API billing. One developer reported using 10 billion tokens over eight months. At API rates, that's over $15,000. On Max, it cost $800 total. The Max plan saved 93%. If you're using Claude Code more than a few hours a day, get on Max and stop touching the API directly.
The Broader Pattern
This isn't an Anthropic-specific problem. It's an agent-architecture problem. Cursor went through a similar reckoning in June 2025 when they replaced fixed allotments with usage-based credit pools. Developers reported hundreds of dollars in overages in single weeks. GitHub Copilot is moving to usage-based billing on June 1, 2026, which will likely surface the same issues.
If you're building or using AI agents in 2026, treat token budgets the way you treat database connections. Pool them. Cap them. Monitor them. Anything else is leaving the lights on with the door open.
The flip side: this is also why our Claude Code commit detection post hit so hard yesterday. Anthropic is willing to silently route your billing based on what's in your git history. If you're already paying attention to billing surprises, the commit-scanning story makes sense as part of the same pattern.
For comparisons of Claude Code against other AI coding tools, see Cursor vs Windsurf vs Zed for indie hackers. And for the specific question of whether to switch to Codex, see Codex vs Claude Code.
FAQ
Will Anthropic refund a runaway billing incident?
Sometimes. They've granted credits in documented bug cases, especially when the issue is reproducible and there's a GitHub issue tracking it. The success rate seems higher when you can show the agent was looping due to a Claude Code bug rather than your own script error. File a ticket at support.anthropic.com with details and logs.
Why doesn't Anthropic add a hard spend cap?
Their position is that auto-reload exists for users who want uninterrupted service, and budget alerts let you opt out. The reality is that for unattended agents, alerts don't help if the agent runs at 3 AM. Multiple GitHub issues have requested true hard caps. As of writing, none have been implemented.
What's the safest way to run Claude Code overnight?
Use your subscription, not API billing. Set max_turns and max_budget_usd in ~/.claude/settings.json. Disable auto-reload. Add hook timeouts. Run a small test job first and check /cost afterward to confirm the consumption matches your expectation. If a small test surprises you, a long run will too.
Does this affect Cursor or Codex too?
Yes. Any AI coding tool that lets you run agents in loops has the same risk profile. Cursor's pricing has overage exposure. Codex bills per token on the API. The specific commands and settings differ, but the underlying problem (agent loops billing accumulating silently) is the same. Set caps on every tool, not just Claude Code.
Should I just stop using AI agents?
No. The productivity gains are real for the right tasks. But treat them like any other piece of automation that touches a credit card. You wouldn't run an unbounded loop against the Stripe API. Don't run one against Anthropic either.
Top comments (0)