Joe Carpenter

How an AI Agent Ran Up a $47,000 Bill in 11 Days (And How to Stop It)

Published by Innovative Systems Global — April 2026


In November 2025, four AI agents entered an infinite retry loop.

Nobody noticed for 11 days.

When the bill arrived, it was $47,000. All of it from LLM API calls. All of it preventable. The team had logging. They had monitoring. They did not have a hard limit.

This is not a unique incident. It's becoming a rite of passage for engineering teams running agents in production.


Why this keeps happening

Every major LLM provider — OpenAI, Anthropic, Google — charges per token. The more your agent runs, the more you pay. This is the correct model. The problem is that agents don't know how much they're spending, and nothing stops them when they exceed a budget.

Current "solutions":

  • Spend alerts — fire after the damage is done. An alert at $1,000 doesn't help when an agent burns over $4,000 per day.
  • API rate limits — these throttle requests per minute, not total spend.
  • Observability platforms (Helicone, LangSmith) — they show you what happened. They don't prevent it.
  • Cloud billing alerts — by the time AWS or OpenAI sends an alert, the loop has been running for days.

What's missing: a hard gate that runs before the LLM call, checks the budget, and refuses to proceed if the limit is exceeded.
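That missing gate can be sketched in a few lines. This is a minimal illustration of the pattern, not any library's actual API — `spend_so_far`, `estimated_cost_usd`, and `call_llm` are hypothetical placeholders:

```python
class BudgetExceeded(Exception):
    """Raised when a call would push the agent past its hard limit."""

def guarded_call(agent_id, limit_usd, spend_so_far, estimated_cost_usd, call_llm):
    """Hard gate: refuse to make the LLM call if the budget is exhausted."""
    if spend_so_far + estimated_cost_usd > limit_usd:
        raise BudgetExceeded(
            f"{agent_id}: spent ${spend_so_far:.2f} of ${limit_usd:.2f} -- call refused"
        )
    return call_llm()

# The gate runs BEFORE the call, so a runaway loop fails fast:
try:
    guarded_call("my-research-agent", 50.00, 49.99, 0.02, lambda: "response")
except BudgetExceeded as e:
    print(e)
```

The key design point: the check happens before the provider is hit, so a stuck loop fails on the first over-budget iteration instead of after eleven days.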


The two-line problem

Here's what most agent code looks like:

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,  # conversation history built elsewhere
)

There is no cost tracking here. No budget check. No receipt. If this code runs 50,000 times in an infinite loop, you find out when the bill arrives.
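To make the failure mode concrete, here is a toy simulation of an unguarded loop. The per-call cost and iteration count are illustrative, not figures from the incident:

```python
def simulate_runaway_loop(cost_per_call_usd: float, num_calls: int) -> float:
    """Accumulate cost for a loop with no budget check: growth is unbounded."""
    total = 0.0
    for _ in range(num_calls):
        total += cost_per_call_usd  # nothing here ever says "stop"
    return total

# 50,000 retries at ~2 cents per call is already ~$900 --
# longer prompts or pricier models scale this into the tens of thousands.
print(f"${simulate_runaway_loop(0.018, 50_000):,.2f}")
```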


The fix: meter every call, enforce every limit

We built dingdawg-governance to solve this. Three new MCP tools in v2.1.0:

meter_llm_call — call this after every LLM response. Pass the model, tokens in, tokens out, and your agent ID. Get back the cost, your cumulative spend, and your budget status.

{
  "receipt_id": "mtr_abc123_def456",
  "agent_id": "my-research-agent",
  "provider": "openai",
  "model": "gpt-4o",
  "prompt_tokens": 1200,
  "completion_tokens": 800,
  "cost_usd": 0.018,
  "cumulative_spend_usd": 12.43,
  "budget_status": "ok",
  "budget_limit_usd": 50.00
}
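The cost arithmetic behind a receipt like this is simple token math. A sketch, using assumed example rates of $5 per million input tokens and $15 per million output tokens for gpt-4o (which reproduces the $0.018 above; real provider rates vary by model and change over time):

```python
PRICE_TABLE = {
    # (input $/1M tokens, output $/1M tokens) -- assumed example rates
    "gpt-4o": (5.00, 15.00),
}

def call_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Compute the cost of one call from token counts and the price table."""
    in_rate, out_rate = PRICE_TABLE[model]
    cost = prompt_tokens * in_rate / 1e6 + completion_tokens * out_rate / 1e6
    return round(cost, 6)

print(call_cost_usd("gpt-4o", 1200, 800))  # matches cost_usd in the receipt above
```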

set_llm_budget — set a hard limit for any agent. Daily or monthly. Warning fires at 80% by default.

{
  "agent_id": "my-research-agent",
  "limit_usd": 50.00,
  "period": "daily"
}
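The status transitions follow directly from the limit. A sketch of the ok → warning → exceeded logic, assuming the default 80% warning threshold described above:

```python
def budget_status(cumulative_spend_usd: float, limit_usd: float,
                  warn_fraction: float = 0.8) -> str:
    """Map cumulative spend against a hard limit to a status string."""
    if cumulative_spend_usd >= limit_usd:
        return "exceeded"
    if cumulative_spend_usd >= warn_fraction * limit_usd:
        return "warning"
    return "ok"

# With a $50 daily limit: warning fires at $40, exceeded at $50.
print(budget_status(12.43, 50.0))
```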

get_spend_report — query spend by agent, model, and date range. See exactly which agents cost what.
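Under the hood, a spend report is an aggregation over receipts. A sketch of the grouping — the receipt fields mirror the meter_llm_call output above, but the report shape here is illustrative, not the tool's actual schema:

```python
from collections import defaultdict

def spend_report(receipts: list[dict]) -> dict[tuple[str, str], float]:
    """Sum cost_usd per (agent_id, model) across a list of receipts."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for r in receipts:
        totals[(r["agent_id"], r["model"])] += r["cost_usd"]
    return dict(totals)

receipts = [
    {"agent_id": "my-research-agent", "model": "gpt-4o", "cost_usd": 0.018},
    {"agent_id": "my-research-agent", "model": "gpt-4o", "cost_usd": 0.022},
    {"agent_id": "summarizer", "model": "gpt-4o", "cost_usd": 0.005},
]
print(spend_report(receipts))
```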


How the $47K incident gets prevented

With dingdawg-governance wired:

  • Day 1: Agent starts loop. meter_llm_call tracks each call.
  • Day 1, ~$40 in: budget_status flips to "warning". Your code can log, alert, or throttle.
  • Day 1, $50 in: budget_status flips to "exceeded". Your code stops the agent.
  • Total damage: $50, not $47,000.

The enforcement is in YOUR code — you decide what to do when the budget is exceeded. The meter gives you the signal.
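Wired into an agent loop, that stop logic is a few lines. A hedged sketch — `run_step` and `meter` stand in for your agent step and the meter_llm_call tool, and the fake meter below exists only to make the example runnable:

```python
def run_agent(limit_usd: float, run_step, meter) -> float:
    """Run steps until done -- or until the meter reports the budget exceeded."""
    spend = 0.0
    while True:
        result = run_step()
        receipt = meter(result)            # one receipt per LLM call
        spend = receipt["cumulative_spend_usd"]
        if receipt["budget_status"] == "exceeded":
            break                          # YOUR code decides: here, hard stop
        if receipt["budget_status"] == "warning":
            pass                           # could log, alert, or throttle
        if result == "done":
            break
    return spend

# A fake meter that charges $10 per step against a $50 limit:
def fake_meter_factory(limit):
    state = {"spend": 0.0}
    def meter(_):
        state["spend"] += 10.0
        status = ("exceeded" if state["spend"] >= limit
                  else "warning" if state["spend"] >= 0.8 * limit
                  else "ok")
        return {"cumulative_spend_usd": state["spend"], "budget_status": status}
    return meter

# The step never finishes on its own, yet total damage is capped at $50:
print(run_agent(50.0, lambda: "working", fake_meter_factory(50.0)))
```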


Installation

# As an MCP server (Claude Desktop, Cursor, any MCP-compatible client)
npx dingdawg-governance

# Claude Code
claude mcp add dingdawg-governance npx dingdawg-governance

Free tier: unlimited meter_llm_call and set_llm_budget calls. Local filesystem storage. No API key required.

Paid tier ($19/month): cloud receipt storage, team dashboards, cross-session spend history, PDF export. API key at dingdawg.com/developers.


Price table

Built in. Covers 30+ models across OpenAI, Anthropic, Google, Groq, Mistral, Cohere, and DeepSeek. Updated with each release.

If your model isn't in the table, it returns cost_usd: 0 with a note — it never silently miscalculates.


Works with any agent framework

dingdawg-governance is an MCP server. Any agent that can call MCP tools can use it — LangChain, AutoGen, CrewAI, custom agents, Claude Code, Cursor. No SDK required. No framework lock-in.


The broader problem

The $47K incident is the visible symptom. The real problem is that enterprises are deploying agents with no spend governance at all. Every dollar an agent spends is invisible until it's gone.

As agents become more autonomous — running overnight, chaining into other agents, operating without human supervision — the spend problem compounds. A single misconfigured retry policy can turn a $50 research job into a $50,000 infrastructure incident.

Budget enforcement isn't a nice-to-have. It's the seatbelt.


Get started

npx dingdawg-governance

Source: github.com/dingdawg/governance-sdk
Pricing: dingdawg.com/developers


Innovative Systems Global builds AI governance infrastructure for teams running agents in production. Based in the Rio Grande Valley, Texas.
