AI Agent Cost Attribution: How to Know Which Agent Is Burning Your Budget

#ai #devops #cloud #agentdev

The CFO calls. Your AI infrastructure bill doubled last month. Which agent did it?

If you cannot answer that in 30 seconds, you have a cost attribution problem.

Why AI Agent Cost Is Hard to Track

Shared model endpoints. Multiple agents hit the same OpenAI or Anthropic API. The bill is one line item. Which agent made which call?

Cascading tool use. An agent calls a tool, which triggers another API call, which generates another LLM call. Cost spreads across systems, and nothing links each downstream charge back to the agent that started the chain.

Runaway behaviour. An agent stuck in a loop that hits an API 10,000 times in an hour will not stand out in aggregate dashboards. You find out when the invoice arrives.

The Right Architecture

Every agent action needs to carry identity metadata:

{
  "agentId": "customer-support-v2",
  "teamId": "customer-ops",
  "costCentre": "CC-2041",
  "tool": "openai_completion",
  "model": "gpt-4o",
  "tokensIn": 1240,
  "tokensOut": 89,
  "estimatedCost": 0.0043,
  "timestamp": "2026-03-01T14:23:01.847Z"
}

With this attached to every call, you can answer four questions: which agent costs the most, which team is responsible, which tools drive cost, and whether cost is growing unexpectedly.
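Once records in that shape are collected, the first three questions reduce to a group-by. The following is a rough sketch that assumes the records are already available as a list of dicts. The sample values, the billing-bot agent, the finance-ops team, and the vector_search tool are invented for illustration.

```python
from collections import defaultdict


def total_by(records: list[dict], key: str) -> dict[str, float]:
    """Sum estimatedCost per value of `key`, highest spender first."""
    totals: dict[str, float] = defaultdict(float)
    for r in records:
        totals[r[key]] += r["estimatedCost"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Illustrative sample data, not real measurements.
records = [
    {"agentId": "customer-support-v2", "teamId": "customer-ops",
     "tool": "openai_completion", "estimatedCost": 0.0043},
    {"agentId": "customer-support-v2", "teamId": "customer-ops",
     "tool": "openai_completion", "estimatedCost": 0.0051},
    {"agentId": "billing-bot", "teamId": "finance-ops",
     "tool": "openai_completion", "estimatedCost": 0.0012},
    {"agentId": "customer-support-v2", "teamId": "customer-ops",
     "tool": "vector_search", "estimatedCost": 0.0004},
]

by_agent = total_by(records, "agentId")
by_team = total_by(records, "teamId")
by_tool = total_by(records, "tool")
```

The same function answers the agent, team, and tool questions by changing the key. Spotting unexpected growth would additionally need the records bucketed by timestamp, for example per day, and compared against a baseline.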

Rate Limiting: Catching Runaway Agents Before the Bill Arrives

rules:
  - id: token-budget-daily
    action: block
    match:
      agent: "*"
    rateLimit:
      metric: estimated_cost_usd
      limit: 50.00
      window: 86400     # seconds (24 h): $50/day hard cap per agent

  - id: loop-detection
    action: block
    match:
      tool: "*"
    rateLimit:
      limit: 50
      window: 60        # seconds: 50 tool calls in 60 s is likely a loop

When the rate limit triggers, the agent halts and you get an alert. The runaway $50,000 bill does not materialise.
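To make the mechanism concrete, here is a minimal sliding-window limiter that could back both rules above. It is a sketch under stated assumptions, not AgentGuard's implementation: the class name SlidingWindowLimit is hypothetical, the window is treated as rolling rather than calendar-aligned, and blocked calls are not counted against the limit.

```python
import time
from collections import deque


class SlidingWindowLimit:
    """Allow at most `limit` units within any trailing `window` seconds."""

    def __init__(self, limit: float, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.events = deque()   # (timestamp, amount) pairs still in the window
        self.total = 0.0

    def allow(self, amount: float = 1.0) -> bool:
        now = self.clock()
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, old_amount = self.events.popleft()
            self.total -= old_amount
        if self.total + amount > self.limit:
            return False  # caller should halt the agent and raise an alert
        self.events.append((now, amount))
        self.total += amount
        return True


# One limiter per agent: call count for loop detection, dollars for the budget.
loop_guard = SlidingWindowLimit(limit=50, window=60)
daily_budget = SlidingWindowLimit(limit=50.00, window=86400)
```

Before each tool call, the agent runtime would check loop_guard.allow() and daily_budget.allow(estimated_cost), and stop the agent if either returns False. Counting calls and counting dollars use the same structure. Only the amount passed in differs.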

The CFO Conversation

Before cost attribution:
"Our AI costs doubled. We think it was one of the support agents but we are not sure which one. We are looking into it."

After cost attribution:
"Agent customer-support-v2, owned by the customer-ops team, ran 4x its normal volume on March 1st because of a promotion campaign. Here is the breakdown by tool type, and here is the rate limit we have now set."

The second conversation takes 30 seconds to prepare.

AgentGuard includes per-agent cost attribution, real-time spend dashboards, and rate limiting. Free tier available.