
matt-dean-git

Posted on • Originally published at satgate.io

Hard-Capping MCP Tool Spend with SatGate Proxy

Your AI agent just burned $500 overnight calling Google Search in a loop. You found out when the bill arrived. Sound familiar?

If you're running Claude Code, Cursor, or Claude Desktop with MCP tools, you've probably had a version of this moment. Maybe it was $50, maybe $5,000. The pattern is always the same: an agent gets stuck, loops on a tool call, and your API bill explodes while you sleep.

There's a fix. And it doesn't involve monitoring dashboards, Slack alerts, or hoping you catch it in time.

The Problem: MCP Has No Credit Card Limit

The Model Context Protocol is brilliant at what it does: giving AI agents structured access to tools. Search engines, databases, code execution, image generation — MCP makes it all available through a clean JSON-RPC interface.

What MCP doesn't do is care about cost. Every tools/call request flows through to the upstream server with no budget awareness whatsoever. The spec has no concept of "you've spent too much" or "stop here."

This creates a specific, expensive failure mode:

  • Agent loops — A stuck agent can make thousands of tool calls before anyone notices. Each call costs real money.
  • No built-in limits — The MCP spec includes no budget, quota, or cost mechanism.
  • Rate limits don't help — Rate limits protect servers from overload. They don't protect your wallet from a runaway agent staying within rate limits but burning money for hours.
  • Manual monitoring is reactive — By the time you check a dashboard, the damage is done.

Real scenario: A developer left Claude Code running overnight with a web search MCP server. The agent hit a reasoning loop and called brave_search 3,200 times in 6 hours. Cost: $480 in API credits, or about $0.15 per call.
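To see why a rate limit would not have caught this, it helps to run the numbers from the scenario. The short sketch below does that; the 60-calls-per-minute limit is a hypothetical figure picked only for illustration, not a value from any real provider.

```python
# Figures taken from the overnight scenario described above.
calls = 3200
hours = 6
total_cost_usd = 480.0

calls_per_minute = calls / (hours * 60)
cost_per_call_usd = total_cost_usd / calls

# ASSUMPTION: a hypothetical per-minute rate limit, for illustration only.
ASSUMED_RATE_LIMIT_PER_MINUTE = 60


def rate_limit_triggered(rate: float, limit: float) -> bool:
    """Return True only if the observed call rate exceeds the limit."""
    return rate > limit


print(f"{calls_per_minute:.1f} calls/min")      # 8.9 calls/min
print(f"${cost_per_call_usd:.2f} per call")     # $0.15 per call
print(rate_limit_triggered(calls_per_minute, ASSUMED_RATE_LIMIT_PER_MINUTE))  # False
```

At roughly nine calls a minute, the loop looks like ordinary traffic to a rate limiter. Only a cap on cumulative spend would have stopped it.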

The Solution: Economic Governance at the Protocol Level

SatGate MCP Proxy sits between your MCP client and your MCP servers. It intercepts every tools/call, tracks cost in real time, and enforces hard budget caps: not soft alerts, not warnings, actual enforcement.

When the budget is exhausted, the proxy rejects the call with a 402 Payment Required status, which reaches the agent as a JSON-RPC error. The agent receives a clean "Budget exceeded" message and stops.

The enforcement mechanism uses L402 macaroons — cryptographic tokens with embedded budget constraints. Unlike API keys (which grant unlimited access until revoked), a macaroon can encode: "This agent can spend up to $5 on search tools, expiring in 1 hour."

How It Works

Architecture

┌─────────────┐     ┌──────────────────┐     ┌────────────────┐
│ Claude Code  │────▶│  SatGate Proxy   │────▶│  MCP Server    │
│ Cursor       │◀────│                  │◀────│  (search, db…) │
│ Claude       │     │ ✓ Budget check   │     │                │
│ Desktop      │     │ ✓ Cost tracking  │     │                │
│              │     │ ✓ 402 on limit   │     │                │
└─────────────┘     └──────────────────┘     └────────────────┘

Configuration

Point your MCP client to SatGate instead of directly to the upstream server:

{
  "mcpServers": {
    "search": {
      "command": "satgate-proxy",
      "args": [
        "--upstream", "npx @anthropic/mcp-server-brave-search",
        "--budget", "500",
        "--budget-window", "1h",
        "--cost-per-call", "5"
      ]
    }
  }
}

That's it. Your agent now has a hard cap of 500 sats per hour on search calls. At 5 sats per call, that works out to 100 searches per hour. No code changes. No agent modifications.

What Happens on Each Tool Call

  1. Intercept — Proxy receives the tools/call JSON-RPC request
  2. Resolve cost — Looks up the tool name in the cost table
  3. Check budget — Compares accumulated spend against the macaroon's budget caveat
  4. Forward or reject — If within budget: forward to upstream, debit cost. If over budget: return 402

What the agent sees when the budget is exhausted:

{
  "jsonrpc": "2.0",
  "error": {
    "code": -32000,
    "message": "Budget exceeded: 500/500 sats used. Reset in 23m."
  }
}

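The agent receives a clean error and can move on to other work. If it keeps retrying the tool anyway, the proxy rejects each retry without forwarding it upstream, so the retries cost nothing.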
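The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SatGate's actual implementation: the `BudgetGate` class, its fixed-window reset, the `default_cost` fallback, and the `fake_upstream` stub are all hypothetical names invented for this example, and the macaroon check is reduced to a plain integer budget.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BudgetGate:
    """Hypothetical fixed-window budget gate (illustrative only)."""
    budget_sats: int
    window_seconds: int
    cost_table: dict          # tool name -> cost in sats
    default_cost: int = 0     # ASSUMPTION: tools missing from the table are free
    spent: int = 0
    window_start: float = field(default_factory=time.monotonic)

    def handle(self, request: dict, forward, now=None) -> dict:
        now = time.monotonic() if now is None else now
        # Start a fresh window once the old one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.spent = 0
        # 1. Intercept: only tools/call requests are metered.
        if request.get("method") != "tools/call":
            return forward(request)
        # 2. Resolve cost from the table.
        cost = self.cost_table.get(request["params"]["name"], self.default_cost)
        # 3. Check budget; 4. reject without ever contacting upstream.
        if self.spent + cost > self.budget_sats:
            return {
                "jsonrpc": "2.0",
                "id": request.get("id"),
                "error": {
                    "code": -32000,
                    "message": f"Budget exceeded: {self.spent}/{self.budget_sats} sats used.",
                },
            }
        # 4. Forward, then debit the cost.
        response = forward(request)
        self.spent += cost
        return response


def fake_upstream(request: dict) -> dict:
    """Stand-in for the real MCP server."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": {"content": []}}


gate = BudgetGate(budget_sats=500, window_seconds=3600,
                  cost_table={"brave_search": 5}, window_start=0.0)
call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "brave_search", "arguments": {"query": "mcp"}}}

replies = [gate.handle(call, fake_upstream, now=1.0) for _ in range(101)]
print(replies[100]["error"]["message"])  # Budget exceeded: 500/500 sats used.
```

With a 500-sat budget and 5 sats per call, the first 100 calls pass and the 101st is rejected. The sketch debits only after a successful forward, matching the order described in step 4; a stricter design could reserve the cost first so that concurrent calls cannot overshoot the cap.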

Macaroon Delegation

L402 macaroons support attenuation — you can take a token and add restrictions, but never remove them:

# Create a root macaroon with a 1000-sat budget
satgate token create --budget 1000 --tools "web_search,database_query"

# Delegate to an agent: 500-sat budget, expires in 1 hour
satgate token attenuate <root-token> \
  --max-budget 500 \
  --expires 1h \
  --tools "web_search"

Give your coding agent a tight budget for search. Give your research agent more for database access. Each gets exactly the permissions and budget they need — cryptographically enforced.
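Why can restrictions be added but never removed? In the general macaroon construction, each caveat is folded into the signature with a chained HMAC, so dropping a caveat leaves a signature that no longer verifies. The sketch below shows that idea in simplified form. It is an assumption-laden toy, not the L402 wire format or SatGate's token layout: the dict-based token, the `max_budget=` and `tools=` caveat strings, and all function names are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def _chain(key: bytes, data: str) -> bytes:
    """One HMAC-SHA256 step; the output becomes the next step's key."""
    return hmac.new(key, data.encode(), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    """Create a token with no caveats (only the issuer knows root_key)."""
    return {"id": identifier, "caveats": [], "sig": _chain(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    """Add a restriction. Needs only the current signature, not root_key."""
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _chain(token["sig"], caveat),
    }


def verify(root_key: bytes, token: dict) -> bool:
    """Recompute the HMAC chain from root_key and compare signatures."""
    sig = _chain(root_key, token["id"])
    for caveat in token["caveats"]:
        sig = _chain(sig, caveat)
    return hmac.compare_digest(sig, token["sig"])


def effective_budget(token: dict) -> Optional[int]:
    """The tightest max_budget caveat wins (hypothetical caveat format)."""
    budgets = [int(c.split("=", 1)[1])
               for c in token["caveats"] if c.startswith("max_budget=")]
    return min(budgets) if budgets else None


key = b"issuer-secret"
root = attenuate(mint(key, "agent-root"), "max_budget=1000")
child = attenuate(attenuate(root, "max_budget=500"), "tools=web_search")

# An agent that strips its last caveat but keeps the signature is rejected.
forged = {**child, "caveats": child["caveats"][:-1]}

print(verify(key, child), effective_budget(child))  # True 500
print(verify(key, forged))                          # False
```

Because each signature is derived from the previous one, the holder of `child` can only extend the chain. Going backwards would mean inverting HMAC, which is what makes the delegated budget enforceable rather than advisory.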


MCP is powerful. But power without governance is just risk. SatGate adds the economic guardrails that let you deploy agents with confidence — and a budget.

Try the sandbox — no signup required

GitHub — open source, Apache 2.0
