
matt-dean-git

Originally published at satgate.io

What Is an Economic Firewall?

A developer we work with left a Claude Code agent running overnight on a research task. Six hours later, it had made 3,200 web search calls. The invoice: $480.

The agent wasn't malicious. It wasn't buggy. It was doing exactly what it was told — recursively searching, cross-referencing, and expanding its context. It just never had a reason to stop. Nothing in the stack told it that "enough" existed.

With SatGate in front of those same APIs, the agent would have hit its budget cap and received an HTTP 402 — Payment Required. Hard stop. No soft alert buried in a dashboard. No email that arrives three hours after the damage is done. The request is blocked, and the agent gets a clear, actionable signal: you're out of budget.

That's an economic firewall. And it's the security primitive that the entire API stack is missing.

The Problem: Authentication Without Economics

The current API security model was built for humans clicking buttons, not agents making thousands of autonomous decisions per hour. It answers one question well — "who are you?" — and ignores a second one entirely: "what can you afford?"

Consider what you have today:

  • API keys are all-or-nothing. A valid key grants full access to every endpoint it's scoped to. There's no concept of "this key has $50 left."
  • Rate limiting controls frequency, not cost. 1,000 requests per minute tells you nothing about spend. One request might cost $0.001 (a cache hit); another might cost $2.50 (a GPT-4 completion with a 32k context window). Rate limits treat them identically.
  • Usage dashboards are retrospective. They show you what already happened. By the time you see the spike, you've already paid for it.

This worked when the caller was a human developer running tests or a web app with predictable traffic patterns. It breaks catastrophically when the caller is an autonomous agent that can generate 10,000 API calls before anyone checks a dashboard.
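To make the gap between frequency and cost concrete, here is a rough sketch using the two example prices above ($0.001 for a cache hit, $2.50 for a large completion). The function names and the 1,000-per-minute limit are illustrative, not part of any real gateway. Both bursts pass the same rate limit, yet their cost differs by a factor of 2,500.

```javascript
// Illustrative only. Prices come from the example above, stored as integer
// mills (thousandths of a dollar) to avoid floating-point drift.
const COST_MILLS = { cache_hit: 1, large_completion: 2500 }; // $0.001 and $2.50

// A frequency-only rate limiter: it sees a count, never a price.
function passesRateLimit(requests, maxPerMinute) {
  return requests.length <= maxPerMinute;
}

// Total spend in dollars for a batch of requests.
function spendDollars(requests) {
  const mills = requests.reduce((sum, kind) => sum + COST_MILLS[kind], 0);
  return mills / 1000;
}

const cheapBurst = Array(1000).fill("cache_hit");
const costlyBurst = Array(1000).fill("large_completion");

console.log(passesRateLimit(cheapBurst, 1000), spendDollars(cheapBurst));   // true 1
console.log(passesRateLimit(costlyBurst, 1000), spendDollars(costlyBurst)); // true 2500
```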

From "Who Are You?" to "What Can You Afford?"

An economic firewall adds a financial dimension to API access control. It sits at the gateway layer — the same place you'd deploy Kong, Envoy, or any reverse proxy — and enforces budget constraints on every request that passes through it.

The security stack, evolved:

| Layer | Question |
| --- | --- |
| Network Firewall | "Can this IP reach this port?" |
| WAF | "Is this request malicious?" |
| API Gateway | "Is this caller authenticated?" |
| Economic Firewall | "Can this agent afford this call?" |

The key shift: the credential itself carries economic constraints. Not just "you have access" but "you have $200 of access, and you've used $147.30 so far." The enforcement happens at the gateway, in real time, before the request ever reaches your backend.
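A minimal sketch of that check, assuming a hypothetical credential object with `budgetCents` and `spentCents` fields and a hypothetical `authorize` function. These are illustrations, not SatGate's actual data model. The point is only that the decision happens before the backend is called, using the $200 budget and $147.30 spend from the example above.

```javascript
// Hypothetical credential shape. Amounts are integer cents.
function authorize(credential, costCents) {
  const remaining = credential.budgetCents - credential.spentCents;
  if (costCents > remaining) {
    // Block at the gateway: the backend never sees this request.
    return { status: 402, remainingCents: remaining };
  }
  credential.spentCents += costCents; // record the spend up front
  return { status: 200, remainingCents: remaining - costCents };
}

// $200 of access, $147.30 already used.
const credential = { agent: "agent-47", budgetCents: 20000, spentCents: 14730 };

console.log(authorize(credential, 250));  // { status: 200, remainingCents: 5020 }
console.log(authorize(credential, 6000)); // { status: 402, remainingCents: 5020 }
```

A production gateway would also need to make the read-and-update step atomic so that concurrent requests cannot overspend. This sketch ignores concurrency.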

Hard Caps, Not Soft Alerts

Most cost management tools send you an email when spending exceeds a threshold. That's a soft alert. Useful for humans who check email regularly. Useless for an agent loop burning $80/hour at 3am.

An economic firewall enforces hard caps. When budget reaches zero, the next request gets a 402 response. The agent can handle that gracefully — finish its current task, report partial results, request a budget increase. But it cannot continue spending.

This is the difference between a smoke alarm and a firewall. A smoke alarm tells you there's a problem. A firewall prevents the damage from spreading.
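On the agent side, handling the hard cap gracefully can be as simple as treating 402 as a stop signal. The sketch below uses a stub gateway (`makeGateway`) in place of real HTTP calls, and `runResearch` is a made-up agent loop. Both are assumptions for illustration.

```javascript
// Stub standing in for a budget-enforcing gateway: it allows `budgetCalls`
// calls, then answers 402. A real client would issue HTTP requests instead.
function makeGateway(budgetCalls) {
  let used = 0;
  return function callTool(query) {
    if (used >= budgetCalls) return { status: 402, body: null };
    used += 1;
    return { status: 200, body: `result for ${query}` };
  };
}

// Work through the queue until it is done or the gateway answers 402.
function runResearch(queries, callTool) {
  const results = [];
  for (const query of queries) {
    const response = callTool(query);
    if (response.status === 402) {
      // Out of budget: stop spending and hand back the partial results.
      return { complete: false, results, stoppedAt: query };
    }
    results.push(response.body);
  }
  return { complete: true, results, stoppedAt: null };
}

const outcome = runResearch(["a", "b", "c", "d"], makeGateway(2));
console.log(outcome.complete, outcome.results.length, outcome.stoppedAt); // false 2 c
```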

Cost Attribution at the Tool Level

Endpoint-level cost tracking isn't granular enough for AI agents. When an agent calls an MCP server, a single "endpoint" might expose dozens of tools with wildly different costs: web_search at $0.015 per call, code_execution at $0.002, dalle_generate at $0.08.

An economic firewall tracks costs at the tool level, not just the endpoint level. You can see that Agent-47 spent $12.30 on web searches, $0.80 on code execution, and $34.00 on image generation — all through the same MCP server. You can set per-tool budgets. You can block expensive tools while allowing cheap ones.

This matters because cost attribution is the foundation of cost control. If you can't see where money is going at the tool level, you can't make informed decisions about where to set limits.
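As a sketch of what tool-level attribution could look like, the ledger below keys spend by agent and tool and applies an optional per-tool cap. The `ToolLedger` class is hypothetical. The prices are the ones quoted above, and the call counts (820, 400, 425) are back-calculated from the Agent-47 totals rather than taken from real logs.

```javascript
// Prices from the example above, in integer micro-dollars.
const PRICE_MICROS = { web_search: 15000, code_execution: 2000, dalle_generate: 80000 };

class ToolLedger {
  constructor(toolCapsMicros = {}) {
    this.toolCaps = toolCapsMicros; // optional per-tool caps, applied per agent
    this.spent = new Map();         // "agent/tool" -> micro-dollars
  }

  // Records the spend and returns true, or returns false if the cap would be exceeded.
  charge(agent, tool) {
    const key = `${agent}/${tool}`;
    const next = (this.spent.get(key) || 0) + PRICE_MICROS[tool];
    const cap = this.toolCaps[tool];
    if (cap !== undefined && next > cap) return false;
    this.spent.set(key, next);
    return true;
  }

  dollars(agent, tool) {
    return (this.spent.get(`${agent}/${tool}`) || 0) / 1e6;
  }
}

const ledger = new ToolLedger({ dalle_generate: 34000000 }); // $34 cap on images
for (let i = 0; i < 820; i++) ledger.charge("Agent-47", "web_search");
for (let i = 0; i < 400; i++) ledger.charge("Agent-47", "code_execution");
for (let i = 0; i < 425; i++) ledger.charge("Agent-47", "dalle_generate");

console.log(ledger.dollars("Agent-47", "web_search"));     // 12.3
console.log(ledger.dollars("Agent-47", "code_execution")); // 0.8
console.log(ledger.dollars("Agent-47", "dalle_generate")); // 34
console.log(ledger.charge("Agent-47", "dalle_generate"));  // false
```

Keeping amounts as integers sidesteps rounding errors that would otherwise accumulate across thousands of sub-cent charges.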

Cryptographic Delegation with Macaroons

Here's a scenario that breaks traditional access control: Agent A needs to delegate a subtask to Agent B. Agent A has $500 of budget and full read-write access. It wants to give Agent B $50 and read-only access.

With API keys, you'd need to provision a new key with the right scopes, register it in your identity provider, and manage its lifecycle. With an economic firewall using macaroon-based tokens, Agent A simply attenuates its own credential:

```javascript
// Agent A's token: $500 budget, read-write scope
const agentAToken = "macaroon:v1:abc...";

// Agent A attenuates for Agent B
const agentBToken = attenuate(agentAToken, {
  budget: 50,          // $50 cap (deducted from A's budget)
  scope: "read-only",  // Can't write
  tools: ["web_search", "summarize"],  // Only these tools
  expires: "2h"        // Auto-expires
});

// Agent B CANNOT:
// - Spend more than $50
// - Write anything
// - Call dalle_generate or code_execution
// - Escalate its own permissions
```

The constraints are cryptographic, not policy-based. Agent B can't modify or forge the token to escalate its privileges. It can only further attenuate — passing an even more restricted token to Agent C. This creates a natural delegation hierarchy where capabilities only flow downward.

No central policy server. No admin portal. No RBAC matrix to maintain. The token is the policy.
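Why can't Agent B strip a restriction? In the general macaroon construction, each caveat is folded into the signature with a chained HMAC, so removing or editing a caveat invalidates the signature. The sketch below shows that mechanism with Node's built-in `crypto` module. The function names (`mint`, `addCaveat`, `verify`), the caveat strings, and the root key are all illustrative assumptions, and this is not SatGate's token format.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

const hmac = (key, msg) => createHmac("sha256", key).update(msg).digest();

// Gateway side: mint a token bound to a secret root key.
function mint(rootKey, id) {
  return { id, caveats: [], sig: hmac(rootKey, id) };
}

// Holder side: add a restriction. Needs only the current signature, not the root key.
function addCaveat(token, caveat) {
  return { id: token.id, caveats: [...token.caveats, caveat], sig: hmac(token.sig, caveat) };
}

// Gateway side: replay the HMAC chain from the root key and compare signatures.
function verify(rootKey, token) {
  let sig = hmac(rootKey, token.id);
  for (const caveat of token.caveats) sig = hmac(sig, caveat);
  return sig.length === token.sig.length && timingSafeEqual(sig, token.sig);
}

const rootKey = "gateway-secret"; // hypothetical; a real key would be random bytes
const agentA = addCaveat(mint(rootKey, "agent-a"), "budget_usd <= 500");
const agentB = addCaveat(addCaveat(agentA, "budget_usd <= 50"), "scope = read-only");

// Agent B tries to drop its $50 caveat while keeping its signature.
const forged = { ...agentB, caveats: ["budget_usd <= 500", "scope = read-only"] };

console.log(verify(rootKey, agentB)); // true
console.log(verify(rootKey, forged)); // false
```

This only checks integrity. A complete verifier would also evaluate every caveat against the incoming request and reject the call if any one fails, which is why appended caveats can narrow a token but never widen it.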

The Three-Layer Model: Observe, Control, Charge

An economic firewall operates across three functional layers:

Observe — Audit and log every agent API call with full cost attribution. Which agent, which tool, which cost center, how much. This is the foundation — you can't control what you can't see.

Control — Enforce budgets in real time. Hard caps, per-tool limits, delegation constraints. When the budget is spent, the request is blocked.

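Charge — Monetize API access via L402 payments. L402 is a protocol that pairs the HTTP 402 status code with macaroons and Lightning Network payments, so agents pay per call using cryptographic payment proofs, with no accounts, no invoices, and no billing portals.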

A critical distinction: Control and Charge are parallel use cases, not sequential stages. They both build on the Observe layer, but they serve different audiences:

  • Observe → Control is the enterprise path. You're running agents internally and need to govern their spending. Budget enforcement, cost attribution, delegation hierarchies.
  • Observe → Charge is the API monetization path. You're exposing APIs or MCP tools to external agents and want to get paid per call. L402 micropayments, usage-based billing, no signup required.

Most organizations will start with Observe (because you need visibility before you can set sensible limits), then branch into Control, Charge, or both depending on whether they're consuming or selling API access.
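For the Charge path, the client side of an L402 exchange comes down to two steps: read the challenge from a 402 response, then retry with proof of payment. The sketch below assumes the header layout from the public L402 protocol specification (`WWW-Authenticate: L402 macaroon="...", invoice="..."` and `Authorization: L402 <macaroon>:<preimage>`). The function names and sample values are made up, and SatGate's own handling may differ.

```javascript
// Parse a challenge such as:
//   WWW-Authenticate: L402 macaroon="<base64>", invoice="<bolt11>"
function parseL402Challenge(headerValue) {
  const match = /^L402\s+(.*)$/i.exec(headerValue.trim());
  if (!match) return null;
  const params = {};
  for (const [, key, value] of match[1].matchAll(/(\w+)="([^"]*)"/g)) {
    params[key] = value;
  }
  if (!params.macaroon || !params.invoice) return null;
  return { macaroon: params.macaroon, invoice: params.invoice };
}

// After paying the invoice, the client retries with the macaroon and the
// payment preimage as its proof of payment.
function buildL402Authorization(macaroon, preimageHex) {
  return `L402 ${macaroon}:${preimageHex}`;
}

// Placeholder values, not a real macaroon or invoice.
const challenge = parseL402Challenge('L402 macaroon="AGIAJEem", invoice="lnbc1500n1example"');
console.log(challenge.macaroon, challenge.invoice);              // AGIAJEem lnbc1500n1example
console.log(buildL402Authorization(challenge.macaroon, "00ff")); // L402 AGIAJEem:00ff
```

Paying the invoice and obtaining the preimage happen in a Lightning wallet or node, which is outside the scope of this sketch.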

Why This Matters Now

The agent economy is scaling fast. Coding agents run overnight. Customer support agents handle thousands of conversations. Research agents crawl the web autonomously. Multi-agent orchestrations delegate tasks across dozens of sub-agents.

Every one of these agents is spending someone's money on API calls. And the question isn't whether runaway spend will happen — it's whether you'll catch it in real time or on the monthly invoice.

Economic governance is becoming a prerequisite for safe agentic AI deployment. Not a nice-to-have. Not a future concern. A prerequisite — the same way you wouldn't deploy a web application without authentication, or expose an API without rate limiting. The cost dimension is now a first-class security concern.


SatGate is an open-source economic firewall with per-agent budgets, per-tool cost attribution, and macaroon delegation.
