Radosław

Posted on Mar 18

🛡️ How to Stop Your AI Agent from Sending 10,000 Emails in a Loop

#ai #opensource #typescript #programming

You ship an AI agent that can send emails. It works great in testing.

Then one night, the agent hits a retry loop. A flaky API responds slowly, the agent interprets the delay as failure, and it tries again. And again. By morning, a single user has received 847 confirmation emails. Your support inbox is on fire. Your API provider has suspended your account.

This isn't a hypothetical. It's the kind of thing that happens when you give agents real tools and don't put guardrails around how often they can use them.

In the first article, I introduced Guardio - a policy enforcement proxy that sits between your AI agent and the outside world. Today, I want to show you one of its newest built-in policies: rate limiting.

Why Rate Limiting Is Different for AI Agents

With traditional APIs, rate limiting is simple: a client sends too many requests, the server returns a 429, and the client backs off. Problem solved.

AI agents are messier.

They can retry silently without you noticing until it's too late
They don't always respect error signals the way a human-coded client would
A single agent decision (like "send a daily summary") can be triggered hundreds of times if the agent's context gets corrupted or the loop condition misbehaves
Different tools deserve different limits - spamming a read-only knowledge base is annoying; spamming a billing endpoint is catastrophic

You need rate limiting that is per-tool, deterministic, and enforced outside the agent - so the agent literally cannot exceed it, regardless of how it behaves.

That's exactly what Guardio's rate-limit-tool policy plugin does.

Quick Recap: What Is Guardio?

Guardio is a proxy you run alongside your AI agent. Every tool call your agent makes (to an MCP server, an external API, a database) passes through Guardio first. Guardio evaluates it against your configured policies, and only forwards it if it's allowed.

AI Agent → Guardio → MCP Tool / External API

No AI in the enforcement path. No prompt engineering. Just hard rules.

Setting Up Guardio

If you haven't set it up yet, one command scaffolds a full project:

npx create-guardio

You'll be prompted to choose:

A project directory name
The HTTP port Guardio will listen on (default: 3939)
A storage backend (SQLite is the easiest to start with)
Whether to install the dashboard UI

Once scaffolded:

cd guardio-project
npm install
npm run guardio

Then point your AI agent or MCP client at http://127.0.0.1:3939 instead of directly at your tools.

Your config lives in guardio.config.ts. Here's a minimal example with an MCP tool connected:

// guardio.config.ts
import type { GuardioConfig } from "@guardiojs/guardio";

const config: GuardioConfig = {
  client: {
    port: 3939,
  },
  servers: [
    {
      name: "email-tool",
      type: "url",
      url: "https://your-mcp-email-server.com/sse",
    },
  ],
  plugins: [
    {
      type: "storage",
      name: "sqlite",
      config: { database: "guardio.sqlite" },
    },
  ],
};

export default config;

Your agent connects to http://127.0.0.1:3939/email-tool/sse - Guardio is now in the middle.

Introducing `rate-limit-tool`

The rate-limit-tool policy plugin enforces a maximum number of calls to any given tool within a fixed time window. It's a built-in plugin shipped with Guardio - no extra installation needed.

The configuration is intentionally simple:

Field	Type	Description
`limit`	number	Maximum calls allowed in the window
`windowSeconds`	number	Duration of the time window, in seconds

For example: limit: 5, windowSeconds: 60 means no more than 5 calls per minute.
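
The implementation shown later in this article validates both fields with zod as integers of at least 1. Here's a rough, dependency-free sketch of the same constraint. The `parseRateLimitConfig` and `requirePositiveInt` helpers are made up for illustration and are not part of Guardio:

```typescript
// Hypothetical helpers, not Guardio APIs: they mirror the "integer >= 1" rule
// that the article's zod schema applies to both config fields.
interface RateLimitToolConfig {
  limit: number;
  windowSeconds: number;
}

function requirePositiveInt(name: string, value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value < 1) {
    throw new Error(`${name} must be an integer >= 1`);
  }
  return value;
}

function parseRateLimitConfig(input: unknown): RateLimitToolConfig {
  if (typeof input !== "object" || input === null) {
    throw new Error("config must be an object");
  }
  const raw = input as Record<string, unknown>;
  return {
    limit: requirePositiveInt("limit", raw["limit"]),
    windowSeconds: requirePositiveInt("windowSeconds", raw["windowSeconds"]),
  };
}

// "No more than 5 calls per minute", as in the example above.
const perMinute = parseRateLimitConfig({ limit: 5, windowSeconds: 60 });
console.log(perMinute);
```

Values like `limit: 0` or `windowSeconds: 1.5` are rejected up front, before any call is counted against them.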

How It Works Under the Hood

The plugin uses fixed time windows; it doesn't slide. Windows are aligned to the clock rather than to the agent's first call, so with a 60-second window they run 0:00–1:00, 1:00–2:00, and so on. Simple and predictable.
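
To make the window arithmetic concrete, here's a small sketch that assumes windows are counted from the Unix epoch, which is what the `Math.floor(now / windowMs)` line in the implementation further down suggests. The `fixedWindow` helper is illustrative, not a Guardio export:

```typescript
// Sketch of epoch-aligned fixed windows; `fixedWindow` is an illustrative
// helper, not something Guardio exports.
interface WindowInfo {
  index: number; // number of whole windows elapsed since the Unix epoch
  startsAt: string; // ISO timestamp where the current window begins
  resetsAt: string; // ISO timestamp where the next window begins
}

function fixedWindow(nowMs: number, windowSeconds: number): WindowInfo {
  const windowMs = windowSeconds * 1000;
  const index = Math.floor(nowMs / windowMs);
  return {
    index,
    startsAt: new Date(index * windowMs).toISOString(),
    resetsAt: new Date((index + 1) * windowMs).toISOString(),
  };
}

// Two calls one second apart can land in different windows.
const before = fixedWindow(Date.UTC(2025, 2, 18, 12, 0, 59), 60);
const after = fixedWindow(Date.UTC(2025, 2, 18, 12, 1, 0), 60);
console.log(before.resetsAt); // 2025-03-18T12:01:00.000Z
console.log(after.index - before.index); // 1
```

The trade-off of fixed windows is the boundary: with limit: 5, an agent can make 5 calls at 0:59 and 5 more at 1:00, so a short burst that straddles a boundary can reach twice the limit.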

State (current count and window start) is stored in the PluginRepository - meaning it persists across requests and survives restarts if you're using SQLite or PostgreSQL. If no storage is configured, the plugin fails open (allows all calls) and logs a warning. This is a deliberate design choice: Guardio doesn't silently break your agent in misconfigured environments.

When the limit is exceeded, the agent receives a structured block response. It is not a raw error but a clean JSON-RPC success result with a human-readable reason:

Rate limit exceeded: 5/5 calls in 60s window. Resets at 2025-03-18T12:01:00.000Z.

Agent frameworks won't choke on this. They get a clear message they can surface or log.
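
On the client side, handling that message could look roughly like this. The result shape is assumed from the blocked-call example later in this article, and `describeToolResult` is a hypothetical helper of my own, not something Guardio or an MCP SDK provides:

```typescript
// Result shape assumed from the blocked-call example later in this article;
// `describeToolResult` is a hypothetical client-side helper.
interface ToolResult {
  isError?: boolean;
  content: { type: string; text?: string }[];
  _guardio?: { action: string; policyId: string; code: string };
}

function describeToolResult(result: ToolResult): string {
  // Collect every text part into one loggable string.
  const text = result.content
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text)
    .join(" ");
  const guardio = result._guardio;
  if (guardio !== undefined && guardio.action === "BLOCKED") {
    return `blocked by ${guardio.policyId} (${guardio.code}): ${text}`;
  }
  return result.isError ? `tool error: ${text}` : `ok: ${text}`;
}

const blockedResult: ToolResult = {
  isError: true,
  content: [
    {
      type: "text",
      text: "Rate limit exceeded: 5/5 calls in 60s window. Resets at 2025-03-18T12:01:00.000Z.",
    },
  ],
  _guardio: { action: "BLOCKED", policyId: "rate-limit-tool", code: "RATE_LIMIT_EXCEEDED" },
};
console.log(describeToolResult(blockedResult));
```

Branching on the machine-readable code rather than on the message text keeps the handler stable if the wording of the reason ever changes.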

Configuring the Policy via the Dashboard

If you installed the Guardio dashboard, configuring rate limits is point-and-click.

Open the dashboard (npm run dashboard)
Navigate to Policies
Create a new policy, select rate-limit-tool
Fill in limit and windowSeconds
Assign it to the tool(s) you want to protect

You can create multiple instances of the policy with different limits - for example, a strict limit on your email tool and a more generous one on a read-only search tool.

Configuring the Policy in Code

If you prefer to manage things programmatically, you can wire up the plugin directly. Here's the full implementation for reference - this is exactly what's shipping in Guardio:

import { z } from "zod";
import type {
  PolicyPluginInterface,
  PolicyRequestContext,
  PolicyResult,
  PluginRepository,
} from "@guardiojs/guardio";

const rateLimitToolConfigSchema = z.object({
  limit: z.number().int().min(1),
  windowSeconds: z.number().int().min(1),
});

class RateLimitToolPolicyPlugin implements PolicyPluginInterface {
  readonly name = "rate-limit-tool";

  constructor(
    private readonly limit: number,
    private readonly windowSeconds: number,
    private readonly repo?: PluginRepository,
  ) {}

  async evaluate(context: PolicyRequestContext): Promise<PolicyResult> {
    // Fail open: without a repository there is nowhere to keep counters.
    if (!this.repo) return { verdict: "allow" };

    // Fixed windows are numbered from the Unix epoch; despite its name,
    // currentWindowStart holds the index of the current window.
    const windowMs = this.windowSeconds * 1000;
    const now = Date.now();
    const currentWindowStart = Math.floor(now / windowMs);
    // One counter document per tool.
    const contextKey = `ratelimit:${context.toolName}`;

    const doc = await this.repo.getDocument(contextKey);
    const stored = doc?.data as { windowStart: number; count: number } | undefined;

    // A count stored for an earlier window is stale, so start again from zero.
    const isNewWindow = (stored?.windowStart ?? 0) !== currentWindowStart;
    const currentCount = isNewWindow ? 0 : (stored?.count ?? 0);

    // The next window begins exactly where the current one ends.
    const resetsAt = new Date((currentWindowStart + 1) * windowMs).toISOString();

    // Blocked calls return early and are not counted.
    if (currentCount >= this.limit) {
      return {
        verdict: "block",
        code: "RATE_LIMIT_EXCEEDED",
        reason: `Rate limit exceeded: ${currentCount}/${this.limit} calls in ${this.windowSeconds}s window. Resets at ${resetsAt}.`,
        metadata: { currentCount, limit: this.limit, windowSeconds: this.windowSeconds, resetsAt },
      };
    }

    // Record the allowed call before letting it through.
    await this.repo.saveDocument(contextKey, {
      windowStart: currentWindowStart,
      count: currentCount + 1,
    }, doc?.id);

    return { verdict: "allow" };
  }
}

A few things worth noticing here:

Per-tool keying: the storage key is ratelimit:{toolName}, so each tool gets its own independent counter. Exceeding the limit on send_email doesn't affect search_docs.
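Atomic-ish updates: the plugin reads the current count, increments it, and saves it as three separate steps, so two calls that arrive at the same moment can read the same count and both pass. For typical agent workloads, where calls to a given tool arrive one at a time, this is more than sufficient. For very high-concurrency scenarios you'd want to pair it with a more robust store, ideally one that can increment atomically.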
Clean metadata: the PolicyResult carries currentCount, limit, and resetsAt in metadata - so your event sink and dashboard can surface real usage data, not just "blocked".
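
If the read-then-save sequence worries you, one option inside a single process is to queue the calls per key. This is only a sketch of that idea: `KeyedMutex` and the in-memory counter are my own illustrative names, not Guardio APIs, and the 5 ms sleep merely stands in for storage latency:

```typescript
// Illustrative only: `KeyedMutex` and the in-memory counter are assumptions
// made for this sketch, not Guardio APIs.
class KeyedMutex {
  private readonly tails = new Map<string, Promise<void>>();

  // Runs `task` only after every earlier task for the same key has settled.
  run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    const result = previous.then(task);
    // Swallow rejections on the tail so one failed task cannot block the queue.
    this.tails.set(key, result.then(() => undefined, () => undefined));
    return result;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Read, wait, then write: the same read-modify-write shape as a
// getDocument call followed by a saveDocument call.
async function incrementSlowly(counts: Map<string, number>, key: string): Promise<void> {
  const current = counts.get(key) ?? 0;
  await sleep(5);
  counts.set(key, current + 1);
}

async function demo(): Promise<{ racy: number; serialized: number }> {
  const counts = new Map<string, number>();
  const mutex = new KeyedMutex();
  const five = [1, 2, 3, 4, 5];
  // All five calls read 0 before any of them writes, so four updates are lost.
  await Promise.all(five.map(() => incrementSlowly(counts, "racy")));
  // Queued per key, each call sees the previous call's write.
  await Promise.all(
    five.map(() => mutex.run("serialized", () => incrementSlowly(counts, "serialized"))),
  );
  return { racy: counts.get("racy") ?? 0, serialized: counts.get("serialized") ?? 0 };
}

demo().then((result) => console.log(result)); // { racy: 1, serialized: 5 }
```

This only helps within one process. If several proxy instances share the same database, the guarantee has to come from the store itself, for example through a transaction or an atomic increment.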

A Practical Example: Protecting an Email Tool

Say your agent has access to a send_email MCP tool. You want to allow it to send at most 10 emails per hour - enough for normal operation, but a hard cap against runaway loops.

Set up Guardio with:

limit: 10
windowSeconds: 3600

Assign this policy to the send_email tool in the dashboard (or via config).

Now, when the agent calls send_email for the 11th time in the same hour-long window, it gets back:

{
  "isError": true,
  "content": [
    {
      "type": "text",
      "text": "Rate limit exceeded: 10/10 calls in 3600s window. Resets at 2025-03-18T13:00:00.000Z."
    }
  ],
  "_guardio": {
    "action": "BLOCKED",
    "policyId": "rate-limit-tool",
    "code": "RATE_LIMIT_EXCEEDED"
  }
}

The email is never sent. The upstream server never sees the request. And in your dashboard, you have a full audit trail of every allowed and blocked call.
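
You can replay this scenario without a running proxy. The sketch below reimplements the same fixed-window counting in memory with an explicit timestamp parameter. `FixedWindowLimiter` is an illustrative class for this article, not Guardio's plugin:

```typescript
// In-memory sketch of fixed-window counting with an explicit clock value.
// `FixedWindowLimiter` is illustrative and is not Guardio's plugin class.
class FixedWindowLimiter {
  private readonly state = new Map<string, { windowIndex: number; count: number }>();
  private readonly limit: number;
  private readonly windowSeconds: number;

  constructor(limit: number, windowSeconds: number) {
    this.limit = limit;
    this.windowSeconds = windowSeconds;
  }

  // Returns true if the call is allowed at time `nowMs`, false if it is blocked.
  tryCall(toolName: string, nowMs: number): boolean {
    const windowIndex = Math.floor(nowMs / (this.windowSeconds * 1000));
    const stored = this.state.get(toolName);
    const count = stored !== undefined && stored.windowIndex === windowIndex ? stored.count : 0;
    if (count >= this.limit) return false;
    this.state.set(toolName, { windowIndex, count: count + 1 });
    return true;
  }
}

const limiter = new FixedWindowLimiter(10, 3600);
const noon = Date.UTC(2025, 2, 18, 12, 0, 0);

// Eleven calls one minute apart, all inside the 12:00-13:00 window.
const verdicts: boolean[] = [];
for (let i = 0; i < 11; i++) {
  verdicts.push(limiter.tryCall("send_email", noon + i * 60_000));
}
console.log(verdicts.filter(Boolean).length); // 10
console.log(verdicts[10]); // false

// Other tools and later windows start from zero.
console.log(limiter.tryCall("search_docs", noon)); // true
console.log(limiter.tryCall("send_email", noon + 3_600_000)); // true
```

Passing the timestamp in, instead of calling Date.now() inside the limiter, is what makes the scenario reproducible in a test.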

Stacking Policies

Rate limiting doesn't have to stand alone. Guardio evaluates policies as a chain - if any returns block, the call is stopped. This means you can combine rate-limit-tool with other policies:

deny-regex-parameter - block calls where an argument matches a pattern (e.g. block emails to *@competitor.com)
deny-tool-access - block the tool entirely for specific agents
Your own custom policy plugin - any TypeScript class that implements PolicyPluginInterface

A real setup might look like: rate limit the email tool to 10/hour, AND block any call where the recipient matches a known bad domain. Both policies apply. Either one can stop the call.
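
As a mental model, the chain can be pictured like this. Everything here is a simplified assumption: the `Verdict` and `Policy` types, the `DENIED_PARAMETER` code, and the choice to stop at the first block are mine, and Guardio's real interfaces carry more fields:

```typescript
// Hypothetical, simplified shapes; Guardio's real interfaces carry more fields.
type Verdict =
  | { verdict: "allow" }
  | { verdict: "block"; code: string; reason: string };

interface ToolCall {
  toolName: string;
  args: Record<string, unknown>;
}

type Policy = (call: ToolCall) => Verdict;

// Evaluate policies in order and stop at the first block (an assumption).
function evaluateChain(policies: Policy[], call: ToolCall): Verdict {
  for (const policy of policies) {
    const result = policy(call);
    if (result.verdict === "block") return result;
  }
  return { verdict: "allow" };
}

// Stand-in for a deny-regex-parameter style rule on the recipient.
function denyRecipientDomain(pattern: RegExp): Policy {
  return (call) => {
    const to = call.args["to"];
    if (typeof to === "string" && pattern.test(to)) {
      return { verdict: "block", code: "DENIED_PARAMETER", reason: `recipient matches ${pattern}` };
    }
    return { verdict: "allow" };
  };
}

// Stand-in for a rate limit: a plain call counter without time windows.
function maxCalls(limit: number): Policy {
  let used = 0;
  return () => {
    if (used >= limit) {
      return { verdict: "block", code: "RATE_LIMIT_EXCEEDED", reason: `${used}/${limit} calls used` };
    }
    used += 1;
    return { verdict: "allow" };
  };
}

const emailPolicies = [denyRecipientDomain(/@competitor\.com$/), maxCalls(10)];
const verdict = evaluateChain(emailPolicies, {
  toolName: "send_email",
  args: { to: "ceo@competitor.com" },
});
console.log(verdict.verdict); // block
```

In this sketch the order matters: with the deny rule first, a call rejected for its recipient never reaches the counter, so it doesn't use up any of the rate-limit budget.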

Try It

npx create-guardio

🔗 GitHub: https://github.com/radoslaw-sz/guardio

If this solves a problem you've been staring at, a ⭐ on GitHub goes a long way. And if you have a policy use case you'd like to see built in - open an issue.
