MrClaw207

Posted on Jun 16

The MCP Gateway Pattern: Managing 13,000+ Servers Without Losing Your Mind

#ai #llm #agents #llmtools

You've got 47 MCP servers running. Your agent needs the GitHub one, the Postgres one, and the Slack one — but it also has access to 44 others it absolutely does not need for this task. You shipped this to production last week. Someone asked it to "check our repos" and it got curious.

The MCP ecosystem hit 13,000 registered servers and 97 million monthly downloads. That's a lot of capability. It's also a lot of attack surface.

This isn't a hypothetical. Every serious AI agent deployment I've seen either ignores the problem (everyone gets everything) or hardcodes an allowlist at startup (rigid, breaks on server updates). There has to be a better way.

The Gateway Pattern

A dedicated MCP gateway sits between your agent and the server ecosystem. It handles:

Server discovery and registration — what servers exist, what tools do they expose?
Request routing — which servers is this particular agent/task allowed to call?
Audit logging — what did each agent call, when, and what came back?

Think of it like a reverse proxy for your AI's tool access. The agent talks to the gateway. The gateway talks to the actual servers.

Here's a minimal gateway in Python using the official MCP SDK:

import os

import mcp.server.stdio
from mcp.server import Server
from mcp.types import CallToolResult, Tool

server = Server("mcp-gateway")

# Registry: which servers this gateway knows about
SERVER_REGISTRY = {
    "github": {"url": "https://mcp.github.com", "tools": ["create_issue", "list_repos"]},
    "postgres": {"url": "https://mcp.postgres.com", "tools": ["query", "schema"]},
    "slack": {"url": "https://mcp.slack.com", "tools": ["post_message", "list_channels"]},
}

# Policy: which servers are allowed for which agents
AGENT_POLICIES = {
    "code-review-agent": ["github"],
    "data-agent": ["postgres"],
    "comms-agent": ["slack"],
}

@server.list_tools()
async def list_tools() -> list[Tool]:
    # The SDK calls this handler with no arguments, so the agent's identity
    # has to come from elsewhere. Assumption: one gateway process per agent,
    # launched with a GATEWAY_AGENT_ID environment variable (hypothetical name).
    agent_id = os.environ.get("GATEWAY_AGENT_ID", "")
    allowed = AGENT_POLICIES.get(agent_id, [])
    return [
        Tool(
            name=f"{srv}:{t}",
            description=f"{srv}::{t}",
            # inputSchema is required; a real gateway would copy the upstream schema
            inputSchema={"type": "object"},
        )
        for srv in allowed
        for t in SERVER_REGISTRY[srv]["tools"]
    ]

The key insight: the agent never sees tools it isn't allowed to call. The gateway filters at the protocol level, not the prompt level.

Routing Requests

When the agent calls a tool, the gateway extracts the namespace from the tool name (github from github:create_issue) and routes accordingly:

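import os

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> CallToolResult:
    # The SDK passes only the tool name and arguments. Assumption: the agent's
    # identity comes from a GATEWAY_AGENT_ID environment variable (hypothetical name).
    agent_id = os.environ.get("GATEWAY_AGENT_ID", "")
    allowed = AGENT_POLICIES.get(agent_id, [])

    # Reject names with no namespace or with a namespace outside the allowlist
    server_name, sep, tool_name = name.partition(":")
    if not sep or server_name not in allowed:
        raise ValueError(f"Agent {agent_id} not authorized for {name}")

    target = SERVER_REGISTRY[server_name]

    # Log the call (log_request is a helper not shown here)
    log_request(agent_id, server_name, tool_name, arguments)

    # Forward to actual server (forward_to_server is a helper not shown here)
    return await forward_to_server(target["url"], tool_name, arguments)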

No tools in the list means no tools the agent can discover, and the check in call_tool rejects any name it guesses at anyway. This isn't a soft guideline: the gateway enforces it on every call.
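The authorization step in call_tool is easier to unit test when it lives in a pure function with no SDK dependency. Here's a minimal sketch, assuming the same AGENT_POLICIES shape used earlier; parse_and_authorize and GatewayAuthError are hypothetical names, not part of the MCP SDK:

```python
class GatewayAuthError(Exception):
    """Raised when an agent asks for a tool outside its policy (hypothetical name)."""


# Same shape as the policy table earlier in the post.
AGENT_POLICIES = {
    "code-review-agent": ["github"],
    "data-agent": ["postgres"],
    "comms-agent": ["slack"],
}


def parse_and_authorize(
    agent_id: str, name: str, policies: dict[str, list[str]]
) -> tuple[str, str]:
    """Split 'server:tool' and check the server against the agent's allowlist."""
    # partition splits on the first colon only, so tool names may contain colons.
    server_name, sep, tool_name = name.partition(":")
    if not sep or not server_name or not tool_name:
        raise GatewayAuthError(f"Malformed tool name: {name!r}")
    # Unknown agents get an empty allowlist, so the default is deny.
    if server_name not in policies.get(agent_id, []):
        raise GatewayAuthError(f"Agent {agent_id} not authorized for {name}")
    return server_name, tool_name
```

The handler can then call this function first and only touch the registry once it returns. The deny-by-default branch is the part worth covering with tests.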

What This Actually Solves

The blast radius problem. If an agent has access to 13,000 tools and uses one maliciously, that's a catastrophic blast radius. A gateway lets you enforce least-privilege at the protocol level, not the prompt level. Prompts can be jailbroken. A policy check in the gateway can't be argued with, though it is only as strong as the gateway itself.

The audit trail. Every tool call goes through the gateway, so you get structured logs automatically:

{
  "agent_id": "code-review-agent",
  "server": "github",
  "tool": "create_issue",
  "args": {"title": "...", "body": "..."},
  "timestamp": "2026-06-16T09:00:00Z",
  "status": "success"
}

You can replay this, alert on it, or use it for cost attribution.
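The routing snippet calls a log_request helper without showing it. One possible shape, as a sketch only: it emits one JSON line per call with the fields from the record above. The status default and the sink parameter are my assumptions, not something the original code defines.

```python
import json
import sys
from datetime import datetime, timezone
from typing import Any, Optional, TextIO


def log_request(
    agent_id: str,
    server: str,
    tool: str,
    args: dict[str, Any],
    status: str = "requested",
    sink: Optional[TextIO] = None,
) -> dict[str, Any]:
    """Write one JSON line per tool call and return the record."""
    record = {
        "agent_id": agent_id,
        "server": server,
        "tool": tool,
        "args": args,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status": status,
    }
    # Default to stderr: on a stdio transport, stdout carries protocol messages.
    out = sink if sink is not None else sys.stderr
    out.write(json.dumps(record) + "\n")
    return record
```

Because the gateway logs before forwarding, the first record can only say "requested". A second call after the forward returns could record "success" or "error". Arguments may contain secrets, so a production version would likely redact them before writing.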

The drift problem. New MCP servers get added to the ecosystem constantly. A gateway with a registration model means new servers don't automatically become available — they have to be explicitly registered and policy-assigned. No surprise tool access.
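The registration model can be sketched as two separate steps: registering a server makes it known to the gateway, and granting it to an agent makes it visible. The GatewayRegistry class and the example.com URL below are hypothetical, for illustration only:

```python
from dataclasses import dataclass, field


@dataclass
class GatewayRegistry:
    """Known servers and per-agent grants, kept deliberately separate."""

    servers: dict[str, dict] = field(default_factory=dict)
    grants: dict[str, set[str]] = field(default_factory=dict)

    def register(self, name: str, url: str, tools: list[str]) -> None:
        # Registration alone exposes nothing to any agent.
        self.servers[name] = {"url": url, "tools": list(tools)}

    def grant(self, agent_id: str, server: str) -> None:
        # Granting an unregistered server is an error, not a silent no-op.
        if server not in self.servers:
            raise KeyError(f"Unknown server: {server}")
        self.grants.setdefault(agent_id, set()).add(server)

    def visible_tools(self, agent_id: str) -> list[str]:
        return sorted(
            f"{srv}:{tool}"
            for srv in self.grants.get(agent_id, set())
            for tool in self.servers[srv]["tools"]
        )


registry = GatewayRegistry()
registry.register("jira", "https://jira-mcp.example.com", ["create_ticket"])
tools_before_grant = registry.visible_tools("code-review-agent")
registry.grant("code-review-agent", "jira")
tools_after_grant = registry.visible_tools("code-review-agent")
```

Here tools_before_grant is empty even though the server is registered. The tool only shows up after the explicit grant.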

Where It Breaks Down

I'm not going to pretend this is a complete solution. A few real problems:

Latency. Every tool call goes through an extra network hop. For latency-sensitive workflows (coding agents making dozens of rapid tool calls), this adds up. Profile before you deploy.
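One way to do that profiling is to time every forwarded call inside the gateway and look at percentiles, not averages. A rough sketch using only the standard library; LatencyRecorder is a hypothetical helper, and it measures the whole forwarded call as seen by the gateway, so you'd compare it against a direct-to-server baseline to isolate the overhead:

```python
import asyncio
import statistics
import time


class LatencyRecorder:
    """Collects wall-clock durations (in milliseconds) of awaited calls."""

    def __init__(self) -> None:
        self.samples_ms: list[float] = []

    async def timed(self, coro_fn, *args, **kwargs):
        start = time.perf_counter()
        try:
            return await coro_fn(*args, **kwargs)
        finally:
            # Record the duration even when the call raises.
            self.samples_ms.append((time.perf_counter() - start) * 1000)

    def summary(self) -> dict[str, float]:
        if len(self.samples_ms) < 2:
            raise ValueError("need at least two samples")
        # n=20 yields 19 cut points; index 18 is the 95th percentile.
        cuts = statistics.quantiles(self.samples_ms, n=20, method="inclusive")
        return {
            "count": len(self.samples_ms),
            "p50_ms": statistics.median(self.samples_ms),
            "p95_ms": cuts[18],
        }
```

Inside the gateway, the forward step would become something like await recorder.timed(forward_to_server, url, tool_name, arguments).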

Gateway as a single point of failure. If the gateway goes down, your agents can't call any tools. High availability means running the gateway redundantly, which is doable but not free.

Policy management at scale. When you have 50 agents with different permission levels, a hardcoded dict or static config file falls apart. You need a real policy store: OPA, Casbin, or a dedicated service. That's more infrastructure.
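To make the scaling problem concrete, here's the kind of rule shape that a flat agent-to-server mapping can't express: roles that grant tool-level patterns. This is not OPA or Casbin, just a standard-library illustration, and the role names and rules are made up:

```python
from fnmatch import fnmatchcase

# Hypothetical roles mapping to tool-name patterns (server:tool).
ROLE_RULES = {
    "reviewer": ["github:list_*", "github:create_issue"],
    "analyst": ["postgres:query", "postgres:schema"],
}

# Hypothetical assignment of roles to agents.
AGENT_ROLES = {
    "code-review-agent": ["reviewer"],
    "data-agent": ["analyst"],
}


def is_allowed(
    agent_id: str,
    tool_name: str,
    agent_roles: dict[str, list[str]] = AGENT_ROLES,
    role_rules: dict[str, list[str]] = ROLE_RULES,
) -> bool:
    """True if any of the agent's roles has a pattern matching the tool name."""
    return any(
        fnmatchcase(tool_name, pattern)
        for role in agent_roles.get(agent_id, [])
        for pattern in role_rules.get(role, [])
    )
```

Once rules look like this, you also need versioning, review, and testing for them, which is the point where a dedicated policy engine starts to earn its keep.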

The human factor. If your agent can dynamically request new permissions, you've just moved the attack surface from "which tools" to "how does it request new access." That's a different problem.

What I Learned

The MCP ecosystem is moving fast: 97M monthly downloads is not a toy. The tools are real, the capability is real, and the security implications are real. The gateway pattern isn't novel (it's essentially a reverse proxy), but applying it to agent tool access feels new because the problem space is new.

The most practical thing you can do right now: audit which MCP servers your production agents actually have access to. I'd bet money it's more than they need.
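If you already have gateway logs in the JSON shape shown earlier, that audit can start as a small script: compare what each agent is granted against what it has actually called. A sketch, assuming one JSON record per line with agent_id and server fields; unused_grants is a hypothetical helper name:

```python
import json
from typing import Iterable


def unused_grants(
    policies: dict[str, list[str]], audit_lines: Iterable[str]
) -> dict[str, list[str]]:
    """Return, per agent, the granted servers that never appear in the log."""
    used: dict[str, set[str]] = {}
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log file
        record = json.loads(line)
        used.setdefault(record["agent_id"], set()).add(record["server"])

    report: dict[str, list[str]] = {}
    for agent, servers in policies.items():
        idle = sorted(set(servers) - used.get(agent, set()))
        if idle:
            report[agent] = idle
    return report
```

Anything in the report is a candidate for removal from that agent's policy, as long as the log window is long enough to cover infrequent but legitimate calls.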

If you're running a multi-agent system and not thinking about this, you're building on a foundation you haven't examined. Start with the allowlist. Even a static one is better than "everything, always."


Top comments (2)

Eleftheria Batsou • Jun 16

The thing I'd flag for anyone implementing it: a gateway consolidates the control plane beautifully but it doesn't contain the data plane. If server #8,402 behind the gateway is compromised, the gateway routed the agent there cleanly and the breach still happens at the server's runtime. Gateways solve management; they don't solve containment. You want both layers. Strong writeup though, the 13k framing makes the scale problem concrete.
