What Is an MCP Proxy - And When Do You Actually Need a Gateway Instead?

#ai #mcp #llm #devops

TL;DR

An MCP proxy forwards requests between AI agents and MCP servers — it handles transport, not governance. Fast to set up, hits a wall the moment you have more than one team or more than two servers
An MCP gateway adds identity, RBAC, audit trails, and per-tool policy enforcement on top of that routing layer — it's where your organization's actual AI policy gets enforced
We started with a proxy, got bitten by the exact things proxies don't handle, and ended up needing a gateway. This post is the thing I wish I'd read before making that call

When I first started wiring up MCP servers for our engineering team, I kept running into the term "MCP proxy" and wasn't entirely sure what it meant or how it differed from an "MCP gateway." Both sit between an AI client and MCP servers. Both forward requests. The difference looked like branding more than substance.

It's not. I figured that out the expensive way.

Here's the clean explanation I eventually pieced together, plus the real-world situation that made the distinction matter.

What an MCP proxy actually is

An MCP proxy is a transport layer. Its job is protocol mediation — it forwards requests from MCP clients to MCP servers, and responses back. That's the whole thing.

The most common reason you'd reach for one is the stdio problem. Claude Code, Cursor, and most local MCP clients speak stdio: they expect to launch a server process and talk to it over stdin/stdout. But if your MCP server runs somewhere else (inside a Docker container, on a staging server, on someone else's machine), the client can't launch it as a local process. You need something in the middle that bridges the stdio interface and a network transport such as HTTP/SSE or WebSockets, so the client and server can reach each other across machines. That's a proxy.
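To make "it forwards messages and nothing else" concrete, here's a rough sketch of the stdio half of such a bridge. It assumes newline-delimited JSON messages on stdin/stdout. The function name `forward_once` and the `ECHO_SERVER` stand-in are made up for illustration, and a real proxy would keep the process alive and put an HTTP listener in front of it.

```python
import json
import subprocess
import sys


def forward_once(server_cmd: list[str], request: dict) -> dict:
    """Send one newline-delimited JSON message to a stdio server, return its reply.

    The payload is passed through untouched: no identity check, no policy, no audit.
    """
    with subprocess.Popen(
        server_cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        reply_line = proc.stdout.readline()
    return json.loads(reply_line)


# Stand-in "server" for illustration only: it answers every request with an
# empty result carrying the same id.
ECHO_SERVER = [
    sys.executable,
    "-c",
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    msg = json.loads(line)\n"
    "    print(json.dumps({'jsonrpc': '2.0', 'id': msg['id'], 'result': {}}), flush=True)\n",
]

reply = forward_once(ECHO_SERVER, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
```

Notice what's missing: nothing in `forward_once` asks who sent the request or which tool it targets.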

What a proxy does not do:

It doesn't know what a "tool call" is. It forwards bytes.
It doesn't check who is making the request or whether they should be allowed to
It doesn't enforce policies per tool or per team
It doesn't write audit logs with user attribution
It doesn't handle token management or credential storage

A proxy is the right answer when the real question is "how do I physically get this request from A to B." It's not the right answer when the question is "should this agent be allowed to run this tool, and do I have a record that it did."

For a single developer connecting a local AI client to one MCP server in a dev environment, a proxy is fine — and in fact it's probably all you need. The problems start when you scale horizontally: more developers, more servers, more agents, different teams with different access requirements.

The situation that clarified it for us

We had six MCP servers running internally: GitHub, Confluence, Jira, Sentry, Datadog, and an internal data API. Each team had configured their own local connections — developers were managing credentials themselves, there was no central record of what tools had been invoked, and anyone with a client config could reach any server.

It worked fine until it didn't.

The first problem was credential sprawl. Every developer had their own GitHub OAuth token, their own Jira API key, their own Confluence credentials. When someone left the team, we had to hunt down and revoke six separate credentials across six systems. We missed one. A contractor who had left three weeks earlier still had an active Jira key in their old laptop's MCP config. We only found out during a routine audit.

The second problem was a near-miss with prompt injection. An agent was using the Confluence MCP server to pull documentation into context. A vendor had left a support ticket in Confluence with what turned out to be an injected instruction embedded in the formatting. Claude processed the ticket content and started executing steps from the injected text before a human caught it. Nothing catastrophic happened, but it was a visceral illustration of what "no policy layer between agent and tool" actually means in practice.

The third problem was visibility. When our head of security asked "which agents have accessed our internal data API in the last 30 days, and with what parameters," the honest answer was "we don't know." We had server logs on the API itself, but no correlation to which agent or user identity had triggered each call. The audit trail stopped at the network layer.

That was the moment my team realized we didn't have a proxy problem. We had a governance problem. And a proxy wasn't going to solve it.

The actual difference: proxy vs gateway

Here's the mental model that eventually clicked for me:

A proxy answers: can this request reach its destination?

A gateway answers: should this request be allowed to happen at all — and is there a record that it did?

The distinction looks subtle on a whiteboard. In production, it's the difference between "MCP is running" and "MCP is governed."

Concretely, a gateway adds:

Identity and authentication. The gateway knows who is making the request — not just which client, but which human user, authenticated through your corporate IdP (OAuth 2.0, SAML, SSO). This is what makes access revocation work cleanly: you offboard someone in Okta, their token stops working at the gateway, and they lose access to every MCP server simultaneously.

Tool-level RBAC. Not just "team A can access the GitHub server" but "team A can use search_repositories and read_file, but not push_commit or delete_branch." That granularity is what separates a policy from a vague intention.

Audit trail per tool call. Every invocation logged with user identity, tool name, request parameters, response, and latency. Queryable. Exportable to your SIEM. This is what makes the security team's question answerable.

Pre- and post-execution guardrails. Policy evaluated before the tool runs (should this input be allowed?) and after (does this output contain PII or secrets before it goes back into the agent's context?). This is the prompt injection mitigation — the gateway can inspect tool responses and strip or flag injected instructions before they reach the agent.

Unified credential management. Users authenticate once to the gateway. The gateway handles outbound auth to every downstream MCP server. Credentials live in a vault, not on developer machines.
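Here's a minimal sketch of how the first three pieces (identity, tool-level RBAC, audit) fit into a single request path. Everything in it is hypothetical: the `Gateway` class, the `TOKENS` and `ALLOWED_TOOLS` tables, and the token value are placeholders, not any product's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

# Hypothetical lookup tables. A real gateway would resolve these from the IdP
# and a policy store rather than from module-level dicts.
TOKENS = {"tok-alice": ("alice", "team-a")}
ALLOWED_TOOLS = {"team-a": {"search_repositories", "read_file"}}


@dataclass
class Gateway:
    tools: dict[str, Callable[[dict], str]]
    audit_log: list[dict] = field(default_factory=list)

    def call(self, token: str, tool: str, params: dict) -> str:
        # Identity: map the bearer token to a human user and their team.
        if token not in TOKENS:
            raise PermissionError("unknown or revoked token")
        user, team = TOKENS[token]

        # Tool-level RBAC: the decision is per tool, not per server.
        allowed = tool in ALLOWED_TOOLS.get(team, set())

        # Audit: record every attempt, including the denied ones.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "tool": tool,
            "params": params,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{user} may not call {tool}")
        return self.tools[tool](params)
```

With this in place, a `read_file` call from team A goes through, a `push_commit` call raises `PermissionError`, and both attempts land in `audit_log` with the user attached.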

What we actually ended up using

After the audit incident, we evaluated a few options. I'll be honest: after weighing several of them, we landed on TrueFoundry's MCP Gateway, and I can explain specifically why the architecture fit our problem.

The thing that mattered most to us was unified token management. Before the gateway, six servers meant six credential relationships per developer. With TrueFoundry, each developer gets a single Personal Access Token. The gateway maintains the mapping from that token to OAuth credentials for GitHub, Confluence, Jira, Sentry, and Datadog — and refreshes them automatically when they expire. Offboarding is one action: revoke the PAT. Done.
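The idea is easy to sketch. The code below is not TrueFoundry's implementation, just a toy model of the mapping under my own assumptions: the `CredentialBroker` class and its method names are invented, and token refresh is left out entirely.

```python
class CredentialBroker:
    """Toy model: one personal access token (PAT) fronts many downstream credentials."""

    def __init__(self) -> None:
        self._pat_owner: dict[str, str] = {}
        # Keyed by (user, server). In a real system this lives in a vault.
        self._vault: dict[tuple[str, str], str] = {}

    def issue_pat(self, pat: str, user: str) -> None:
        self._pat_owner[pat] = user

    def store(self, user: str, server: str, credential: str) -> None:
        self._vault[(user, server)] = credential

    def credential_for(self, pat: str, server: str) -> str:
        # Every downstream lookup goes through the PAT first.
        user = self._pat_owner.get(pat)
        if user is None:
            raise PermissionError("PAT is unknown or revoked")
        return self._vault[(user, server)]

    def revoke_pat(self, pat: str) -> None:
        # One action cuts off every downstream server at once.
        self._pat_owner.pop(pat, None)
```

Because the downstream credentials never leave the broker, revoking the PAT is enough. There's no Jira key sitting in an old laptop config to forget about.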

The second thing was Virtual MCP Servers. This is a concept I hadn't seen elsewhere before we set one up. Instead of exposing a full MCP server to agents (with all its tools, including the destructive ones), you define a curated logical endpoint that exposes only the tools you want a given team or agent to see. Our product engineering team's "dev tools" endpoint exposes GitHub read tools, Jira read/write, and Sentry. It does not expose the internal data API or the Datadog write tools. Those only appear in the security team's endpoint. Agents see one clean surface; the governance lives in the platform.
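Conceptually, a virtual server is an allowlist applied to the tool catalog before the agent ever sees it. A small sketch, assuming a made-up catalog: only the four GitHub tool names come from earlier in this post, while the Jira and data API tool names and the `server.tool` naming scheme are invented.

```python
# Everything the underlying servers offer (partly hypothetical tool names).
CATALOG = {
    "github": ["search_repositories", "read_file", "push_commit", "delete_branch"],
    "jira": ["get_issue", "create_issue"],
    "internal_data_api": ["run_query"],
}

# A virtual server lists, per upstream server, the tools it is willing to expose.
VIRTUAL_SERVERS = {
    "dev-tools": {
        "github": {"search_repositories", "read_file"},
        "jira": {"get_issue", "create_issue"},
    },
}


def list_tools(virtual_server: str) -> list[str]:
    """Return the tool names an agent connected to this endpoint can see."""
    exposed = VIRTUAL_SERVERS[virtual_server]
    return sorted(
        f"{server}.{tool}"
        for server, tools in CATALOG.items()
        for tool in tools
        if tool in exposed.get(server, set())
    )
```

An agent pointed at `dev-tools` never learns that `push_commit` or `run_query` exist, which is a stronger guarantee than letting it see the tool and denying the call.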

The third thing was the guardrail layer. Pre-execution checks validate tool inputs against defined policies before anything runs. Post-execution validation inspects tool responses for PII, secrets, or injected content before it reaches the agent's context. This directly addresses the Confluence prompt injection incident we'd already had.
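To give a feel for the post-execution side, here's a deliberately naive sketch. It assumes two regex rules of my own choosing: one for strings shaped like AWS access key IDs, one for a single well-known injection phrase. Real guardrails need far more than pattern matching, and `inspect_response` and `RULES` are illustrative names only.

```python
import re

# (rule name, pattern, whether to redact matches). Illustrative rules only.
RULES = [
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b"), True),
    (
        "injection_phrase",
        re.compile(r"ignore (?:all )?(?:previous|prior) instructions", re.IGNORECASE),
        False,
    ),
]


def inspect_response(text: str) -> tuple[str, list[str]]:
    """Scan a tool response before it reaches the agent's context.

    Returns the (possibly redacted) text plus the names of the rules that fired,
    so the caller can decide whether to pass it on, flag it, or block it.
    """
    findings = []
    for name, pattern, redact in RULES:
        if pattern.search(text):
            findings.append(name)
            if redact:
                text = pattern.sub("[REDACTED]", text)
    return text, findings
```

The design choice worth noting is that secrets get redacted while suspected injections only get flagged. Stripping text silently from documentation can hide the evidence a human reviewer needs.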

The performance overhead was not an issue in practice — the docs describe sub-3ms latency under load using in-memory auth and rate limiting rather than DB lookups per request. For agents making dozens of tool calls per workflow, that matters.

When a proxy is actually the right answer

I don't want to make this sound like proxies are always wrong. They're not.

You probably just need a proxy if:

You're a solo developer connecting a local AI client to one or two MCP servers in a dev environment
You're doing a proof-of-concept and governance isn't in scope yet
Your only problem is the stdio-to-HTTP transport gap: you have a local stdio server and need to expose it remotely

You need a gateway when:

More than one person is using MCP tools and you need to control who has access to what
You need an audit trail that satisfies a security team or compliance requirement
You have agents accessing sensitive internal systems and need to know what they touched
Someone leaving the team means you need to reliably cut off their tool access
You've had (or nearly had) a prompt injection incident via an MCP tool response

The honest version: most teams start with a proxy because it's the fastest path to something working. That's fine. The mistake is treating the proxy as a permanent solution when the system has already grown past what a proxy can govern.

The question worth asking now

If you have MCP tools running in your organization right now, here's the specific question I'd ask: if your security team asked "which agents invoked which tools in the last 30 days, and under whose identity," could you answer it?

If yes - great, your governance layer is working.

If no - you probably have a proxy where you need a gateway.
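For reference, once per-call audit records exist, answering that question is a few lines of filtering. This sketch assumes each record is a dict with `time` (a timezone-aware datetime), `user`, `agent`, and `tool` keys. That schema is my own invention, not a standard.

```python
from datetime import datetime, timedelta, timezone


def calls_in_window(records: list[dict], now: datetime, days: int = 30) -> list[tuple]:
    """Return distinct (user, agent, tool) triples seen in the last `days` days."""
    cutoff = now - timedelta(days=days)
    return sorted(
        {(r["user"], r["agent"], r["tool"]) for r in records if r["time"] >= cutoff}
    )
```

If all you have is a proxy, there's nothing to feed into a function like this. The records that carry a user and an agent were never written.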

Curious what others are running here. Are most people still on raw proxy setups, or has the security pressure pushed teams toward proper gateways faster than I'd expect? And has anyone dealt with prompt injection via MCP tool responses at scale? I'd love to hear what actually worked. Comments below.