Why Prompt Injection Hits Harder in MCP: Scope Constraints and Blast Radius
The GitHub issue tracker for the official MCP servers repository has developed a recurring theme over the last two months: security advisories. Not general hardening suggestions — specific reports of prompt-injection-driven file reads, SSRF, sandbox bypasses, and unconstrained string parameters across official servers.
This is not a bug-report backlog. It's a design pattern gap.
The reason prompt injection hits harder in MCP than in stateless APIs isn't just "LLMs can be tricked." It's that MCP tools are action-capable by design, and most server implementations give those tools unconstrained reach into the environment they run in.
The structural problem: tools with no scope constraints
A traditional API call is scoped by default. The credential you provide determines what you can touch. Rate limits bound how much. The request schema constrains the surface.
An MCP tool call is different. The tool's action boundaries are defined by the server implementation, not by the protocol, the client, or the calling agent. If the implementation doesn't constrain input parameters, the LLM calling the tool can be tricked into passing arbitrary values, and the server will act on them without further validation.
The canonical example is the filesystem server. GitHub issue #3752 (filed March 2026) describes exactly this: path parameters lack traversal constraints. A prompt-injection payload embedded in a user document can instruct the agent to call read_file with ../../etc/shadow as the path. The server complies.
The vulnerability isn't in the agent. The agent is doing what it was told. The vulnerability is that the tool has no scope boundary.
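As a minimal sketch of the missing scope boundary, the TypeScript below resolves a requested path against a single allowed root and rejects anything that lands outside it. The root /srv/mcp/workspace and the function name resolveWithinRoot are assumptions made for illustration, not code from the official filesystem server, and the sketch does not follow symlinks.

```typescript
import * as path from "node:path";

// Hypothetical allowlist root for this sketch; a real server would read it from configuration.
const ALLOWED_ROOT = "/srv/mcp/workspace";

/**
 * Resolve a caller-supplied path against the allowed root and reject anything
 * that lands outside it. Symlinks are not followed here; a real server would
 * likely also compare fs.realpath() results before opening the file.
 */
function resolveWithinRoot(requested: string, root: string = ALLOWED_ROOT): string {
  const normalizedRoot = path.resolve(root);
  // path.resolve collapses ".." segments and lets an absolute `requested` override the root.
  const resolved = path.resolve(normalizedRoot, requested);
  // Compare against root + separator so a sibling such as "/srv/mcp/workspace-evil" does not pass.
  if (resolved !== normalizedRoot && !resolved.startsWith(normalizedRoot + path.sep)) {
    throw new Error(`path escapes allowed root: ${requested}`);
  }
  return resolved;
}
```

With a check like this in front of the file read, resolveWithinRoot("../../etc/shadow") throws before any file is opened, while resolveWithinRoot("notes/todo.md") returns a path under the root.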
Why this matters more for remote MCP than local stdio
For local stdio MCP running on your own machine, the blast radius is bounded by your own filesystem and credentials. The risk surface is still real, but the scope is clear.
For remote MCP serving multiple agents or tenants, the blast radius is unbounded unless the server explicitly implements:
- Parameter validation with hard boundaries: path parameters restricted to allowlisted prefixes, string inputs validated against their expected structure, numeric inputs bounded to expected ranges
- Per-tenant credential isolation: each client/agent operates with scoped credentials, not a shared identity with full access
- Audit trails: every tool call logged with caller identity, parameters, and outcome so post-incident review is possible
The multi-tenant issue (#2173 in the MCP servers repo) explicitly calls this out: without per-tenant isolation, one agent's compromise can affect another's data. This is not a theoretical concern in systems where multiple agents share the same remote MCP service.
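One way to picture per-tenant isolation is a credential lookup that only ever returns the calling tenant's own scoped credential and has no shared fallback. The sketch below assumes hypothetical names throughout (TenantCredentialStore, ScopedCredential, scope strings such as files:read); none of them come from the MCP specification or an SDK.

```typescript
// Hypothetical types for illustration; not part of the MCP specification or any SDK.
type TenantId = string;

interface ScopedCredential {
  tenantId: TenantId;
  token: string;
  scopes: ReadonlySet<string>;
}

class TenantCredentialStore {
  private readonly byTenant = new Map<TenantId, ScopedCredential>();

  register(cred: ScopedCredential): void {
    this.byTenant.set(cred.tenantId, cred);
  }

  // Return the caller's own credential, and only if it carries the scope the tool needs.
  // There is deliberately no fallback to a shared or admin identity.
  credentialFor(tenantId: TenantId, requiredScope: string): ScopedCredential {
    const cred = this.byTenant.get(tenantId);
    if (cred === undefined) {
      throw new Error(`unknown tenant: ${tenantId}`);
    }
    if (!cred.scopes.has(requiredScope)) {
      throw new Error(`tenant ${tenantId} lacks scope ${requiredScope}`);
    }
    return cred;
  }
}

// Example setup with made-up tenants and tokens.
const store = new TenantCredentialStore();
store.register({ tenantId: "tenant-a", token: "token-a", scopes: new Set(["files:read"]) });
store.register({ tenantId: "tenant-b", token: "token-b", scopes: new Set(["files:read", "files:write"]) });
```

In this setup a compromised agent acting as tenant-a can obtain at most token-a with files:read; a request for files:write fails explicitly instead of borrowing another identity.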
The "unconstrained string parameter" audit
GitHub issue #3537, titled "Security Audit: Unconstrained string parameters across all official servers," documented a systematic sweep of official MCP server implementations. The finding: the majority of string parameters that accept user-provided or LLM-passed values have no server-side validation.
This is the prompt injection attack surface in practice. Not a clever jailbreak, but a missing if (!path.startsWith('/allowed/prefix')) check, applied to the resolved path and not to the raw string the caller supplied.
The practical implication for teams evaluating MCP servers for production use: scope constraint validation should be a first-class evaluation criterion, not an assumed default.
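To make "constrained string parameter" concrete, here is a hedged sketch of three server-side validators: one for URLs, one for identifiers, one for numeric limits. The allowlisted host api.example.com, the identifier pattern, and the 1 to 100 range are all placeholder assumptions; a real server would derive them from its own configuration and tool contracts.

```typescript
// Hypothetical allowlist and limits for this sketch; real values belong in server configuration.
const ALLOWED_HOSTS = new Set(["api.example.com"]);
const ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Accept only https URLs whose host is on the allowlist (a basic SSRF guard).
function validateUrl(input: string): URL {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    throw new Error("not a valid absolute URL");
  }
  if (url.protocol !== "https:") {
    throw new Error(`scheme not allowed: ${url.protocol}`);
  }
  if (!ALLOWED_HOSTS.has(url.hostname)) {
    throw new Error(`host not allowed: ${url.hostname}`);
  }
  return url;
}

// Accept only identifiers made of a known character set with a bounded length.
function validateId(input: string): string {
  if (!ID_PATTERN.test(input)) {
    throw new Error("identifier has unexpected characters or length");
  }
  return input;
}

// Accept only integers inside a documented range.
function validateLimit(input: number, min = 1, max = 100): number {
  if (!Number.isInteger(input) || input < min || input > max) {
    throw new Error(`limit must be an integer in [${min}, ${max}]`);
  }
  return input;
}
```

The host check compares the parsed hostname, not a substring of the raw input, so lookalikes such as https://api.example.com.evil.test/ or https://api.example.com@evil.test/ are rejected.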
What good looks like
A remote MCP server worth trusting in production has:
At the parameter layer:
- Filesystem paths validated against allowlisted prefixes, not just sanitized for path traversal
- String inputs that have identifiable structure (URLs, IDs, queries) validated against that structure before execution
- Numeric parameters bounded to documented ranges
At the auth layer:
- Each agent/tenant operates with scoped credentials
- Tool calls that require elevated permissions fail explicitly instead of silently succeeding with shared admin credentials
At the observability layer:
- Every tool call logged: who called it, what parameters, what happened
- Errors structured to be distinguishable from successes
- No silent partial-success states
At the containment layer:
- The blast radius of a compromised agent call is bounded: one tenant's token cannot access another's data, one malformed path cannot escape the allowed filesystem prefix
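The observability items above can be sketched as a thin wrapper around a tool handler. This is an illustration under stated assumptions: the ToolResult and AuditRecord shapes, the TOOL_FAILED code, and the callWithAudit name are invented for the example, and the handler is kept synchronous for brevity even though real tool handlers are usually asynchronous.

```typescript
// Hypothetical shapes for this sketch; not taken from the MCP SDK.
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: { code: string; message: string } };

interface AuditRecord {
  timestamp: string;
  caller: string;
  tool: string;
  params: unknown;
  outcome: "success" | "error";
  errorCode?: string;
}

// Wrap a tool handler so that both successful and failed calls emit an audit record,
// and so that failures come back as a structured error instead of a bare exception.
function callWithAudit<P, T>(
  caller: string,
  tool: string,
  params: P,
  handler: (params: P) => T,
  sink: (record: AuditRecord) => void
): ToolResult<T> {
  const base = { timestamp: new Date().toISOString(), caller, tool, params };
  let value: T;
  try {
    value = handler(params);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    sink({ ...base, outcome: "error", errorCode: "TOOL_FAILED" });
    return { ok: false, error: { code: "TOOL_FAILED", message } };
  }
  sink({ ...base, outcome: "success" });
  return { ok: true, value };
}
```

Because the result is a discriminated union on ok, a caller cannot read value without first checking that the call succeeded, which rules out treating an error payload as a success.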
Connecting to AN Score: what scores actually measure here
The AN Score access readiness dimension (which contributes to the total score alongside execution, reliability, and capability breadth) captures some of this. At the API level, access readiness asks: does the provider support scoped credentials? Are auth errors machine-readable? Is there a programmatic revocation path?
For MCP servers specifically, the evaluation surface extends beyond the underlying API score. The MCP server implementation layer adds its own trust model, or fails to. A high-scoring underlying API served through an MCP server with unconstrained parameters is still exposed to the vulnerability.
This is one reason Rhumb's evaluation methodology is expanding to cover MCP server trust profiles separately from underlying API scores. A checklist for remote MCP production readiness requires scope constraint validation as a first-class dimension alongside the existing auth model, tenant isolation, and token-burn control criteria.
What to check before trusting a remote MCP server
A minimal production-readiness audit for prompt injection resistance:
- Does the server validate tool input parameters server-side? Not just in the tool schema (which the LLM can be instructed to ignore) — in the server implementation itself.
- Are filesystem paths restricted to explicit prefixes? If the server touches the filesystem, path traversal protection should be explicit, not assumed.
- Can you run with a scoped credential instead of admin credentials? The agent should never need more access than the task requires.
- Are tool calls logged with enough detail for post-incident review? Knowing that something happened is different from knowing what.
- What happens to other tenants if one agent is compromised? For a multi-tenant remote MCP service, the answer should be "nothing" — credential isolation should prevent cross-tenant impact.
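The first two checks lend themselves to a small probe: feed a path-taking tool a handful of hostile inputs and list the ones it accepted. The sketch below is an assumption-heavy illustration. The probe list is a starting point only, and handler is a hypothetical synchronous stand-in for however you actually invoke the tool under review; it is expected to throw when the server rejects the input.

```typescript
// Hostile inputs for a path-taking tool; extend the list for the server under review.
const TRAVERSAL_PROBES: readonly string[] = [
  "../../etc/passwd",
  "/etc/passwd",
  "workspace/../../secret",
  "..\\..\\windows\\win.ini",
];

// Return every probe the handler accepted. An empty array means all probes were rejected.
// `handler` is a hypothetical stand-in for a real tool invocation.
function acceptedProbes(
  handler: (requestedPath: string) => void,
  probes: readonly string[] = TRAVERSAL_PROBES
): string[] {
  return probes.filter((probe) => {
    try {
      handler(probe);
      return true; // accepted: the server did not reject a hostile path
    } catch {
      return false; // rejected, which is the desired behavior
    }
  });
}
```

A non-empty result is a concrete finding to raise with the server's maintainers; an empty result only shows that these particular probes were rejected, not that the server is safe.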
The pattern that's emerging
The MCP production conversation is sharpening around a specific distinction: containment as design intent, not security as an afterthought.
Local stdio MCP tools built for developer convenience were not designed with blast radius in mind. That's fine — the context didn't require it. Remote MCP services handling unattended agent traffic at scale require containment as a first-class concern, not an add-on.
Teams building on remote MCP right now are learning this the hard way. The issue tracker is the evidence log.
The evaluation question has shifted from "does this MCP server work?" to "when this MCP server is compromised, how much damage can it do?"
Related reading
- A Production Readiness Checklist for Remote MCP Servers: the 7-dimension checklist (including auth model, scope constraints, tenant isolation, governors, recoverability, and auditability)
- How APIs Fail When Agents Use Them: A Failure Engineering Guide: failure mode taxonomy for production agent systems
- The Complete Guide to API Selection for AI Agents: full portfolio navigation hub
Rhumb evaluates external APIs and tools across 20 dimensions of agent-native readiness. The AN Score access readiness dimension measures scoped credential support, machine-readable auth errors, and revocation capability. MCP server trust profiles are a separate evaluation surface currently in development.