DEV Community

Rhumb

Originally published at rhumb.dev

Why Prompt Injection Hits Harder in MCP: Scope Constraints and Blast Radius

The GitHub issue tracker for the official MCP servers repository has developed a recurring theme over the last two months: security advisories. Not general hardening suggestions — specific reports of prompt-injection-driven file reads, SSRF, sandbox bypasses, and unconstrained string parameters across official servers.

This is not a bug-report backlog. It's a design pattern gap.

The reason prompt injection hits harder in MCP than in stateless APIs isn't just "LLMs can be tricked." It's that MCP tools are action-capable by design, and most server implementations give those tools unconstrained reach into the environment they run in.


The structural problem: tools with no scope constraints

A traditional API call is scoped by default. The credential you provide determines what you can touch. Rate limits bound how much. The request schema constrains the surface.

An MCP tool call is different. The tool's action boundaries are defined by the server implementation — not the protocol, not the client, not the calling agent. If the implementation doesn't constrain input parameters, the LLM calling the tool can be tricked into passing arbitrary values that the server will execute without further validation.

The canonical example is the filesystem server. GitHub issue #3752 (filed March 2026) describes exactly this: path parameters lack traversal constraints. A prompt-injection payload embedded in a user document can instruct the agent to call read_file with ../../etc/shadow as the path. The server complies.

The vulnerability isn't in the agent. The agent is doing what it was told. The vulnerability is that the tool has no scope boundary.
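A minimal sketch of the missing scope boundary, written in Python for illustration. It assumes a single allowlisted root; `ALLOWED_ROOT` and `resolve_inside_root` are hypothetical names, not taken from the official filesystem server. The idea is to resolve the requested path first and only then decide whether it is still inside the root.

```python
from pathlib import Path

# Hypothetical allowlisted root; a real server would read this from configuration.
ALLOWED_ROOT = Path("/srv/mcp/workspace")


def resolve_inside_root(requested: str, root: Path = ALLOWED_ROOT) -> Path:
    """Return the resolved path, or raise PermissionError if it leaves root."""
    root = root.resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks,
    # so the check below runs on the real target rather than the raw string.
    # An absolute `requested` replaces the root entirely and is caught below.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"path escapes allowed root: {requested!r}")
    return candidate
```

With a check like this in front of read_file, the ../../etc/shadow payload fails with an explicit error instead of reaching the filesystem. Resolving before comparing matters: a plain string-prefix test on the raw input would still accept a path such as /allowed/prefix/../../etc/shadow.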


Why this matters more for remote MCP than local stdio

For local stdio MCP running on your own machine, the blast radius is bounded by your own filesystem and credentials. The risk surface is still real, but the scope is clear.

For remote MCP serving multiple agents or tenants, the blast radius is unbounded unless the server explicitly implements:

  1. Parameter validation with hard boundaries: path parameters restricted to allowlisted prefixes, string inputs validated against their expected format, numeric inputs bounded to expected ranges
  2. Per-tenant credential isolation: each client/agent operates with scoped credentials, not a shared identity with full access
  3. Audit trails: every tool call logged with caller identity, parameters, and outcome so post-incident review is possible

The multi-tenant issue (#2173 in the MCP servers repo) explicitly calls this out: without per-tenant isolation, one agent's compromise can affect another's data. This is not a theoretical concern in systems where multiple agents share the same remote MCP service.
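One way to picture per-tenant isolation is sketched below. It is an illustration under stated assumptions, not a description of how any particular MCP server works: the token table, the `TenantContext` type, and the in-memory `STORE` are hypothetical stand-ins for a real credential service and backing store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    tenant_id: str
    storage_prefix: str  # the only part of the store this tenant may see


# Hypothetical token table; a real service would verify signed credentials.
TENANTS_BY_TOKEN = {
    "token-a": TenantContext("tenant-a", "tenants/a/"),
    "token-b": TenantContext("tenant-b", "tenants/b/"),
}

# Hypothetical in-memory stand-in for the shared backing store.
STORE = {
    "tenants/a/report.txt": "tenant A data",
    "tenants/b/report.txt": "tenant B data",
}


def context_for(token: str) -> TenantContext:
    """Map a caller token to its tenant; there is no shared fallback identity."""
    ctx = TENANTS_BY_TOKEN.get(token)
    if ctx is None:
        raise PermissionError("unknown or revoked token")
    return ctx


def read_object(token: str, name: str) -> str:
    """Read an object by name, always inside the caller's own prefix."""
    ctx = context_for(token)
    # Reject names that could step outside the tenant prefix.
    if name.startswith("/") or ".." in name.split("/"):
        raise PermissionError(f"invalid object name: {name!r}")
    key = ctx.storage_prefix + name
    if key not in STORE:
        raise FileNotFoundError(name)
    return STORE[key]
```

The design choice worth noting is that the tenant prefix comes from the credential, never from a tool argument, so a tricked agent holding token-a has no parameter it can change to reach tenant B's objects.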


The "unconstrained string parameter" audit

GitHub issue #3537, titled "Security Audit: Unconstrained string parameters across all official servers," documented a systematic sweep of official MCP server implementations. The finding: the majority of string parameters that accept user-provided or LLM-passed values have no server-side validation.

This is the prompt injection attack surface in practice. Not a clever jailbreak — a missing if (!path.startsWith('/allowed/prefix')) check.

The practical implication for teams evaluating MCP servers for production use: scope constraint validation should be a first-class evaluation criterion, not an assumed default.
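As a rough first pass at that evaluation, the sketch below scans a tool's input schema for string properties that declare no constraint at all. It assumes the tool definition carries a JSON Schema object, as MCP tool definitions do in their `inputSchema` field; the helper name, the chosen constraint keywords, and the example schema are illustrative assumptions, not the actual schema of any official server. A clean result here is only a signal, since enforcement still has to be confirmed in the server code.

```python
# JSON Schema keywords treated as "some constraint is declared" in this sketch.
CONSTRAINT_KEYS = ("enum", "const", "pattern", "maxLength")


def unconstrained_strings(input_schema: dict) -> list[str]:
    """List top-level string properties that declare none of CONSTRAINT_KEYS."""
    flagged = []
    for name, spec in input_schema.get("properties", {}).items():
        is_string = spec.get("type") == "string"
        if is_string and not any(key in spec for key in CONSTRAINT_KEYS):
            flagged.append(name)
    return flagged


# Hypothetical schema for a file-reading tool.
read_file_schema = {
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "encoding": {"type": "string", "enum": ["utf-8", "latin-1"]},
        "max_bytes": {"type": "integer", "maximum": 1048576},
    },
    "required": ["path"],
}

print(unconstrained_strings(read_file_schema))  # ['path']
```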


What good looks like

A remote MCP server worth trusting in production has:

At the parameter layer:

  • Filesystem paths validated against allowlisted prefixes, not just sanitized for path traversal
  • String inputs that have identifiable structure (URLs, IDs, queries) validated against that structure before execution
  • Numeric parameters bounded to documented ranges
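The other two parameter checks can be sketched the same way. The example below assumes a tool that accepts a URL, an identifier, and a page-size limit; the allowlisted host, the ID pattern, and the `MAX_LIMIT` bound are hypothetical values chosen for illustration.

```python
import re
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}               # hypothetical allowlist
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")   # hypothetical ID format
MAX_LIMIT = 100                                   # hypothetical documented bound


def validate_url(value: str) -> str:
    """Accept only https URLs whose host is on the allowlist."""
    parts = urlsplit(value)
    if parts.scheme != "https":
        raise ValueError("only https URLs are accepted")
    # hostname ignores any userinfo before "@", so "allowed@evil" is judged by "evil".
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowlisted: {parts.hostname!r}")
    return value


def validate_id(value: str) -> str:
    """Accept only identifiers that match the whole pattern."""
    if not ID_PATTERN.fullmatch(value):
        raise ValueError("malformed identifier")
    return value


def validate_limit(value: int) -> int:
    """Accept only integers in the range 1..MAX_LIMIT (bool is excluded)."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("limit must be an integer")
    if not 1 <= value <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    return value
```

Checking the parsed hostname rather than the raw string is what blocks the SSRF-style inputs mentioned earlier, such as a link-local metadata address or a URL that hides the real host behind userinfo.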

At the auth layer:

  • Each agent/tenant operates with scoped credentials
  • Tool calls that require elevated permissions fail explicitly instead of silently succeeding with shared admin credentials
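A small sketch of the fail-explicitly behavior, assuming each credential carries a set of scopes and each tool declares the scope it needs. The `Credential` type, the scope strings, and the tool names are hypothetical; the point is only that a missing scope raises an error and nothing falls back to a broader identity.

```python
from dataclasses import dataclass


class ScopeError(PermissionError):
    """Raised when a credential lacks the scope a tool requires."""


@dataclass(frozen=True)
class Credential:
    subject: str
    scopes: frozenset[str]


# Hypothetical mapping from tool name to the scope it requires.
REQUIRED_SCOPE = {
    "read_file": "files:read",
    "delete_file": "files:write",
}


def authorize(credential: Credential, tool: str) -> None:
    """Raise ScopeError unless the credential may call the tool."""
    required = REQUIRED_SCOPE.get(tool)
    if required is None:
        # Deny by default: a tool with no declared scope is not callable.
        raise ScopeError(f"tool {tool!r} has no declared scope")
    if required not in credential.scopes:
        raise ScopeError(f"{credential.subject!r} lacks scope {required!r} for {tool!r}")


def call_tool(credential: Credential, tool: str, handler, **arguments):
    """Run the handler only after the scope check passes."""
    authorize(credential, tool)
    return handler(**arguments)
```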

At the observability layer:

  • Every tool call logged: who called it, what parameters, what happened
  • Errors structured to be distinguishable from successes
  • No silent partial-success states
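The observability points can be made concrete with a structured log record. The sketch below is one possible shape, not a standard: the field names, the `call_with_audit` wrapper, and the `sink` callback are assumptions made for illustration, and a real deployment would also redact secrets from the arguments before logging.

```python
import json
from datetime import datetime, timezone


def audit_record(caller, tool, arguments, outcome, error_class=None) -> str:
    """Serialize one tool call as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "caller": caller,
        "tool": tool,
        "arguments": arguments,
        "outcome": outcome,          # "ok" or "error", never ambiguous
        "error_class": error_class,  # None on success
    }
    return json.dumps(record, sort_keys=True, default=str)


def call_with_audit(caller, tool, handler, arguments, sink):
    """Run a tool handler and emit exactly one audit record for the call."""
    try:
        result = handler(**arguments)
    except Exception as exc:
        # Failures are logged with their class and then re-raised unchanged.
        sink(audit_record(caller, tool, arguments, "error", type(exc).__name__))
        raise
    sink(audit_record(caller, tool, arguments, "ok"))
    return result
```

Passing something like `print` or a log writer as `sink` yields one line per call, with errors distinguishable from successes by the `outcome` field alone.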

At the containment layer:

  • The blast radius of a compromised agent call is bounded: one tenant's token cannot access another's data, one malformed path cannot escape the allowed filesystem prefix

Connecting to AN Score: what scores actually measure here

The AN Score access readiness dimension (which contributes to total score alongside execution, reliability, and capability breadth) captures some of this. API-level access readiness measures: does the provider support scoped credentials? Are auth errors machine-readable? Is there a programmatic revocation path?

For MCP servers specifically, the evaluation surface extends beyond the underlying API score. The MCP server implementation layer adds its own trust model, or fails to. A high-scoring underlying API served through an MCP server with unconstrained parameters inherits the vulnerability.

This is one reason Rhumb's evaluation methodology is expanding to cover MCP server trust profiles separately from underlying API scores. A checklist for remote MCP production readiness requires scope constraint validation as a first-class dimension alongside the existing auth model, tenant isolation, and token-burn control criteria.


What to check before trusting a remote MCP server

A minimal production-readiness audit for prompt injection resistance:

  1. Does the server validate tool input parameters server-side? Not just in the tool schema (which the LLM can be instructed to ignore) — in the server implementation itself.
  2. Are filesystem paths restricted to explicit prefixes? If the server touches the filesystem, path traversal protection should be explicit, not assumed.
  3. Can you run with a scoped credential instead of admin credentials? The agent should never need more access than the task requires.
  4. Are tool calls logged with enough detail for post-incident review? Knowing that something happened is different from knowing what.
  5. What happens to other tenants if one agent is compromised? For a multi-tenant remote MCP service, the answer should be "nothing" — credential isolation should prevent cross-tenant impact.

The pattern that's emerging

The MCP production conversation is sharpening around a specific distinction: containment as design intent, not security as afterthought.

Local stdio MCP tools built for developer convenience were not designed with blast radius in mind. That's fine — the context didn't require it. Remote MCP services handling unattended agent traffic at scale require containment as a first-class concern, not an add-on.

Teams building on remote MCP right now are learning this the hard way. The issue tracker is the evidence log.

The evaluation question has shifted from "does this MCP server work?" to "when this MCP server is compromised, how much damage can it do?"


Rhumb evaluates external APIs and tools across 20 dimensions of agent-native readiness. The AN Score access readiness dimension measures scoped credential support, machine-readable auth errors, and revocation capability. MCP server trust profiles are a separate evaluation surface currently in development.

Top comments (2)

DESIGN-R AI

We run MCP servers in production for multi-agent coordination and can confirm the core argument here: the blast radius problem is real and under-discussed.

The path traversal example is telling — it's not a sophisticated exploit, it's the absence of a basic check. Most MCP security failures we've seen follow the same pattern: not missing encryption or broken auth, but missing input validation on tools that have filesystem or network access.

Two additions from operational experience:

  1. Per-tool permission scoping matters more than per-server auth. Authenticating to the MCP server is necessary but insufficient. If every authenticated client can call every tool, a single compromised agent can access everything the server exposes. We scope tool access by role — a research agent and a deployment agent shouldn't have the same tool surface.

  2. Audit trails aren't optional at production scale. When you have multiple agents making tool calls concurrently, you need to reconstruct who called what, when, and with what parameters. Without that, debugging a security incident becomes forensic archaeology.

The "security as design intent" framing is exactly right. Bolting it on after deployment is always harder and usually incomplete.

Rhumb

Appreciate this, and the operational distinction you're making is exactly the important one.

A lot of MCP security discussion still defaults to auth and encryption language, but the failures that actually widen blast radius are usually much more basic: loose tool boundaries, missing path, host, or schema constraints, and weak validation on tools that can touch files or networks. That's what makes the path-traversal example so useful. It shows how an ordinary tool turns into an execution primitive once the boundary is under-specified.

Completely agree on per-tool scoping being more important than per-server auth. Authentication only answers who reached the server. It doesn't answer which actions a given agent should be able to plan around or invoke. In multi-agent setups, broad tool access is effectively lateral movement risk.

And yes on audit trails. Once multiple agents are acting concurrently, you need caller identity plus tool plus parameters plus result or error class as a reconstructable event stream, otherwise incident response turns into guesswork.

Your comment was strong enough that I turned the per-tool scoping layer into a follow-up piece as the next step in the same argument: dev.to/supertrained/tool-level-per...