MCP feels easy until it isn't. The first time you wire up a stdio server and call a tool from a Claude Agent SDK loop, the whole thing fits on a slide. Then you put it in front of customer codebases, customer GitHub credentials, customer build containers, and the sharp edges show up in places the spec is silent on. Tools start shadowing each other. The agent confidently uses a built-in Read when you wanted it to go through your sandboxed file server. Environment variables you set on the parent process reappear inside the MCP child and become tokens-in-prompts.
I'm building a SaaS that uses MCP heavily across a few different services (Codens, an AI dev harness with several specialized agents — happy to talk about it but it isn't the point of this post). Across those services we have GitHub MCPs for repo reads, an in-process Playwright MCP for browser exploration, and per-workspace local-file MCPs that let an agent navigate a cloned repo without escaping the sandbox. Three failure modes have shown up enough to rewrite the way we build this stuff.
Below is each one with the code we actually ship, not a hypothetical fix.
Failure 1: built-in tools shadow your MCP tools, silently
The first time it bit us was in the test-case generation flow. The agent was supposed to be reading files only through a workspace-scoped MCP server — a thin wrapper that resolves paths, refuses to escape the workspace root, and truncates large files. Standard sandbox stuff.
@tool(
    "read_file",
    "Read file content from local workspace. Provide the relative file path.",
    {"file_path": str},
)
async def read_file_tool(args: dict[str, Any]) -> dict[str, Any]:
    file_path = args["file_path"]
    if file_path.startswith("/"):
        file_path = file_path[1:]
    full_path = workspace_path / file_path
    # Boundary check first, so probing an outside path can't even learn
    # whether it exists.
    try:
        full_path.resolve().relative_to(workspace_path.resolve())
    except ValueError:
        return {"status": "error", "message": "Access denied: path outside workspace"}
    if not full_path.exists():
        return {"status": "error", "message": f"File not found: {file_path}"}
    # ... read + truncate ...
That relative_to check is the whole point of the wrapper. The agent should not be able to read /etc/passwd, or /app/... (where our backend code lives in the container), or anything else outside the cloned customer repo. The MCP server enforces the boundary; the agent only ever sees what the tool returns.
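The resolve-then-relative_to pattern is easy to verify in isolation. A minimal, standalone sketch (the function name and the paths here are illustrative, not the server code itself):

```python
from pathlib import Path

def is_inside_workspace(workspace: Path, relative: str) -> bool:
    """True only if workspace/relative resolves to a path under workspace."""
    candidate = (workspace / relative.lstrip("/")).resolve()
    try:
        candidate.relative_to(workspace.resolve())
        return True
    except ValueError:
        return False

ws = Path("/tmp/demo-workspace")
print(is_inside_workspace(ws, "src/main.py"))       # True
print(is_inside_workspace(ws, "../../etc/passwd"))  # False
```

Because resolve() collapses the ".." components before the relative_to comparison, traversal sequences can't smuggle the candidate path outside the workspace root.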
The bug: the Claude Agent SDK ships built-in tools — Read, Write, Edit, Glob, Grep, Bash — and unless you tell it otherwise, those built-ins are also available. The agent had two ways to read a file: the sandboxed mcp__local__read_file, and the unsandboxed built-in Read. We passed allowed_tools listing the MCP tools by full name and assumed that was an allowlist. It wasn't tight enough — built-ins kept being chosen.
The first sign was test cases that referenced files the agent couldn't possibly have seen if it was only reading from the customer's workspace. It was reading our own backend code from /app/ because the container Python path was sitting right there, and the built-in Read tool happily walked into it.
The fix is two-part, and both parts matter:
options = ClaudeAgentOptions(
    model=self._model,
    max_turns=40,
    system_prompt=system_prompt,
    cwd=str(self._workspace_path),
    mcp_servers={"local": self._mcp_server},
    allowed_tools=[
        "mcp__local__read_file",
        "mcp__local__list_directory",
        "mcp__local__get_file_structure",
        "mcp__local__get_multiple_files",
    ],
    disallowed_tools=[
        "Read",
        "Write",
        "Edit",
        "Glob",
        "Grep",
        "Bash",
        "MultiEdit",
        "NotebookEdit",
    ],
)
allowed_tools is necessary but not sufficient. The agent treats it as "these are allowed", not as "these are the only tools that exist." If a built-in is available and not explicitly in disallowed_tools, it stays in the toolbox. We also reinforce the boundary in the system prompt with text the model actually pays attention to:
=== CRITICAL FILE ACCESS RULES ===
1. You MUST ONLY use these MCP tools to access files:
- mcp__local__read_file
- mcp__local__list_directory
- mcp__local__get_file_structure
- mcp__local__get_multiple_files
2. NEVER use built-in tools like Read, Glob, Bash, or Grep
3. NEVER read files from /app/ - that is a different application
4. ALL file paths you read should be relative to the workspace
Note point 3 — explicit-path negation. We learned that one the hard way. The model has a tendency to "explore" by trying common system paths if it's confused about the workspace. Telling it NEVER read files from /app/ (the container path where our service actually runs) is dumb-looking but absolutely necessary to stop the model from leaking implementation details about the host into its analysis output.
Setting cwd=str(self._workspace_path) is the third leg. Without it, when a built-in did slip through during early debugging, it ran with the SDK's default working directory, which made the boundary violation worse. With cwd set, even an accidental tool use is at least scoped to the right tree.
If I were starting again, I'd treat the agent's tool list as a default-deny zone from line one. The mental model "built-ins exist; my MCP tools are added on top" is wrong for any production agent that needs sandboxing. Default-deny everything, then pass allowed_tools listing exactly the MCP tool names you want, then triple-belt-and-suspenders by enumerating the built-ins in disallowed_tools so a future SDK version that adds a new built-in (Read2, Search, etc.) doesn't quietly enable it for you. The cost of the extra few lines is nothing. The cost of an agent reading the wrong filesystem in production is everything.
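That default-deny posture can live in one small helper instead of being re-typed per agent. A sketch, under the assumption that the built-in names are exactly the ones enumerated above (a future SDK version may add more, which is the point of centralizing the list):

```python
# Built-in tool names we enumerate today; a future SDK version may add
# more, so treat this list as a starting point, not a guarantee.
KNOWN_BUILTINS = [
    "Read", "Write", "Edit", "Glob", "Grep", "Bash",
    "MultiEdit", "NotebookEdit",
]

def default_deny_tools(server: str, tools: list[str]) -> dict[str, list[str]]:
    """Only the named MCP tools are allowed; every known built-in is
    explicitly disallowed, belt and suspenders."""
    return {
        "allowed_tools": [f"mcp__{server}__{t}" for t in tools],
        "disallowed_tools": list(KNOWN_BUILTINS),
    }

opts = default_deny_tools("local", ["read_file", "list_directory"])
# opts["allowed_tools"] == ["mcp__local__read_file", "mcp__local__list_directory"]
```

When the SDK adds a new built-in, there is exactly one list to update, and every agent that uses the helper picks up the denial.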
Failure 2: tool name collisions across MCP servers, with no useful error
Once you have more than one MCP server in the same agent loop, you find out fast that "the same tool name in different servers" is a thing. Our regression test generator runs two MCPs simultaneously — a local server for code analysis and a playwright server for browser interaction:
mcp_servers: dict[str, Any] = {"playwright": playwright_server}
allowed_tools = [
    "mcp__playwright__browser_navigate",
    "mcp__playwright__browser_snapshot",
    "mcp__playwright__browser_get_links",
    "mcp__playwright__browser_click",
    "mcp__playwright__browser_type",
    "mcp__playwright__browser_wait",
    "mcp__playwright__browser_get_visited_pages",
    "mcp__playwright__browser_console_messages",
]
if self._local_mcp_server:
    mcp_servers["local"] = self._local_mcp_server
    allowed_tools += [
        "mcp__local__read_file",
        "mcp__local__list_directory",
        "mcp__local__get_file_structure",
    ]
options = ClaudeAgentOptions(
    model=self._model,
    max_turns=60,
    system_prompt=system_prompt,
    mcp_servers=mcp_servers,
    allowed_tools=allowed_tools,
)
Notice the prefixing convention: every tool name is mcp__<server>__<tool>. That's not cosmetic. It is the difference between "the agent knows which server to call" and "two read_file tools shadow each other and you find out at runtime which one it picks."
The collision case for us was real. Across our services, several different MCP servers each define a read_file tool — one for the workspace, one for GitHub repo reads, one inside the GitHub-tools server we use for fix generation:
# github-tools server: reads from a remote repo via the GitHub API
@tool(
    "read_file",
    "Read file content from GitHub repository",
    {"owner": str, "repo": str, "file_path": str, "branch": str},
)
async def read_file_tool(args: Dict[str, Any]) -> Dict[str, Any]:
    content = await github_client.get_file_content(...)

# local server: reads from the cloned workspace
@tool(
    "read_file",
    "Read file content from local workspace. Provide the relative file path.",
    {"file_path": str},
)
async def read_file_tool(args: dict[str, Any]) -> dict[str, Any]:
    file_path = args["file_path"]
    full_path = workspace_path / file_path
Same name, different schemas, different intent. If both servers are mounted into the same agent loop and you list the tools as bare read_file, the SDK has to pick one, and which one it picks is not the kind of thing you want the LLM's prompt-time context to silently determine. Even worse, the agent will eventually try to call the other one and pass the wrong arg shape — {"owner": ..., "repo": ..., "file_path": ...} against the local server's {"file_path": ...} schema — and you get either a tool error the agent retries past, or, worse, an opaque "tool not found" that the agent reasons around by giving up on file access entirely.
The fix is to never refer to MCP tools by bare name. Always use mcp__<server>__<tool>. The system prompt gets paranoid too:
ONLY use mcp__local__ tools. Start with get_file_structure.
Use BOTH mcp__local__ (code analysis) and mcp__playwright__ (browser) tools.
The leading mcp__local__ prefix shows up in the system prompt because the model picks tools partly from the prompt's pattern matching. Saying "use the read_file tool" leaves it ambiguous; saying "use mcp__local__read_file" tells it both the server and the operation, and the model is much less likely to pick a built-in or the wrong server's tool.
There's a deeper point here about how you scope an MCP server. We deliberately scope by concern, not by protocol: there's a github server (operations against the GitHub API), a local server (operations against a cloned workspace), and a playwright server (operations against a live browser). Within each concern, tool names are short and don't repeat across concerns. We do not have one mega-server with 30 tools, because at that point the system prompt has to disambiguate which tool to use within the same namespace, and that's a much harder prompt-engineering problem than just naming the servers right.
If I were starting again: pick a strict naming convention for your MCP servers (we use lowercase singular nouns: local, github, playwright, appium) and a strict prefix for every tool reference, both in allowed_tools and in the system prompt. Treat any unprefixed tool name in your code or prompts as a code smell. The 10 minutes you spend writing a small lint rule that flags read_file without mcp__<server>__ will save you hours of debugging "why did the agent call the wrong server" in production.
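Such a lint rule really is small. A sketch, relying on the fact that the underscores in mcp__<server>__<tool> are word characters, so a lookbehind that rejects a preceding word character only lets bare, unprefixed references match (the tool inventory here is illustrative):

```python
import re

# Tool inventory to check for; illustrative, not exhaustive.
MCP_TOOLS = {"read_file", "list_directory", "get_file_structure"}

def lint_tool_refs(text: str) -> list[str]:
    """Return bare tool names referenced without an mcp__<server>__ prefix.
    In mcp__local__read_file the underscore before the tool name is a word
    character, so the lookbehind fails there and only bare uses match."""
    findings = set()
    for tool in MCP_TOOLS:
        if re.search(rf"(?<![\w.]){tool}\b", text):
            findings.add(tool)
    return sorted(findings)

print(lint_tool_refs("use read_file then mcp__local__list_directory"))
# ['read_file']
```

Run it over prompt templates and any file that builds allowed_tools, and fail CI on a non-empty result.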
Failure 3: the parent process's environment bleeds into the MCP server
This one is the most interesting because the bug isn't in the MCP transport — it's in how MCP server subprocesses (and even in-process tools that hit external APIs) inherit the parent's process environment.
Concretely: our automatic-fix flow runs against customer repositories. Some customers configure their own AWS credentials so the fix-generation agent can run their tests in a sandbox. Those credentials get loaded by the use case and passed through to the SDK as extra env vars:
# generate_fix_use_case.py
extra_env: Dict[str, str] = {}
if project.credential_set_id:
    extra_env = await self._credential_use_case.get_env_vars_for_project(
        project_id=project.id,
    )
if "AWS_DEFAULT_PROFILE" not in extra_env and default_profile:
    extra_env["AWS_PROFILE"] = default_profile
if extra_env:
    logger.info(
        f"Injecting {len(extra_env)} extra env vars for project "
    )
So far so good. The agent needs AWS_* to talk to the customer's AWS, and we pass them in via extra_env. The bug is what happens when this collides with our own ANTHROPIC_API_KEY.
The naive way to merge extra env into the agent's environment — and the way we wrote it the first time — looks like this:
# WRONG. extra_env can overwrite ANTHROPIC_API_KEY.
env = {
    "ANTHROPIC_API_KEY": settings.ANTHROPIC_API_KEY.get_secret_value(),
    **(extra_env or {}),
}
options = ClaudeAgentOptions(model=self._model, env=env)
Spread order matters. With extra_env last, anything in extra_env overrides keys we set first. If a customer's credentials bundle happens to contain an ANTHROPIC_API_KEY (for whatever reason — they were experimenting, they have an internal Bedrock+Anthropic setup, they pasted the wrong env block into the credentials UI), that key now silently replaces ours in the agent's process. We'd be billing the agent's work to the wrong key, the customer would be billing the agent's work to their key, or — depending on what the value actually was — the SDK would fail authentication entirely and we'd see "401" errors that look like quota issues but aren't.
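The underlying dict semantics are worth pinning down with a toy example: in a dict literal, the later spread wins.

```python
system = {"ANTHROPIC_API_KEY": "ours"}
customer = {"AWS_PROFILE": "cust", "ANTHROPIC_API_KEY": "theirs"}

wrong = {**system, **customer}  # customer spread last: our key is clobbered
right = {**customer, **system}  # system spread last: our key survives

print(wrong["ANTHROPIC_API_KEY"])  # theirs
print(right["ANTHROPIC_API_KEY"])  # ours
```

Nothing in either line looks dangerous in review; the only difference is which dict is written second.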
The fix is one line, but only because we now know it's a one-line fix:
# Create agent options — extra_env is merged first so ANTHROPIC_API_KEY cannot
# be overridden by user-supplied credentials
env = {
    **(extra_env or {}),
    "ANTHROPIC_API_KEY": settings.ANTHROPIC_API_KEY.get_secret_value(),
}
options = ClaudeAgentOptions(
    model=self._model,
    max_turns=10,
    env=env,
)
extra_env first, our key last. Now the customer's env can't shadow our key, no matter what they put in their credentials bundle. The comment on top of every one of these blocks (we have four of them across analyze_error, generate_fix, improve_fix_with_test_feedback, and analyze_and_fix_error_with_agent) is verbatim the same text, because if anyone ever "cleans up" the merge order in a future refactor, they need a giant flag in the diff that says no, this is load-bearing.
There's a wider principle here that's worth saying directly: anything you let into an MCP child process's environment is data you've handed to a place the LLM can read indirectly. If the MCP server prints diagnostics with env vars in them, those go to logs the model can see in some configurations. If the MCP server reads os.environ to pick up credentials and accidentally reflects them in an error message, those land in the tool result the model receives. Treat the env passed to an MCP server like you'd treat the request body of an internal API: minimum scope, validated keys, no secrets you wouldn't want the model to be aware of.
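One way to honor "minimum scope" is to build the child's environment from an explicit allowlist rather than letting it inherit os.environ wholesale. A sketch (the key names are illustrative, not our production allowlist):

```python
import os

# "Minimum scope" construction for an MCP child's environment: start from
# nothing and copy only an explicit allowlist, instead of inheriting the
# parent's os.environ wholesale. Key names here are illustrative.
ALLOWED_CHILD_KEYS = {"PATH", "HOME", "AWS_PROFILE", "AWS_REGION"}

def child_env(extra: dict[str, str]) -> dict[str, str]:
    env = {k: v for k, v in os.environ.items() if k in ALLOWED_CHILD_KEYS}
    # Per-call extras go through the same allowlist as inherited keys.
    env.update({k: v for k, v in extra.items() if k in ALLOWED_CHILD_KEYS})
    return env
```

With this shape, a secret sitting in the parent's environment never reaches the child unless someone deliberately adds its key to the allowlist, which is exactly the kind of change that shows up loudly in a diff.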
The other half of this fix is logging discipline. When extra_env shows up, we log only its size:
if extra_env:
    logger.info(
        f"Injecting {len(extra_env)} extra env vars for project "
    )
Not the keys. Not the values. A count. If a customer reports "the agent didn't see my AWS creds," the answer is in the count and a separate audit log of which credential set was selected, never in a debug dump of the env block itself. The same log line that says "we injected 4 vars" tells you the wiring is working without putting the secrets where any future debugger can grab them.
If I were starting again: I would build a build_env(*sources, base) helper from day one, with base always last in the spread and a hardcoded list of "system-owned" keys (the Anthropic key, our own internal service tokens) that simply cannot be overridden no matter what comes in *sources. The four spread-order copies in our codebase all do the same thing; they would do it once if I'd written a helper. They don't, because each one was added when its calling use case was added, and the merge-order discipline was learned at a time when the helper would have meant a refactor I didn't have time for. Every new place you spread env into an SDK options object is a place you can get the order wrong. Centralize it.
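A sketch of what that build_env helper could look like, under the assumption that the system-owned set is hardcoded (the names here are illustrative, not from the real codebase):

```python
# System-owned keys that no user-supplied source may override.
PROTECTED_KEYS = {"ANTHROPIC_API_KEY", "INTERNAL_SERVICE_TOKEN"}

def build_env(*sources: dict[str, str], base: dict[str, str]) -> dict[str, str]:
    """Merge env sources left to right; protected keys in any source are
    dropped (a real implementation would log/audit the drop), and base is
    applied last so system-owned values always win."""
    merged: dict[str, str] = {}
    for source in sources:
        for key, value in source.items():
            if key in PROTECTED_KEYS:
                continue  # never let user input claim a system-owned key
            merged[key] = value
    merged.update(base)
    return merged

env = build_env(
    {"AWS_PROFILE": "customer", "ANTHROPIC_API_KEY": "theirs"},
    base={"ANTHROPIC_API_KEY": "ours"},
)
# env == {"AWS_PROFILE": "customer", "ANTHROPIC_API_KEY": "ours"}
```

Because base is a keyword-only parameter, a caller can't accidentally pass it positionally into the middle of the merge, which is precisely the refactor hazard the spread-order comments guard against today.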
What these three failure modes have in common
The failure modes all live at the same seam: the boundary between the agent process and the things the agent process delegates to. With built-in shadowing, the agent has access to a tool you didn't want it to have because you didn't enumerate the disallowed set. With name collisions, the agent has two tools with the same name, and which one a bare reference resolves to is ambiguous. With env bleeding, the agent's children inherit context (credentials, paths, caches) that the parent didn't deliberately decide to pass through.
In each case the spec was technically silent. There's nothing in the MCP spec that says "you must shadow built-ins by listing them in disallowed_tools" because the SDK is the one that exposes built-ins, not MCP. There's nothing that says "always prefix tool names" because the MCP spec is per-server; the prefixing convention comes from how you wire multiple servers into one agent. Env inheritance is a Unix process detail that has nothing to do with MCP. But all three behave like MCP problems because they're invisible until the agent is running and you can't reason backwards from a "tool not found" or a "wrong API key" message to which seam was leaky.
The mental model that helped most was: the MCP server lifecycle is its own concern, not a free side effect of the agent process lifecycle. Just because your agent process was started with the right environment, the right tool list, and the right working directory doesn't mean the MCP children are running with that same shape. Every option you set on ClaudeAgentOptions — env, cwd, allowed_tools, disallowed_tools, mcp_servers, max_turns — has a "what does this look like from inside an MCP child" reading, and the production-correct answer is almost always more restrictive than the demo-correct answer.
Closing
If you've shipped a multi-MCP-server agent into production, I'd love to hear which of these bit you and which ones I haven't found yet. Especially: how do you handle MCP servers that have legitimate reasons to need real environment variables (a Stripe MCP that genuinely needs STRIPE_API_KEY, an internal-tools MCP that needs a service token), without giving up the "no parent env bleed" property? I have an answer that involves a small per-call env builder, but I'm half-convinced there's a cleaner pattern I haven't seen yet.
Top comments (1)
The env spread order thing is one of those bugs that's almost too small to learn from, which is exactly why everyone hits it. You fixed the merge order and it works, but I think the deeper issue is that a dictionary merge is the wrong primitive for this job. When you spread extra_env and then your own keys, you're using position to encode ownership (last writer wins), but position is silent. Nothing in the code screams "these keys are system-owned and must not be overridden." A future developer sees two dictionaries being merged and assumes order doesn't matter, because 99% of the time it doesn't.

What I'd want instead is something that makes the ownership explicit at the key level, not the position level. A small helper that takes a list of overridable keys and a list of protected keys, and if extra_env contains a protected key, it doesn't silently overwrite: it warns, or errors, or skips with a log line. Something like merge_env(user_env, system_env, protected={"ANTHROPIC_API_KEY", "SERVICE_TOKEN"}). That way the protection survives a refactor where someone alphabetizes the spreads or consolidates them into a single dict comprehension.

The logging discipline you mentioned, only logging the count and never the keys or values, is the other half of this that I don't think enough teams internalize. It's tempting to log the key names for debugging, but key names in a credentials bundle can be revealing on their own. Seeing STRIPE_LIVE_SECRET_KEY in a log tells you the customer has a production Stripe integration. That's information the logging system doesn't need. Curious if you've considered going further and having the env builder also validate that no protected key appears as a substring anywhere in the user-supplied values. That sounds paranoid, but I've seen credentials leak through error messages where the value of one env var gets interpolated into another's error output. The boundary between "environment for the MCP server" and "data the model might see" is thinner than any of us want to admit.