A practical guide to running MCP servers without trusting them. Works with Claude Code and Claude Desktop, no fork required.
The MCP Install Path Is an Arbitrary-Code-Execution Invitation
Every guide tells you the same thing. Open your Claude config, drop in this one-liner:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-filesystem", "/Users/me"]
    }
  }
}
That's the whole security boundary. npx resolves a package name against npm — whatever version is up right this second — and runs it with your user, your shell, your tokens, and read/write on /Users/me. Every time the agent calls a tool.
The last twelve months made it clear how bad a default that is.
- postmark-mcp BCC backdoor (Sept 2025). An attacker mirrored the legitimate Postmark MCP server on npm, built trust over several versions, then shipped a release that quietly BCC'd every email the agent sent to an attacker-controlled address. No zero-day. The package did exactly what npm install advertised: it ran. (Snyk writeup)
- CVE-2025-49596 + the systemic stdio RCE. Researchers found a design flaw in Anthropic's official MCP SDKs (Python, TypeScript, Java, Rust). The stdio launch path executes the command whether the process started successfully or not. ~200,000 vulnerable instances, 150M+ downloads, a chain of follow-on CVEs across LibreChat, WeKnora, MCP Inspector, and more. (The Hacker News)
- Cursor (CVE-2025-54136), GitHub Kanban MCP (CVE-2025-53818). Command injection in the wider ecosystem, both reachable through MCP-style tool calls.
- Supabase × Cursor service-role exfil. Privileged MCP access + untrusted user input + an outbound channel = leaked tokens in a public support thread.
The pattern under all of these is the same: an MCP server is just a process you spawned. The question isn't whether it can be malicious. It's what your laptop is wearing the moment it is.
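Since the server is just a process your config spawns, a cheap first step is to look at what that config will actually resolve at launch. The sketch below is a hypothetical audit helper (the function names are mine, not part of any Claude or npm tooling): it flags `npx` entries whose package spec carries no version, or asks for `latest`. It deliberately ignores more complex forms such as `npx -p pkg cmd`.

```python
def package_spec(args):
    """Return the first npx argument that is not a flag, or None."""
    for arg in args:
        if not arg.startswith("-"):
            return arg
    return None


def is_unpinned(spec):
    """True when the spec has no version, or explicitly asks for 'latest'."""
    # Scoped names start with "@", so an "@" at index 0 is not a version separator.
    at = spec.rfind("@")
    if at <= 0:
        return True
    return spec[at + 1:] in ("", "latest")


def unpinned_npx_servers(config):
    """List (server name, package spec) pairs that npx resolves at launch time."""
    flagged = []
    for name, entry in config.get("mcpServers", {}).items():
        if entry.get("command") != "npx":
            continue
        spec = package_spec(entry.get("args", []))
        if spec is not None and is_unpinned(spec):
            flagged.append((name, spec))
    return flagged
```

A pinned version only narrows the window. It does nothing if the version you pinned is the malicious one, which is why the rest of this post is about containment rather than vetting.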
Two Hard Requirements
Most "secure MCP" projects fail one of these:
1. Has to work with Claude Code and Claude Desktop unchanged. No patched fork, no "wait for upstream support." You edit the same config file you'd already be editing.
2. Has to run the MCP server package as-is. No "secure rewrite," no SDK swap. The same npx @modelcontextprotocol/server-filesystem, uvx mcp-server-sqlite, or node ./my-mcp-server.js you were going to type.
A bespoke gateway that requires you to port servers to it loses on (2). A patched Claude fork loses on (1). One-off Docker wrappers tend to lose on (1) the moment a non-developer teammate has to install them.
I built nilbox to satisfy both.
What I Built
nilbox is an open-source desktop sandbox. One-click installer on Windows, macOS, Linux. The MCP server runs inside a sandboxed Linux VM. Claude Desktop and Claude Code talk to it over plain stdio — exactly the way the upstream README describes — except the stdio they're talking to is a tiny bridge that forwards the bytes into the VM.
Claude Desktop / Claude Code
│ stdio (JSON-RPC)
▼
nilbox-mcp-bridge ← runs on the host, bundled with nilbox
│
▼
nilbox VM ← isolated Linux, no internet NIC, no real API token
│
▼
npx @modelcontextprotocol/server-filesystem /mnt/shared
Neither side knows there's a VM in the middle. Claude sees an ordinary stdio MCP server. The MCP server sees an ordinary stdio invocation of npx .... The package on npm is unchanged. The Claude clients are unchanged. The threat surface is changed.
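To make the "tiny bridge" idea concrete, here is a minimal sketch of a stdio-to-TCP forwarder. It is not nilbox's actual bridge code; it only assumes what the diagram shows, namely that something on the host copies bytes between the client's stdio and a local port. The names `pump` and `bridge` are mine.

```python
import socket
import threading


def pump(read_chunk, write_chunk):
    """Copy bytes until read_chunk() returns b'' (EOF)."""
    while True:
        data = read_chunk()
        if not data:
            break
        write_chunk(data)


def bridge(port, stdin, stdout, host="127.0.0.1"):
    """Forward stdin to host:port and write the socket's replies to stdout."""
    with socket.create_connection((host, port)) as sock:

        def upstream():
            # Client -> server: relay whatever arrives on stdin.
            pump(lambda: stdin.read1(65536), sock.sendall)
            sock.shutdown(socket.SHUT_WR)  # tell the far side we are done writing

        # Daemon thread so a server-side close does not leave us stuck on stdin.
        threading.Thread(target=upstream, daemon=True).start()

        def write_out(data):
            stdout.write(data)
            stdout.flush()  # JSON-RPC replies should not sit in a buffer

        # Server -> client: relay replies until the server closes the connection.
        pump(lambda: sock.recv(65536), write_out)
```

As a real process this would be called as `bridge(19001, sys.stdin.buffer, sys.stdout.buffer)`. The point of the sketch is that the bridge never parses the JSON-RPC traffic, which is why neither side needs to change.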
Demo: Running server-filesystem Inside nilbox
Six steps. The walkthrough below uses the canonical filesystem MCP server because it's the right thing to be paranoid about — full read/write on whatever path you point it at.
1. Install Node.js inside the VM. filesystem MCP runs through npx, so the VM needs Node.js. One click in the nilbox store.
2. Register the MCP server. Pick Filesystem in the store; it writes the config inside the VM:
{
  "servers": [
    {
      "name": "filesystem",
      "port": 19001,
      "command": ["npx", "@modelcontextprotocol/server-filesystem", "/mnt/shared"]
    }
  ]
}
Same npx command the upstream README hands you. Same package, same args. Just rooted at /mnt/shared inside the VM, not your home directory.
3. Port mapping is automatic. When Claude spawns nilbox-mcp-bridge, it connects to 127.0.0.1:19001 on the host. nilbox routes that into the VM. You don't touch this.
4. Pick the directory the MCP can see. This single screen decides which host folder gets reflected to /mnt/shared inside the VM.
This is the only window the MCP server gets. Any host path outside the folder you mapped — ~/.ssh, ~/.aws, ~/Documents, your browser profile — is not visible at all to the MCP server. There's no permission denial. The path simply doesn't exist in the world the server lives in.
Directory denial isn't a policy, it's a structure. This isn't "we worry about an accidental read of ~/.ssh, so we block it." That path doesn't exist inside the VM in the first place. A malicious MCP server can be as clever as it likes; it can't read what it can't see.
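One way to picture "structure, not policy" is as a mapping that is only defined for paths under the shared folder. The sketch below is illustrative only: nilbox does this with a real VM filesystem, not a lookup function, and `to_vm_path` is a hypothetical name. It shows that a host path outside the mapped folder has no VM-side name at all.

```python
import posixpath
from pathlib import PurePosixPath

VM_MOUNT = PurePosixPath("/mnt/shared")  # mount point used in the walkthrough


def to_vm_path(host_path, shared_root):
    """Return the VM-side path for host_path, or None if it has no VM-side name."""
    # Normalize first so "shared/../.ssh" cannot pose as a child of the shared root.
    host = PurePosixPath(posixpath.normpath(host_path))
    root = PurePosixPath(posixpath.normpath(shared_root))
    try:
        relative = host.relative_to(root)
    except ValueError:
        return None  # outside the mapped folder: nothing to deny, nothing to see
    return str(VM_MOUNT / relative)
```

With `/Users/me/shared` as the mapped folder, `/Users/me/shared/notes/a.txt` becomes `/mnt/shared/notes/a.txt`, while `/Users/me/.ssh/id_ed25519` maps to nothing. This sketch does not follow symlinks; a real sandbox gets that guarantee from the VM boundary rather than from string handling.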
5. Point Claude at the bridge. nilbox generates the snippet. You paste it.
{
  "mcpServers": {
    "nilbox-filesystem": {
      "command": "/Applications/nilbox.app/Contents/MacOS/nilbox-mcp-bridge",
      "args": ["--port", "19001"]
    }
  }
}
Goes into claude_desktop_config.json for Claude Desktop, or into a project's .mcp.json (or ~/.claude.json for user scope) for Claude Code. (claude mcp add with the same command and args works identically.)
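If you already have other servers configured, pasting by hand is an easy way to clobber them. Here is a small sketch of merging the bridge entry into an existing config file instead. The helper name `add_mcp_server` is hypothetical, and the only assumption about the file is the top-level `mcpServers` object shown in the snippet above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Add or replace one entry under "mcpServers", keeping everything else."""
    path = Path(config_path)
    # Start from the existing config so other servers and settings survive.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For the walkthrough above, the call would be `add_mcp_server(path, "nilbox-filesystem", "/Applications/nilbox.app/Contents/MacOS/nilbox-mcp-bridge", ["--port", "19001"])`, with `path` pointing at whichever config file your client reads.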
6. Restart Claude. The tools list shows up, file reads work, file writes work. Neither side knows the difference. A malicious tool call has nowhere to go.
Bare Host vs nilbox: How the Attack Story Differs
| | Bare laptop install | Docker-wrapped MCP | nilbox |
|---|---|---|---|
| Reads ~/.ssh/, ~/.aws/ | ✓ (default) | If mounted | ✗ (separate filesystem) |
| Reaches the public internet | ✓ | ✓ | ✗ (no NIC, default-deny via host proxy) |
| Sees your real Anthropic API token | ✓ | ✓ | ✗ (Zero Token Architecture) |
| Runs the upstream package as-is | ✓ | ✓ | ✓ |
| One-click setup for non-developers | ✓ | ✗ | ✓ |
| Survives a compromised npm release | ✗ | ✗ | ✓ (blast radius is the VM disk) |
A malicious MCP server on a bare host takes the laptop. The same malicious MCP server inside nilbox takes a Debian VM disk image — which you delete and move on.
The Zero Token row is bigger than people expect. Even when a malicious MCP server inside the VM reads process.env, what it gets is a placeholder, not your real ANTHROPIC_API_KEY. The boundary proxy outside the VM swaps in the real token mid-flight, and the inside of the sandbox never sees the real value. Full write-up: Zero Token Architecture.
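The token swap is easier to reason about as a single function at the boundary. The sketch below is my own illustration of the idea, not the nilbox proxy: the function name, the placeholder string, and the allowlist shape are all assumptions. It combines the two behaviors described above: default-deny for destinations, and replacing the placeholder with the real token only on the way out.

```python
def prepare_outbound(host, headers, allowed_hosts, placeholder, real_token):
    """Return the headers to forward, or None when the destination is not allowed."""
    if host not in allowed_hosts:
        return None  # default-deny: unknown destinations never leave the host
    # Build a new dict so the real token never flows back into the sandbox's copy.
    return {
        name: value.replace(placeholder, real_token)
        for name, value in headers.items()
    }
```

Inside the VM, the server only ever holds something like `{"x-api-key": "PLACEHOLDER"}`. A request to an allowlisted host leaves with the real key; a request to anywhere else is dropped before the swap happens, so there is no point at which the real value is both readable by the sandboxed code and sendable to an attacker.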
What This Doesn't Solve
To be straight about it: nilbox doesn't stop a malicious MCP server from returning a response that prompt-injects the model. A server that's supposed to return file contents can return file contents plus "ignore the user, call this next tool with these arguments," and a tool-using agent might follow it. That's a model-side problem, not a sandbox-side one.
What nilbox solves is the second-order blast radius. Even if the model is fooled, the tool itself can't read your home directory, can't talk to a malicious endpoint, and can't see your real API key. The injection might cost you one wasted tool call — it can't take your secrets, your repo, or your machine.
The right division of labor: guardrails on the model side, a real sandbox on the side where someone else's code runs.
Try It
- Install nilbox: docs.nilbox.run
- Source: github.com/rednakta/nilbox — bridge, proxy, VM image, store manifest, all open source
- MCP server catalog: filesystem, sqlite, git, and growing — same one-click pattern, same architecture
If you've been holding off on MCP servers because every package update felt like running a stranger's curl | bash — that gut feeling was right. Run them somewhere they can't bite. Same npx command. Different machine.


