Stop Letting AI Agents Raw-Dog Your Filesystem: Building SafeMCP

#ai #tutorial #node #security

We need to have a serious talk about the Model Context Protocol.
Everyone is losing their minds over "vibe coding" right now. You plug an MCP server into Cursor, Claude Code, or VS Code, tell the AI to fix a bug across three directories, and go grab a coffee while it spins up local servers, reads files, and executes terminal commands. It feels like absolute magic.
But honestly? It's also completely terrifying.
Maybe I’m just paranoid, but it seems like we’ve collectively skipped the part where we ask ourselves if giving a statistical text-prediction engine raw, unvetted access to our local machines is a good idea.
Some security folks are already warning that we’re walking directly into a massive remote code execution crisis. Think about it. Most MCP servers run as local subprocesses. They inherit your exact user permissions. If you run your editor as an admin or with access to sensitive environment variables, so does the AI.
And the real issue isn't that the AI will spontaneously turn evil. The issue is prompt injection.

The Security Void in the Hype

I spent some time looking through public MCP servers on GitHub recently, and the sheer lack of input validation is wild. Because developers are rushing to build cool tools, basic security hygiene has completely lagged behind.
If an AI agent reads an untrusted string—like a malicious comment in a GitHub issue, an automated email, or a dirty record inside a database—it can easily be manipulated into executing an injection payload. The model doesn't know the difference between your system instructions and the data it's processing. It treats them exactly the same.
What happens when a prompt injection tricks a standard filesystem MCP tool into looking for a file named ../../../../../../etc/passwd or pulling your private AWS keys? The tool just does it. It’s a classic path traversal vulnerability, except instead of a malicious hacker typing it into a web form, an automated agent is doing it because a piece of text told it to.
Traditional firewalls don't catch this. It's happening entirely inside standard input and output streams (stdin/stdout) on your local hardware.
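
To make the path traversal problem concrete, here's a rough sketch of the kind of containment check a filesystem tool could run before touching disk. This is not SafeMCP's actual code: the function name resolveInsideWorkspace and the /srv/work path are made up for illustration, and the check is purely lexical, so a real version would probably also want fs.realpath to deal with symlinks.

```javascript
const path = require("node:path");

// Hypothetical helper (illustrative name, not SafeMCP's API): returns the
// resolved path if it stays inside workspaceRoot, otherwise throws.
function resolveInsideWorkspace(workspaceRoot, requestedPath) {
  const root = path.resolve(workspaceRoot);
  // path.resolve collapses ".." segments; an absolute requestedPath
  // replaces root entirely, which the check below also catches.
  const candidate = path.resolve(root, requestedPath);
  const relative = path.relative(root, candidate);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`Path escapes workspace: ${requestedPath}`);
  }
  return candidate;
}

// Assumed workspace root, purely for demonstration.
console.log(resolveInsideWorkspace("/srv/work", "src/app.js"));
try {
  resolveInsideWorkspace("/srv/work", "../../../../../../etc/passwd");
} catch (err) {
  console.log(err.message);
}
```

A naive tool that just calls path.join(root, userInput) and reads the result skips the comparison step entirely, which is exactly the gap the injected string exploits.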
I wanted a way to use these autonomous agents without constantly worrying that one bad web scrape would wipe my local drive. So, I built a local firewall specifically for the MCP ecosystem.
It's called SafeMCP Gateway.

What is SafeMCP?

Instead of trying to patch fifty different third-party MCP servers, I realized the security layer needs to exist at the transport level.
SafeMCP acts as an isolation barrier and standard I/O proxy. It sits directly between your AI client (like Cursor) and your target MCP servers. It intercepts every single JSON-RPC 2.0 frame passing through the pipeline, dissects the arguments, evaluates them against local security rules, and only forwards them if they pass.

[ AI Client ]  --->  [ SafeMCP Gateway ]  --->  [ Target MCP Server ]
  (Cursor)            (Security Checks)           (Raw Filesystem/DB)
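
For a sense of what "intercepting every frame" involves, here's a minimal sketch of the first step: turning raw stdin chunks into parsed JSON-RPC messages. It assumes one JSON message per line, which is how the MCP stdio transport frames messages as far as I know. createFrameSplitter and the shape of the callback argument are hypothetical names, not the gateway's real internals.

```javascript
const { StringDecoder } = require("node:string_decoder");

// Hypothetical splitter: feed it stream chunks, and it calls onFrame once
// per complete line. Assumes newline-delimited JSON-RPC messages.
function createFrameSplitter(onFrame) {
  // StringDecoder keeps multi-byte UTF-8 characters intact across chunks.
  const decoder = new StringDecoder("utf8");
  let buffer = "";
  return function push(chunk) {
    buffer += decoder.write(chunk);
    let newlineIndex;
    while ((newlineIndex = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newlineIndex).trim();
      buffer = buffer.slice(newlineIndex + 1);
      if (line === "") continue;
      try {
        onFrame({ ok: true, message: JSON.parse(line) });
      } catch {
        // Unparseable input is surfaced rather than silently forwarded.
        onFrame({ ok: false, raw: line });
      }
    }
  };
}

// Usage: a message split across two chunks is only emitted once complete.
const push = createFrameSplitter((frame) => console.log(frame));
push(Buffer.from('{"jsonrpc":"2.0","id":1,"me'));
push(Buffer.from('thod":"tools/call"}\n'));
```

In a real proxy you would wire push to process.stdin's "data" event and run the security checks on each parsed message before writing it to the child server's stdin.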

If the AI tries to do something shady, SafeMCP cuts it off at the pass and returns a clean error frame back to the client. The actual backend server never even sees the dangerous payload.
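
The forward-or-deny decision can be sketched like this. It assumes tool invocations arrive as tools/call requests with params.name and params.arguments, and it uses -32602, the JSON-RPC 2.0 code for "Invalid params". inspectRequest and the policy callback are hypothetical names for illustration, not the gateway's real interface.

```javascript
// JSON-RPC 2.0 reserves -32602 for "Invalid params".
const INVALID_PARAMS = -32602;

// Hypothetical policy hook: policy(toolName, args) returns a reason string
// to deny the call, or null to allow it.
function inspectRequest(message, policy) {
  // In this sketch only tool invocations are checked; everything else
  // passes straight through to the backend server.
  if (message.method !== "tools/call") {
    return { action: "forward", message };
  }
  const params = message.params ?? {};
  const reason = policy(params.name, params.arguments ?? {});
  if (reason === null) {
    return { action: "forward", message };
  }
  return {
    action: "deny",
    // The error response reuses the request id so the client can match it.
    response: {
      jsonrpc: "2.0",
      id: message.id ?? null,
      error: { code: INVALID_PARAMS, message: `Access Denied: ${reason}` },
    },
  };
}

// Usage with a toy policy that rejects any path containing "..".
const toyPolicy = (name, args) =>
  typeof args.path === "string" && args.path.includes("..") ? "path traversal" : null;
console.log(
  inspectRequest(
    { jsonrpc: "2.0", id: 7, method: "tools/call",
      params: { name: "read_file", arguments: { path: "../secret" } } },
    toyPolicy
  )
);
```

On "deny", the proxy writes the response back to the client and never writes the original request to the server's stdin.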
Here is what it handles right now:

Path Traversal Shield: It cleans and resolves all path arguments to their absolute, canonical forms. If an operation tries to sneak outside your defined workspace folder using .. hacks, it gets blocked instantly.
Subprocess Guard: It completely blocks shell metacharacters and restricts executable binaries to a strict, tight allowlist.
SSRF Network Blocker: Agents love trying to phone home. SafeMCP cuts off access to private subnets, loopback interfaces, and cloud metadata endpoints (like the notorious AWS instance metadata address 169.254.169.254).
Forensic Auditing: Every single transaction gets dumped into a local, structured safemcp-audit.jsonl file with anonymized parameters so you can review exactly what your agent was trying to do behind your back.

Zero Dependencies

I have a bit of a pet peeve with modern security tools that require you to download half of the internet just to run. Installing 300 random NPM packages to protect yourself from a vulnerability feels entirely counterproductive: you’re just swapping one supply-chain risk for another. Because of that, I wrote SafeMCP using exclusively native Node.js core modules. Zero production dependencies. The codebase is lean, transparent, and easy to audit yourself.
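
Two of the checks from the feature list, the Subprocess Guard and the SSRF Network Blocker, are easy to sketch with nothing but core modules. Treat this as an illustration under assumptions rather than the shipped rules: the allowlist contents, the metacharacter set, and the function names are all invented here. The SSRF check only understands IPv4 literals and fails closed on IPv6 literals. Hostnames would need to be resolved via DNS and re-checked before any request goes out.

```javascript
const net = require("node:net");

// Illustrative values only; a real allowlist would come from configuration.
const ALLOWED_BINARIES = new Set(["git", "node", "npm"]);
const SHELL_METACHARACTERS = /[;&|`$<>()\\\n\r]/;

function isCommandAllowed(binary, args) {
  if (!ALLOWED_BINARIES.has(binary)) return false;
  return !args.some((arg) => SHELL_METACHARACTERS.test(arg));
}

// Loopback, "this network", RFC 1918 private ranges, and link-local
// (which contains the 169.254.169.254 metadata address).
const BLOCKED_IPV4_RANGES = [
  ["0.0.0.0", 8],
  ["127.0.0.0", 8],
  ["10.0.0.0", 8],
  ["172.16.0.0", 12],
  ["192.168.0.0", 16],
  ["169.254.0.0", 16],
];

function ipv4ToNumber(address) {
  return address.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function isBlockedHost(host) {
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (net.isIPv6(host)) return true; // sketch fails closed on IPv6 literals
  if (!net.isIPv4(host)) return false; // hostname: resolve, then re-check
  const value = ipv4ToNumber(host);
  return BLOCKED_IPV4_RANGES.some(([network, prefix]) => {
    // Compare the leading `prefix` bits using division, which avoids
    // JavaScript's signed 32-bit bitwise operators.
    const blockSize = 2 ** (32 - prefix);
    return Math.floor(value / blockSize) === Math.floor(ipv4ToNumber(network) / blockSize);
  });
}

console.log(isCommandAllowed("git", ["status"]));
console.log(isCommandAllowed("git", ["status; rm -rf /"]));
console.log(isBlockedHost("169.254.169.254"));
```

Shell metacharacters only bite when a command is run through a shell, so pairing a check like this with child_process.spawn (no shell option) gives two layers instead of one.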

How to Set Up SafeMCP

Using it is pretty straightforward. It works as a command wrapper. You clone it, build it, and then change your MCP configuration files to route through it using a double-dash (--) separator to isolate the gateway's parameters from the actual server command.
For example, to secure the official Anthropic filesystem server inside Claude Desktop, your claude_desktop_config.json would look like this:

{
  "mcpServers": {
    "secure-filesystem": {
      "command": "node",
      "args": [
        "/path/to/safemcp-gateway/dist/safemcp-gateway.js",
        "--workspace",
        "/your/safe/development/folder",
        "--",
        "npx",
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/your/safe/development/folder"
      ]
    }
  }
}
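
On the gateway side, the double-dash convention boils down to splitting the argument list in two. Here's a small sketch of how a wrapper could do that. splitGatewayArgs is a hypothetical helper, and --workspace is the only gateway flag it knows about because it is the only one shown in the config above.

```javascript
// Hypothetical parser: everything before the first "--" configures the
// gateway; everything after it is the wrapped server command.
function splitGatewayArgs(argv) {
  const separatorIndex = argv.indexOf("--");
  if (separatorIndex === -1 || separatorIndex === argv.length - 1) {
    throw new Error("Expected '-- <server command>' after gateway options");
  }
  const gatewayArgs = argv.slice(0, separatorIndex);
  const [command, ...commandArgs] = argv.slice(separatorIndex + 1);
  // Look up the value that follows --workspace, if the flag is present.
  const flagIndex = gatewayArgs.indexOf("--workspace");
  const workspace = flagIndex === -1 ? undefined : gatewayArgs[flagIndex + 1];
  return { workspace, command, commandArgs };
}

// Usage with the same arguments as the config above.
console.log(
  splitGatewayArgs([
    "--workspace", "/your/safe/development/folder",
    "--",
    "npx", "-y", "@modelcontextprotocol/server-filesystem",
    "/your/safe/development/folder",
  ])
);
```

In a real entry point you would pass process.argv.slice(2) and then hand command and commandArgs to child_process.spawn with piped stdio, so the gateway can sit in the middle of the stream.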

If you're using Cursor, you just add a new MCP server with the type set to command and drop a similar string right into your settings panel.
Once it's active, it's completely invisible. It stays out of your way unless a tool parameter actually violates a security constraint. If a prompt injection hits your agent, the gateway drops a -32602 JSON-RPC error frame, your editor prints "Access Denied", and your host machine stays completely safe.
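
When a call gets denied like that, the audit trail is where you go to see what happened. Below is one way a JSONL audit line with anonymized parameters could be built. The field names, and the choice to anonymize by replacing each argument value with a truncated SHA-256 digest, are assumptions on my part for this sketch and not necessarily the format the gateway writes.

```javascript
const crypto = require("node:crypto");

// Hypothetical anonymizer: keeps argument names but swaps each value for a
// short SHA-256 digest, so raw paths or secrets never land in the log.
function anonymize(args) {
  const out = {};
  for (const [key, value] of Object.entries(args)) {
    const serialized = JSON.stringify(value) ?? "null";
    const digest = crypto.createHash("sha256").update(serialized).digest("hex");
    out[key] = `sha256:${digest.slice(0, 16)}`;
  }
  return out;
}

// Builds one JSONL record: a single JSON object followed by a newline.
function buildAuditLine(entry) {
  return (
    JSON.stringify({
      ts: new Date().toISOString(),
      tool: entry.tool,
      decision: entry.decision,
      reason: entry.reason ?? null,
      args: anonymize(entry.args ?? {}),
    }) + "\n"
  );
}

process.stdout.write(
  buildAuditLine({
    tool: "read_file",
    decision: "deny",
    reason: "path traversal",
    args: { path: "../../etc/passwd" },
  })
);
```

Appending each line with fs.appendFileSync("safemcp-audit.jsonl", line) keeps the file greppable and lets tools read it one record at a time.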

Check out the code

The project is completely open-source. If you want to check out the source code, contribute a policy rule, or set it up on your own machine, the repository is up on GitHub:
👉 Angwyn/safemcp-gateway
If agentic software engineering is actually going to work out long-term, we have to start decoupling our transport protocols from our security frameworks. We can't keep trusting raw strings. Stop giving your models root access to your machine without a gatekeeper.
Take a look, try it out, and let me know if you run into any weird edge cases or if there are specific security rules you think should be added to the default blocklist!

Top comments (1)

Raju Dandigam • Jul 2

Your point that the scary part of MCP is not "the model turns evil" but that local subprocesses inherit your exact user permissions is the right framing. The path-traversal example is especially useful because it translates prompt injection into a failure mode infra teams already understand: untrusted input crossing a capability boundary. In production I have found the biggest win is making the filesystem and tool layer enforce its own allowlist and audit trail so the model never gets to reason its way around scope. agent-inspect has pushed me toward the same pattern on the observability side: if you cannot see the tool call, arguments, and denial reason, you cannot harden it. Curious whether you see SafeMCP living mostly as a local dev guardrail or as a pattern teams will embed inside hosted agent runtimes too.