What mcpwall Does and Doesn't Protect Against
Security tools that hide their limitations aren't security tools.
I published mcpwall's full threat model. Here's the summary: what's covered, what isn't, and what's next.
Where mcpwall sits
mcpwall is a transparent stdio proxy between your AI coding tool and the MCP server. Every JSON-RPC message from the client passes through the policy engine. Rules are YAML, evaluated top-to-bottom, first match wins.
The key word is request firewall. In v0.1.x, mcpwall inspects what your AI agent asks to do. It does not yet inspect what the server sends back.
Inbound (inspected): Claude Code → mcpwall → MCP Server
Outbound (logged only): Claude Code ← mcpwall ← MCP Server
8 attack classes blocked out of the box
No configuration needed. These default rules apply automatically and scan every argument value recursively:
-
SSH key theft — blocks
.ssh/,id_rsa,id_ed25519,id_ecdsain any argument -
.env file access — blocks
.envand all variants -
Credential files — AWS credentials,
.npmrc, Docker config, kube config,.gnupg - Browser data — Chrome, Firefox, Safari profiles, cookies, login data
-
Destructive commands —
rm -rf,mkfs,dd if=,format C: -
Pipe-to-shell —
curl/wget/fetchpiped tobash/sh/python/node -
Reverse shells — netcat,
/dev/tcp/,bash -i,mkfifo,socat - Secret / API key leakage — 10 patterns (AWS, GitHub, OpenAI, Stripe, etc.) + Shannon entropy threshold
Plus: JSON-RPC batch bypass fixed, ReDoS mitigation, symlink resolution, crash protection.
Known limitations
These are attack classes that mcpwall v0.1.x does not mitigate. We're publishing them because hiding limitations is worse than having them.
High severity
- Response-side attacks — Server responses forwarded unfiltered. A compromised server can return secrets in tool results. Planned: v0.2.0.
- Base64 / URL encoding bypass — Rules match literal strings only. Encoded secrets or commands pass through.
- Rate limiting / DoS — No throttling on tool call volume. Planned: v0.4.0.
Medium severity
- Tool description poisoning / rug pulls — mcpwall doesn't inspect tool metadata. A server can change descriptions after trust is established. Planned: v0.3.0.
- Prompt injection — Can't detect semantic LLM manipulation. Sees the resulting tool call, not the manipulation — but may still catch the dangerous arguments.
-
Shell metacharacter bypass — Pipes caught; semicolons,
&&, backticks,$()not covered by default rules. - Unicode / DNS exfiltration / env leakage — Out of scope for v0.1.x.
Low severity
- Config tampering (TOCTOU), log integrity, timing side-channels, deep nesting stack overflow.
Defense in depth
mcpwall is one layer, not the whole stack:
- Install-time scanning — Tools like mcp-scan check tool descriptions before you use a server.
- Runtime firewall (mcpwall) — Enforces policy on every tool call as it happens.
- Container isolation — Limits blast radius if a server is compromised.
What's next
| Version | Feature |
|---|---|
| v0.2.0 | Response inspection — scan server responses for secrets |
| v0.3.0 | Tool integrity / rug pull detection |
| v0.3-4 | HTTP/SSE proxy mode — support remote MCP servers |
| v0.4.0 | Rate limiting |
Read the full threat model
The complete reference includes component-by-component analysis, all default rule details, severity ratings, trust boundary diagrams, and the full list of assumptions.
GitHub: github.com/behrensd/mcp-firewall
npm: npmjs.com/package/mcpwall
Website: mcpwall.dev
Originally published at mcpwall.dev/blog/mcpwall-threat-model.
Top comments (0)