Hey everyone, I built a security scanner for MCP servers (mcpsafe.io) and ran it across the public catalog I'd indexed from npm, PyPI, and GitHub — about 5,000 active servers, 2,634 of which produced at least one finding. The results were rougher than I expected.
What's broken, by % of servers affected:
-
51% — unpinned GitHub Actions (
uses: actions/checkout@v4instead of a SHA). Tag rewrites are silent. - 45% — HTTP / socket / subprocess calls without a timeout. Hang-forever territory.
-
41% — overbroad MCP tool input schemas (
z.string(), barestr,{"type":"string"}on fields namedcommand,query,url). The exact shape that lets prompt injection through. -
37% —
except: passswallowing errors with no logging. -
28% — Dockerfiles with no
USERdirective, so the container runs as root. -
22% — npm/pip install-time hooks (
postinstall, customcmdclass). Code execution before you ever import anything. -
19% — server binds to
0.0.0.0. DNS rebinding is real. - 11% — pinned to dependency versions with known CVEs in the OSV database.
A small set of severe findings keeps showing up too: 97 servers had runtime-secret-exfil patterns (env vars or KMS plaintext returned in tool responses); 88 had user input concatenated into the system role of an inner LLM call without sanitization. Those are the bugs that make the news.
Why this is more than the usual SAST stuff:
MCP servers are different because every tool description, return value, and file the server reads ends up inside an LLM's context. An overbroad schema isn't just sloppy — it's a prompt-injection surface. A silenced exception isn't just bad logging — it's where a malicious tool quietly succeeds.
What MCPSafe.io does: 43 rules right now, all MCP-specific, mapped to CWE. Free public scanning at mcpsafe.io, no signup. Paste a GitHub repo, npm package, or PyPI package, get a result. Deep scans run a 5-judge LLM consensus (Bedrock, OpenAI, Mistral, Vertex) to filter low-confidence findings.
If you maintain an MCP server, the free path will catch most of the issues above. If you find a false positive, every finding has a "report" link that goes to my inbox.
Curious to hear which patterns I'm missing. Thank you!
Top comments (0)