Pico

Posted on Apr 19 • Originally published at getcommit.dev

We Scanned 19 MCP Servers — Here's What We Found

#ai #opensource #security #mcp

We built a static vulnerability scanner with 14 detection patterns across 7 categories — shell injection, path traversal, SSRF, SQL injection, configuration theater, missing authentication, and hardcoded secrets — and pointed it at 19 of the most popular MCP servers on GitHub.

Then we did the part that most security research skips: we manually triaged every single finding.

The scanner produced 862 findings across those 19 repos. After manual review, the majority were false positives. The real vulnerabilities we found were ones the scanner almost missed. And the most dangerous MCP security issues of 2026 — the ones being actively exploited — live in a category that static analysis cannot reach at all.

Here's the full data.

Methodology

Our scanner walks every file in a repository, applies regex-based pattern matching against 14 vulnerability signatures, and produces a weighted score from 0 (clean) to 100 (maximum findings). Each finding is classified as CRITICAL, HIGH, MEDIUM, or LOW based on the pattern type and code context.
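To make the approach concrete, here is a minimal sketch of regex-based matching with a capped, weighted score. The two patterns, the severity weights, and the function names are assumptions for illustration only. The article does not publish the scanner's actual signatures or scoring formula.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical signatures: the real scanner has 14, these two are illustrative.
PATTERNS = {
    "mysql-multiple-statements": ("CRITICAL", re.compile(r"multipleStatements\s*:\s*true")),
    "python-shell-true": ("HIGH", re.compile(r"shell\s*=\s*True")),
}

# Assumed weights, chosen only to show how a cap at 100 behaves.
WEIGHTS = {"CRITICAL": 40, "HIGH": 20, "MEDIUM": 10, "LOW": 5}


@dataclass
class Finding:
    rule: str
    severity: str
    line_no: int


def scan_text(text: str) -> list[Finding]:
    """Apply every pattern to every line and collect the matches."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, (severity, pattern) in PATTERNS.items():
            if pattern.search(line):
                findings.append(Finding(rule, severity, line_no))
    return findings


def score(findings: list[Finding]) -> int:
    """Sum the severity weights and cap the total at 100."""
    return min(100, sum(WEIGHTS[f.severity] for f in findings))
```

Under these assumed weights, three CRITICAL matches are enough to hit the cap, whether they sit in production code or in test fixtures. That is one way a capped score can saturate without saying much about real risk.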

What it catches: shell=True with non-literal inputs, multipleStatements: true in MySQL configs, fetch() with non-literal URLs, exec()/execSync() with template literals, hardcoded credentials, path traversal via unsanitized joins, missing authentication on tool registrations.

What it doesn't catch: logic bugs, permission bypasses, supply chain issues, behavioral anomalies, or anything that requires understanding control flow beyond single-file pattern matching.

We know the limitations. That's the point of this article.

The Results

Rank	Repository	Stars	Score	After Triage
1	awslabs/mcp	8.8k	100	Test fixtures + 1 SSRF worth inspecting
2	wonderwhy-er/DesktopCommanderMCP	5.9k	100	Shell injection in build scripts, not runtime
3	googleapis/mcp-toolbox	14.6k	100	All in test files (Go)
4	microsoft/playwright-mcp	31k	100	All in test files
5	benborla/mcp-server-mysql	1.5k	100	All 4 CRITICALs false positive
6	modelcontextprotocol/python-sdk	—	100	CRITICAL false positive
7	modelcontextprotocol/servers	—	100	String concat flagged in descriptions, not SQL
8	idosal/git-mcp	7.9k	100	SSRF in fetch() — warrants inspection
9	executeautomation/mcp-database-server	—	73	Real: CVSS 8.8 SQL injection
10	g0t4/mcp-server-commands	—	72	`shell:true` by design
11-19	(9 more repos)	—	0-62	Low risk or clean

The headline: 8 repos scored 100/100. After triage, zero had a CRITICAL vulnerability in production code. The repo with the confirmed CVSS 8.8 vulnerability scored 73.

The Real Findings

executeautomation/mcp-database-server — SQL Injection (CVSS 8.8)

The MySQL adapter hardcodes multipleStatements: true, and the only defense is a startsWith("SELECT") prefix check — trivially bypassed:

SELECT 1; DROP TABLE users; --
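The bypass can be reproduced without a database. The sketch below is a Python stand-in for the prefix check, not the repository's code. The helper names are hypothetical, and the statement splitter is deliberately naive.

```python
def passes_prefix_check(query: str) -> bool:
    """Python stand-in for a startsWith("SELECT") guard."""
    return query.startswith("SELECT")


def split_statements(query: str) -> list[str]:
    """Naively split on ';' and drop empty parts and '--' comments.

    This ignores semicolons inside string literals, so it only illustrates
    what a multi-statement connection would be asked to execute.
    """
    parts = [part.strip() for part in query.split(";")]
    return [p for p in parts if p and not p.startswith("--")]


def is_single_select(query: str) -> bool:
    """Hypothetical stricter guard: prefix check plus exactly one statement."""
    return passes_prefix_check(query) and len(split_statements(query)) == 1


payload = "SELECT 1; DROP TABLE users; --"
print(passes_prefix_check(payload))  # True: the guard lets the payload through
print(split_statements(payload))     # ['SELECT 1', 'DROP TABLE users']
print(is_single_select(payload))     # False
```

The stricter guard is still string matching and should not be read as a fix. Leaving multipleStatements disabled, so the driver refuses stacked statements, removes this particular vector at the connection level.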

We submitted GHSA-2gc7-7mj4-79wg on April 6. The maintainer did not respond. After 7 days, we disclosed the issue publicly. The advisory remains in triage.

The same repo had a previously published vulnerability (CVE-2025-59333, CVSS 8.1) in its PostgreSQL adapter — same root cause, reported by Liran Tal in September 2025.

mcp-atlassian — SSRF, JQL Injection, XSS

Four vulnerabilities in sooperset/mcp-atlassian (4,400+ stars):

High: SSRF via icon_url — unvalidated URL fetched server-side
High: SSRF via attachment URL — unvalidated URL fetched server-side
Medium: JQL injection — unparameterized query construction
Medium: Stored XSS — unsanitized content in Atlassian comments

This is the same mcp-atlassian independently targeted by MCPwnfluence (CVE-2026-27825/27826) — different bugs, same server, same underlying problem: a single-maintainer package connecting AI agents to enterprise infrastructure.
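For the two SSRF findings, the missing step is validating a URL before the server fetches it. A minimal sketch of one such check follows, assuming an HTTPS-only policy and a host allowlist. The function name and the example hosts are hypothetical, and this is not the project's code or its eventual fix.

```python
from __future__ import annotations

import ipaddress
from urllib.parse import urlparse


def is_fetch_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Allow only https URLs whose hostname is on an explicit allowlist."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname  # lowercased by urlparse, None if absent
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parsed.scheme != "https" or host is None:
        return False
    try:
        ipaddress.ip_address(host)
        return False  # reject literal IPs such as 169.254.169.254 or ::1
    except ValueError:
        pass  # not an IP literal, fall through to the allowlist
    return host in allowed_hosts


allowed = {"example.atlassian.net"}  # hypothetical tenant host
print(is_fetch_allowed("https://example.atlassian.net/icon.png", allowed))        # True
print(is_fetch_allowed("https://169.254.169.254/latest/meta-data/", allowed))     # False
print(is_fetch_allowed("http://example.atlassian.net/icon.png", allowed))         # False
```

This check looks only at the URL string. It does not cover redirects or DNS answers that point an allowed name at an internal address, so a real deployment would need to validate at connection time as well.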

The False Positives That Matter

benborla/mcp-server-mysql: Score 100, All CRITICALs False

The scanner flagged 4 CRITICAL findings. All false:

multipleStatements: true appeared in test setup scripts and bundled mysql2 driver internals — not in production config. The production connection defaults to false.

"SQL injection via string concatenation" was PEG.js-generated parser internals recognizing SQL keywords, not constructing queries.

After deep review, the real concern was rated MEDIUM: a parser bypass becomes possible when MYSQL_DISABLE_READ_ONLY_TRANSACTIONS=true is set, because that setting removes a defense-in-depth layer.

Lesson: A bundled dependency containing multipleStatements: true as a configuration option definition is not application code enabling it. The scanner couldn't tell the difference.
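One cheap triage aid is to label each finding by where its file lives before anyone reads it. The sketch below assumes common directory and filename conventions. The directory sets, the markers, and the example paths are hypothetical and are not part of the scanner described here.

```python
from pathlib import PurePosixPath

# Assumed conventions; real repositories vary.
DEPENDENCY_DIRS = {"node_modules", "vendor", "site-packages"}
TEST_DIRS = {"test", "tests", "__tests__", "fixtures", "testdata"}
TEST_NAME_MARKERS = (".test.", ".spec.", "_test.")


def classify_path(path: str) -> str:
    """Label a repo-relative path as 'dependency', 'test', or 'application'."""
    parts = PurePosixPath(path).parts
    if not parts:
        return "application"
    directories = parts[:-1]
    filename = parts[-1]
    # Bundled third-party code wins over everything else.
    if any(d in DEPENDENCY_DIRS for d in directories):
        return "dependency"
    if any(d in TEST_DIRS for d in directories):
        return "test"
    if any(marker in filename for marker in TEST_NAME_MARKERS):
        return "test"
    return "application"


print(classify_path("node_modules/mysql2/lib/connection_config.js"))  # dependency
print(classify_path("src/db/__tests__/setup.ts"))                     # test
print(classify_path("src/index.ts"))                                  # application
```

A label like this only reorders the review queue. A finding in a dependency or a test file can still matter, so it should be deprioritized rather than dropped.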

modelcontextprotocol/python-sdk: Score 100, CRITICAL False

The official MCP Python SDK got flagged for shell=True in two locations. Neither is exploitable:

Hardcoded inputs only (["npx.cmd", "npx.exe", "npx"])
Conditional Windows-only usage in a local CLI tool where attacker equals victim

Neither affects downstream MCP servers, which import mcp.server, not mcp.cli.
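Part of this distinction can be automated for Python sources by parsing the code instead of matching text. The sketch below uses the standard ast module to report, for each call that passes shell=True, whether the first positional argument is built only from string literals. It is an assumption about how such a check could look, not how our scanner works.

```python
from __future__ import annotations

import ast


def _is_literal(node: ast.AST) -> bool:
    """True for a string constant or a list/tuple made only of string constants."""
    if isinstance(node, ast.Constant):
        return isinstance(node.value, str)
    if isinstance(node, (ast.List, ast.Tuple)):
        return all(_is_literal(element) for element in node.elts)
    return False  # names, f-strings, concatenations, calls, and so on


def shell_true_calls(source: str) -> list[tuple[int, bool]]:
    """Return (line number, command is literal) for each call with shell=True."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        has_shell_true = any(
            kw.arg == "shell"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not has_shell_true:
            continue
        first = node.args[0] if node.args else None
        # With no positional command, stay conservative and report non-literal.
        results.append((node.lineno, first is not None and _is_literal(first)))
    return sorted(results)


source = (
    "import subprocess\n"
    'subprocess.run(["npx.cmd", "--version"], shell=True)\n'
    "subprocess.run(user_cmd, shell=True)\n"
    'subprocess.run(f"ls {directory}", shell=True)\n'
)
print(shell_true_calls(source))  # [(2, True), (3, False), (4, False)]
```

A check like this still cannot follow a variable back to a hardcoded list, for example a loop over a fixed set of executable names, so cases of that shape would need data-flow analysis or manual review.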

What Static Analysis Can't See

While we triaged false positives, the MCP ecosystem experienced its worst month for exploitation:

MCPwn (CVE-2026-33032, CVSS 9.8): Two HTTP requests, no auth, full nginx takeover. 2,600+ instances. Actively exploited.

MCPwnfluence (CVE-2026-27825/27826): SSRF + file write → unauthenticated RCE on the most popular Atlassian MCP server.

Ox Security STDIO Injection Class (10+ CVEs): 150M downloads, ~200K vulnerable instances. Four vectors: transport manipulation, prompt injection to configs, parameter injection, allowlist bypasses. CVEs in Windsurf, Agent Zero, Flowise, GPT Researcher, and more.

None of these would have been caught by our scanner. MCPwn is a logic vulnerability. MCPwnfluence is a chained multi-step attack. The STDIO class exploits the assumption that configuration files aren't modified by prompt-injected AI agents.

Static analysis finds patterns. These attacks exploit behavior.

The Trust Gap

Three takeaways from putting scanner results next to real-world exploits:

1. Static scanning is necessary but not sufficient. It catches configuration mistakes and obvious injection surfaces. It does not catch logic bugs, chained attacks, or agent-mediated vulnerabilities.

2. MCP servers have a structurally high false positive rate. They're designed to execute commands, query databases, and fetch URLs. The patterns that indicate vulnerabilities in web apps are often the intended functionality of MCP tools.

3. The real risk is behavioral. Every exploited MCP server in 2026 was compromised through what it did at runtime, not what its code looked like. The STDIO injection class is the clearest case: sanitization was correct at code review time, but an AI agent modified config at runtime, and no sanitization ran.

This gap is what behavioral monitoring addresses: not "does this code contain dangerous patterns?" but "is this agent doing something inconsistent with its authorization?"
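As a rough sketch of that second question, the code below compares observed tool calls against a declared authorization policy. The ToolCall and Policy types, the tool names, and the paths are all hypothetical. Real behavioral monitoring would need far richer context than a path prefix.

```python
from __future__ import annotations

import posixpath
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolCall:
    agent: str
    tool: str
    target: str  # e.g. a file path the tool was asked to touch


@dataclass
class Policy:
    # Maps a tool name to the path prefixes it is authorized to touch.
    allowed: dict[str, set[str]] = field(default_factory=dict)

    def permits(self, call: ToolCall) -> bool:
        # Normalize first so "/workspace/../etc" cannot pass as "/workspace/".
        target = posixpath.normpath(call.target)
        prefixes = self.allowed.get(call.tool, set())
        return any(target.startswith(prefix) for prefix in prefixes)

    def violations(self, calls: list[ToolCall]) -> list[ToolCall]:
        """Return the calls that fall outside the declared authorization."""
        return [call for call in calls if not self.permits(call)]


policy = Policy(allowed={"read_file": {"/workspace/"}})
calls = [
    ToolCall("agent-1", "read_file", "/workspace/README.md"),
    ToolCall("agent-1", "write_file", "/home/user/.config/mcp.json"),
    ToolCall("agent-1", "read_file", "/workspace/../etc/passwd"),
]
for call in policy.violations(calls):
    print(call.tool, call.target)
```

Here the config write and the traversal read are both reported, even though no pattern in the server's source code would have distinguished them from legitimate calls.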

Scan Your Dependencies

Run the scanner on your MCP dependencies:

npx proof-of-commitment mcp-remote @modelcontextprotocol/server-github

Or scan supply chains:

curl -X POST https://poc-backend.amdal-dev.workers.dev/api/graph/npm \
  -H "Content-Type: application/json" \
  -d '{"package": "@modelcontextprotocol/sdk", "depth": 2}'

Web UI: getcommit.dev/audit | GitHub

It will catch the easy stuff. For everything else, you need eyes on behavior.

Data collected April 19, 2026. Disclosed vulnerabilities follow coordinated disclosure timelines. We acknowledge scanner limitations and improve detection patterns based on triage results.
