I was adding MCP tools to a project when I realized something uncomfortable: I had no idea what the code I was installing could actually do.
The README said "connects Claude to Blender." What it didn't say was that one of the registered tools passes a raw string parameter to Python's exec() with no builtin restriction. The LLM doesn't get "Blender API access." It gets full Python execution on the host machine.
I wanted a way to know this before running the code. So I built one.
What reachscan does
reachscan is a static analysis CLI for Python and TypeScript/JavaScript AI agent codebases. Point it at a repo, a PyPI package, or an MCP endpoint, and it tells you:
- What the code can do (shell exec, file access, network calls, credential access, dynamic code execution)
- Which of those capabilities the LLM can actually trigger (reachability analysis)
- The exact call path from the LLM entry point to the dangerous code
pip install reachscan
# Scan a GitHub repo
reachscan https://github.com/user/repo
# Scan a PyPI package before installing
reachscan pypi:some-agent-package
# Scan local code
reachscan ./my-agent
That's it. No config, no API keys, no cloud service. It runs offline and produces a report in about 2 seconds.
The problem
When you give an LLM tools, you're granting it real-world capabilities like file access, shell commands, network calls, credential reads. Most frameworks make it easy to add tools and hard to audit what you've exposed.
Here's real code from a popular MCP server:
@mcp.tool()
def execute_blender_code(ctx: Context, code: str) -> str:
"""Execute arbitrary Python code in Blender."""
blender = get_blender_connection()
result = blender.send_command("execute_code", {"code": code})
That code: str parameter? It ends up here:
exec(code, {"bpy": bpy}) # No __builtins__ restriction
namespace = {"bpy": bpy} looks like a sandbox. It isn't. Without explicitly setting __builtins__, Python injects the full builtins module. The LLM can import os, run subprocess, read your files — anything.
Here's what reachscan shows for this server:
DYNAMIC exec() server.py:431 reachable
path: execute_blender_code → send_command → execute_code
EXECUTE subprocess.run() addon.py:89 reachable
SEND requests.post() server.py:198 reachable
path: generate_3d_model → _call_api
SECRETS os.environ[...] server.py:12 module_level
The reachable tag is the key part. It means the LLM can trigger this code through a registered tool and not just that the code exists somewhere in the repo. module_level means it runs on import. unreachable means the code exists but no LLM call path leads to it.
How it works (briefly)
- Detectors scan the AST for 7 capability categories: EXECUTE, READ, WRITE, SEND, SECRETS, DYNAMIC, AUTONOMY
-
Entry point detection finds the functions exposed to the LLM —
@tool,@mcp.tool(),@function_tool,BaseToolsubclasses, etc. across LangChain, OpenAI Agents SDK, MCP, Pydantic AI, CrewAI, Semantic Kernel, and AutoGen - Call graph + BFS traces up to 8 hops from each entry point to determine which capabilities are actually reachable
- Every finding gets one of 5 states:
reachable,unreachable,module_level,unknown,no_entry_points
The false positive rate is 0.47% across 1,912 labeled findings on 10 real-world repos. I care about this number a lot because a noisy scanner is a useless scanner.
Why I built it
The short version: I was evaluating third-party MCP servers and realized there was no npm audit equivalent for AI agent code. I could run pip audit to check for known vulnerabilities in dependencies, but nothing told me "this package gives the LLM shell access on your machine."
The existing tools I found either:
- Require API calls per scan (expensive, not offline)
- Produce flat capability lists without reachability context (noisy)
- Don't handle the MCP/agent-specific entry point patterns
So I built the tool I wanted.
What it found across 50 real MCP servers
I ran reachscan against 50 of the most popular MCP server repos:
- 1 in 3 has shell execution capability
- 1 in 3 has outbound network I/O
- 1 in 4 accesses credentials from environment variables
- 10 of 50 had 4+ capabilities active simultaneously
The highest-risk combination: credential access + network egress. That appeared in 8 of 50 repos. If the LLM can read your AWS keys AND make HTTP calls, that's an exfiltration path.
Not all of these are bugs. An AWS MCP server should talk to AWS. The question is whether the LLM can misuse those capabilities — and whether you know about them before you deploy.
Try it
pip install reachscan
# Scan any GitHub repo
reachscan https://github.com/ahujasid/blender-mcp
# Scan a PyPI package before installing
reachscan pypi:openai-agents
# JSON output for CI
reachscan . --json --severity high
Apache 2.0, pure Python, runs offline. No API keys, no cloud service.
If something looks wrong — false positive, missed pattern, bad output — open an issue.
GitHub: vinmay/reachscan
PyPI: reachscan
Full scan results (50 repos): Medium writeup
Top comments (0)