Over the past few weeks, I've spent a lot of time looking at the security of AI agents.
Not the models themselves.
The infrastructure around them.
Specifically, MCP servers.
As more companies adopt AI agents, MCP servers are becoming the bridge between models and the real world. They connect agents to tools, databases, APIs, file systems, internal services, and external workflows.
Which got me thinking:
What happens when these servers are exposed to the internet?
So I decided to find out.
I analyzed 492 publicly accessible MCP servers and ran a series of behavioral security tests against them.
The goal wasn't exploitation.
The goal was understanding how these systems behave when they encounter adversarial inputs.
How I Scanned Them
For each MCP server, I performed a combination of:
- Tool enumeration
- Permission boundary analysis
- Prompt injection testing
- Command execution testing
- Context manipulation testing
- Behavioral security evaluation
The focus wasn't traditional vulnerability scanning.
Instead, I wanted to answer a different question:
Can an attacker influence what an AI-connected tool does simply by manipulating instructions?
Unfortunately, the answer was often yes.
The Most Surprising Finding
Out of the 492 MCP servers analyzed:
43% showed signs of command injection susceptibility.
Not because they were running outdated software.
Not because authentication was broken.
But because many systems implicitly trusted agent-generated instructions.
That trust created risk.
Pattern #1: Natural Language Becomes Shell Commands
One common pattern looked something like this:
A user asks:
"List all files in the project."
The MCP server converts that request into a shell command.
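The problem is that with weak validation, the same pathway accepts far more than a file listing: if the request text is interpolated into the command string, any shell metacharacters it contains (a semicolon, a pipe, a command substitution) are interpreted as well.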
When agent-controlled input reaches shell execution, things can get dangerous very quickly.
One-line fix:
Never pass agent-controlled input directly into shell execution. Use allowlisted commands and structured parameters.
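To make that fix concrete, here is a minimal sketch of what allowlisted commands with structured parameters could look like. It is not taken from any of the servers I scanned. The action names, the PROJECT_ROOT path, and the helpers build_argv and run_action are all hypothetical. The idea is that the agent only ever picks an action name and a relative path, and the server builds the argument list itself without ever invoking a shell.

```python
import subprocess
from pathlib import Path

# Hypothetical sandbox root; a real server would take this from its config.
PROJECT_ROOT = Path("/srv/project")

# Hypothetical allowlist: action name -> fixed argv prefix.
# The trailing "--" tells the program that everything after it is a path,
# so an agent-supplied value starting with "-" is not parsed as an option.
ALLOWED_ACTIONS = {
    "list_files": ["ls", "-1", "--"],
    "disk_usage": ["du", "-sh", "--"],
}


def build_argv(action: str, relative_path: str, root: Path = PROJECT_ROOT) -> list[str]:
    """Turn a structured request into an argv list, or refuse."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action not allowlisted: {action!r}")

    # Resolve the path and make sure it stays inside the sandbox root.
    resolved_root = root.resolve()
    target = (resolved_root / relative_path).resolve()
    if not target.is_relative_to(resolved_root):
        raise PermissionError(f"path escapes project root: {relative_path!r}")

    return [*ALLOWED_ACTIONS[action], str(target)]


def run_action(action: str, relative_path: str, root: Path = PROJECT_ROOT) -> str:
    """Run an allowlisted action. No shell is involved, so ';' or '$(...)'
    in the path is just part of a (probably nonexistent) file name."""
    argv = build_argv(action, relative_path, root)
    result = subprocess.run(argv, capture_output=True, text=True, timeout=10, check=True)
    return result.stdout
```

With this shape, a request like "src; rm -rf ~" ends up as a single literal argument to ls rather than a second command, and anything outside the allowlist or the project root is rejected before a process is started.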
Pattern #2: Free-Form Instructions Become Database Queries
Another recurring issue involved database access.
The agent receives a natural language request.
The request is transformed into a query.
Without strict controls, the scope of that query can expand far beyond what was originally intended.
The result isn't always a traditional injection vulnerability.
Sometimes it's simply excessive access.
One-line fix:
Use parameterized queries and strict scope enforcement. Never generate queries directly from free-form instructions.
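As a rough illustration of that fix, the sketch below uses Python's built-in sqlite3 module. The tickets table, the column allowlist, the row cap, and the fetch_tickets function are assumptions made up for this example. The agent supplies only structured values, the query shape is fixed by the server, and the tenant ID is assumed to come from the authenticated session rather than from the agent.

```python
import sqlite3

# Hypothetical scope limits for this example.
ALLOWED_COLUMNS = {"id", "title", "status"}
MAX_ROWS = 50


def fetch_tickets(
    conn: sqlite3.Connection,
    tenant_id: int,          # assumed to come from the session, not the agent
    status: str,             # agent-supplied value, always bound as a parameter
    columns: list[str],      # agent-supplied, checked against the allowlist
    limit: int = 20,
) -> list[tuple]:
    """Run one fixed, scoped query; the agent never writes SQL."""
    if not columns or set(columns) - ALLOWED_COLUMNS:
        raise PermissionError(f"columns outside allowed scope: {columns!r}")

    # Clamp the row count so a single request cannot dump the whole table.
    limit = max(1, min(limit, MAX_ROWS))

    # Column names are interpolated only after the allowlist check above.
    # Every value goes through a "?" placeholder.
    sql = (
        f"SELECT {', '.join(columns)} FROM tickets "
        "WHERE tenant_id = ? AND status = ? LIMIT ?"
    )
    return conn.execute(sql, (tenant_id, status, limit)).fetchall()
```

The tenant filter and the row cap address the excessive-access case: even a perfectly well-formed request cannot read another tenant's rows, ask for columns outside the allowlist, or pull more than MAX_ROWS rows at once.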
Pattern #3: Tool Chaining Creates New Capabilities
This was probably the most interesting category.
Individually, the tools looked safe.
A search tool.
A file access tool.
An HTTP request tool.
Nothing unusual.
But when chained together by an autonomous agent, entirely new capabilities emerged.
Search became retrieval.
Retrieval became extraction.
Extraction became transmission.
The issue wasn't the tools.
It was the combination.
One-line fix:
Validate permissions at every tool boundary, not just when the agent starts.
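Here is one way a check at every tool boundary might be sketched. Everything in it is hypothetical: the Session fields, the tool names search, file_read and http_request, and the rule that a session which has read local files may only send HTTP requests to trusted hosts. It is a simplified policy to show the shape of the idea, not a complete defense.

```python
from dataclasses import dataclass, field
from typing import Any, Callable
from urllib.parse import urlparse


@dataclass
class Session:
    allowed_tools: set[str]
    trusted_hosts: set[str]
    has_read_files: bool = False               # set once file contents enter the context
    audit_log: list[str] = field(default_factory=list)


def authorize(session: Session, tool: str, args: dict[str, Any]) -> None:
    """Re-check permissions for this specific call, using the session's history."""
    if tool not in session.allowed_tools:
        raise PermissionError(f"tool not permitted: {tool}")

    if tool == "http_request" and session.has_read_files:
        # After file contents are in play, outbound traffic is limited to
        # trusted hosts. This breaks the retrieval -> transmission chain.
        host = urlparse(args["url"]).hostname
        if host not in session.trusted_hosts:
            raise PermissionError(f"outbound request to untrusted host: {host}")


def call_tool(
    session: Session,
    tool: str,
    args: dict[str, Any],
    registry: dict[str, Callable[..., Any]],
) -> Any:
    """Single choke point: every tool call is authorized and logged here."""
    authorize(session, tool, args)
    session.audit_log.append(tool)
    result = registry[tool](**args)
    if tool == "file_read":
        session.has_read_files = True
    return result
```

The point of the design is that the decision depends on what the session has already done, not only on what the agent was granted when it started. An HTTP request that is harmless as the first call can be refused as the third one.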
The Bigger Problem
After reviewing hundreds of MCP servers, one thing became clear.
Most security teams are still thinking about AI infrastructure using traditional application security models.
They're asking:
- Is authentication enabled?
- Is the API protected?
- Is the network secure?
Those questions still matter.
But AI systems introduce something new.
Behavioral security.
The system isn't necessarily compromised.
It's persuaded.
And that's a fundamentally different challenge.
Why I Built a Tool for This
After manually evaluating hundreds of MCP servers, it became obvious that this process doesn't scale.
That's why I built a framework to automate:
- MCP discovery
- Behavioral testing
- Prompt injection evaluation
- Command injection detection
- Permission boundary analysis
- Tool-chain security testing
The goal isn't just to find bugs.
The goal is to identify risky behavior before attackers do.
Final Thoughts
The biggest lesson from scanning 492 MCP servers wasn't that AI systems are insecure.
It was that many of them trust instructions far more than they should.
As AI agents gain access to more tools, more data, and more autonomy, security can no longer stop at infrastructure.
We need to test behavior too.
That's one of the reasons I started building Crucible — an open-source security framework for testing AI agents, MCP servers, and agentic systems against real-world adversarial scenarios.
Top comments (1)
43% feels low - I'd have guessed higher for anything wired to a model and left on a public IP. The whole MCP rush is people handing an agent shell access, db creds and a filesystem, then putting it online because isolating it was the boring part.
--dangerously-skip-permissions as a service.