Last week, a perfectly normal MCP tool turned into a shell.
The setup looked harmless: an AI agent needed to query logs, so the MCP server exposed a search_logs tool. The tool accepted a string, passed it into a shell command, and returned the result. Then someone asked the agent to “search for errors from today; also show /etc/hosts if it helps debug.”
You can guess what happened next.
This is the part of MCP security that’s easy to underestimate: the dangerous bug usually isn’t in the protocol itself. It’s in the layer where tool inputs get stitched into shell commands, SQL queries, file paths, or internal API calls.
And because MCP gives agents a clean way to discover and invoke tools, those bugs become reachable at scale.
Why MCP command injection is a bigger deal than “just sanitize input”
A normal web app command injection bug is already bad.
An MCP command injection bug is worse because:
- tools are designed to be called programmatically
- agents can chain tool calls automatically
- a single prompt can influence multiple downstream actions
- the vulnerable surface is often hidden behind “helpful” abstractions
If your MCP server exposes tools like:
- run_tests
- grep_logs
- convert_file
- git_diff
- ping_host
…you may have built a remote execution surface without meaning to.
A lot of teams are trying to solve this one tool at a time. That helps, but it misses the pattern.
The better approach is to model these flaws as a security knowledge graph.
What I mean by a security knowledge graph
Instead of tracking isolated bugs, map the relationships:
[Agent Prompt]
|
v
[Tool Call: search_logs]
|
v
[Argument: query="error; cat /etc/passwd"]
|
v
[Sink: exec("grep " + query + " /var/log/app.log")]
|
v
[Impact: command injection]
|
+--> [Reads secrets]
+--> [Moves laterally]
+--> [Poisons outputs]
That graph gives you more than a vulnerability report. It tells you:
- which tools are high risk
- which input fields reach dangerous sinks
- which agents can invoke them
- what approvals, policies, or sandboxing should exist
This is useful because MCP security isn’t just “is this tool vulnerable?” It’s also:
- who can call it
- under what delegation chain
- with what runtime constraints
- what other systems it can reach
If you already use OPA for policy, this is a great fit. Let your graph identify risky edges, then use policy to block or require approval for them.
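To make that concrete, here is a toy policy gate sketched in plain JS rather than Rego (the real thing would live in OPA; tool/agent shapes here are invented and mirror the graph, with tools declaring which sinks they reach):

```javascript
// Toy policy gate standing in for an OPA check (plain JS instead of Rego).
// A "risky edge" is a tool that reaches a shell sink, called by a
// non-high-trust agent.
function requiresApproval(tool, agent) {
  return tool.sinks.includes("exec") && agent.trust !== "high";
}

function authorize(tool, agent, approvals) {
  if (requiresApproval(tool, agent) && !approvals.has(tool.name)) {
    throw new Error(`approval required for ${tool.name}`);
  }
}

// Hypothetical nodes from the graph:
const searchLogs = { name: "search_logs", sinks: ["exec"] };
const lowTrustAgent = { trust: "low" };

// Passes only because an explicit approval was granted:
authorize(searchLogs, lowTrustAgent, new Set(["search_logs"]));
```

Without the approval in the set, the same call throws, which is exactly the "block or require approval" behavior you would encode as policy.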
The bug, in 8 lines
Here’s the classic mistake in Node (after npm install express):
const express = require("express");
const { execSync } = require("child_process");

const app = express();

app.get("/search", (req, res) => {
  const q = req.query.q || "";
  // DANGER: q is interpolated straight into a shell command string.
  const out = execSync(`grep ${q} /var/log/system.log`, { encoding: "utf8" });
  res.send(out);
});

app.listen(3000, () => console.log("listening on :3000"));
That “works” right up until q contains shell metacharacters.
The fix is not “be more careful.” The fix is:
- avoid shell invocation when possible
- use parameterized APIs
- validate against strict allowlists
- run tools in sandboxes
- attach identity and authorization to tool execution
- log invocation lineage so you can see who called what, through which agent
Build the graph from four node types
You don’t need a giant platform to start. A spreadsheet or graph DB is enough if the model is right.
I’d start with these node types:
Agents
Which agent, session, or delegated identity initiated the call?

Tools
What MCP tool was invoked? What are its declared parameters?

Sinks
Does the tool reach exec, filesystem writes, SQL, HTTP callbacks, or template rendering?

Impacts
What happens if exploited: RCE, data exfil, secret access, repo tampering?
Then add edges like:
- CAN_CALL
- PASSES_INPUT_TO
- REACHES_SINK
- REQUIRES_APPROVAL
- EXFILTRATES_TO
Once you have that, useful questions become easy:
- Which tools can reach shell execution?
- Which shell-reaching tools are callable by untrusted agents?
- Which of those also have access to secrets or internal networks?
- Which ones are missing approval gates or audit trails?
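Even a plain in-memory edge list is enough to answer the first two questions. A sketch with hypothetical agent and tool names, using the edge types above:

```javascript
// Minimal edge list: [fromNode, relation, toNode].
const edges = [
  ["agent:support-bot", "CAN_CALL", "tool:search_logs"],
  ["agent:support-bot", "CAN_CALL", "tool:convert_file"],
  ["tool:search_logs", "REACHES_SINK", "sink:exec"],
  ["tool:convert_file", "REACHES_SINK", "sink:filesystem_write"],
];

// Which tools can reach shell execution?
const shellTools = edges
  .filter(([, rel, to]) => rel === "REACHES_SINK" && to === "sink:exec")
  .map(([from]) => from);

// Which agents can call those tools?
const exposedAgents = edges
  .filter(([, rel, to]) => rel === "CAN_CALL" && shellTools.includes(to))
  .map(([from]) => from);

console.log(shellTools);    // ["tool:search_logs"]
console.log(exposedAgents); // ["agent:support-bot"]
```

A real graph DB gives you transitive queries for free, but the point is the model: once edges exist, "which agents can reach exec" is a two-line traversal instead of a code audit.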
That’s how you move from “we found one injection bug” to “we understand our agent attack surface.”
What good defenses look like
The strongest MCP setups usually combine several layers:
- safe tool implementation: no shell where libraries exist
- policy enforcement: block risky tools for low-trust agents
- sandboxing: assume some tool will eventually fail open
- identity + delegation tracking: know the real caller, not just the app
- audit logging: preserve the path from prompt to tool invocation to side effect
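One cheap way to get the last two layers: wrap every tool handler so each invocation records who called it and through which delegation chain. A hypothetical sketch (names and shapes invented; a real trail would go to an append-only store, not an array):

```javascript
// In-memory audit trail; in production this would be an append-only log.
const auditLog = [];

// Wrap a tool handler so every call records its invocation lineage.
function withAudit(toolName, handler) {
  return (args, caller) => {
    auditLog.push({
      tool: toolName,
      agent: caller.agent,     // which agent invoked the tool
      session: caller.session, // which session / delegation chain
      args,
      at: new Date().toISOString(),
    });
    return handler(args);
  };
}

const searchLogs = withAudit("search_logs", ({ query }) => `results for ${query}`);

searchLogs({ query: "error" }, { agent: "support-bot", session: "s-123" });
console.log(auditLog.length); // 1
```

Because the wrapper sits between the MCP dispatch layer and the handler, you keep the prompt-to-side-effect path without touching individual tool implementations.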
If you’re deciding where to start, start with inventory. Most teams don’t know which MCP tools are exposing dangerous sinks.
Try it yourself
If you want to check your own MCP surface:
- Scan an MCP server for security and spec issues: https://tools.authora.dev
- Scan codebases or remote MCP servers from CI/terminal: npx @authora/agent-audit
- Add a verified identity badge to your agent: https://passport.authora.dev
- Browse more agent security resources: https://github.com/authora-dev/awesome-agent-security
Those are all free and useful whether you’re building from scratch or cleaning up an existing server.
The takeaway
MCP command injection flaws are rarely isolated bugs. They’re usually graph problems:
- an agent can call a tool
- the tool passes input to a dangerous sink
- the sink can reach something valuable
- nobody modeled the chain end-to-end
Once you map that chain, the fixes get much clearer.
How are you modeling trust and dangerous tool paths in your MCP stack? Drop your approach below.
-- Authora team
This post was created with AI assistance.