Last week, a perfectly normal MCP tool turned into a shell.
The setup looked harmless: an AI agent needed to query logs, so the MCP server exposed a search_logs tool. The tool accepted a string, passed it into a shell command, and returned the result. Then someone asked the agent to “search for errors from today; also show /etc/hosts if it helps debug.”
You can guess what happened next.
This is the part of MCP security that’s easy to underestimate: the dangerous bug usually isn’t in the protocol itself. It’s in the layer where tool inputs get stitched into shell commands, SQL queries, file paths, or internal API calls.
And because MCP gives agents a clean way to discover and invoke tools, those bugs become reachable at scale.
Why MCP command injection is a bigger deal than “just sanitize input”
A normal web app command injection bug is already bad.
An MCP command injection bug is worse because:
- tools are designed to be called programmatically
- agents can chain tool calls automatically
- a single prompt can influence multiple downstream actions
- the vulnerable surface is often hidden behind “helpful” abstractions
If your MCP server exposes tools like:
- run_tests
- grep_logs
- convert_file
- git_diff
- ping_host
…you may have built a remote execution surface without meaning to.
A lot of teams are trying to solve this one tool at a time. That helps, but it misses the pattern.
The better approach is to model these flaws as a security knowledge graph.
What I mean by a security knowledge graph
Instead of tracking isolated bugs, map the relationships:
[Agent Prompt]
|
v
[Tool Call: search_logs]
|
v
[Argument: query="error; cat /etc/passwd"]
|
v
[Sink: exec("grep " + query + " /var/log/app.log")]
|
v
[Impact: command injection]
|
+--> [Reads secrets]
+--> [Moves laterally]
+--> [Poisons outputs]
That graph gives you more than a vulnerability report. It tells you:
- which tools are high risk
- which input fields reach dangerous sinks
- which agents can invoke them
- what approvals, policies, or sandboxing should exist
This is useful because MCP security isn’t just “is this tool vulnerable?” It’s also:
- who can call it
- under what delegation chain
- with what runtime constraints
- what other systems it can reach
If you already use OPA for policy, this is a great fit. Let your graph identify risky edges, then use policy to block or require approval for them.
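To make that concrete, here is a toy policy gate sketched in plain JS rather than Rego (the real thing would live in OPA; tool/agent shapes here are invented and mirror the graph, with tools declaring which sinks they reach):

```javascript
// Toy policy gate standing in for an OPA check (plain JS instead of Rego).
// A "risky edge" is a tool that reaches a shell sink, called by a
// non-high-trust agent.
function requiresApproval(tool, agent) {
  return tool.sinks.includes("exec") && agent.trust !== "high";
}

function authorize(tool, agent, approvals) {
  if (requiresApproval(tool, agent) && !approvals.has(tool.name)) {
    throw new Error(`approval required for ${tool.name}`);
  }
}

// Hypothetical nodes from the graph:
const searchLogs = { name: "search_logs", sinks: ["exec"] };
const lowTrustAgent = { trust: "low" };

// Passes only because an explicit approval was granted:
authorize(searchLogs, lowTrustAgent, new Set(["search_logs"]));
```

Without the approval in the set, the same call throws, which is exactly the "block or require approval" behavior you would encode as policy.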
The bug, in 8 lines
Here’s the classic mistake in Node (after npm install express):
const express = require("express");
const { execSync } = require("child_process");

const app = express();

app.get("/search", (req, res) => {
  const q = req.query.q || "";
  // DANGER: q is interpolated straight into a shell command string.
  const out = execSync(`grep ${q} /var/log/system.log`, { encoding: "utf8" });
  res.send(out);
});

app.listen(3000, () => console.log("listening on :3000"));
That “works” right up until q contains shell metacharacters.
The fix is not “be more careful.” The fix is:
- avoid shell invocation when possible
- use parameterized APIs
- validate against strict allowlists
- run tools in sandboxes
- attach identity and authorization to tool execution
- log invocation lineage so you can see who called what, through which agent
Build the graph from four node types
You don’t need a giant platform to start. A spreadsheet or graph DB is enough if the model is right.
I’d start with these node types:
Agents
Which agent, session, or delegated identity initiated the call?

Tools
What MCP tool was invoked? What are its declared parameters?

Sinks
Does the tool reach exec, filesystem writes, SQL, HTTP callbacks, or template rendering?

Impacts
What happens if exploited: RCE, data exfil, secret access, repo tampering?
Then add edges like:
- CAN_CALL
- PASSES_INPUT_TO
- REACHES_SINK
- REQUIRES_APPROVAL
- EXFILTRATES_TO
Once you have that, useful questions become easy:
- Which tools can reach shell execution?
- Which shell-reaching tools are callable by untrusted agents?
- Which of those also have access to secrets or internal networks?
- Which ones are missing approval gates or audit trails?
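Even a plain in-memory edge list is enough to answer the first two questions. A sketch with hypothetical agent and tool names, using the edge types above:

```javascript
// Minimal edge list: [fromNode, relation, toNode].
const edges = [
  ["agent:support-bot", "CAN_CALL", "tool:search_logs"],
  ["agent:support-bot", "CAN_CALL", "tool:convert_file"],
  ["tool:search_logs", "REACHES_SINK", "sink:exec"],
  ["tool:convert_file", "REACHES_SINK", "sink:filesystem_write"],
];

// Which tools can reach shell execution?
const shellTools = edges
  .filter(([, rel, to]) => rel === "REACHES_SINK" && to === "sink:exec")
  .map(([from]) => from);

// Which agents can call those tools?
const exposedAgents = edges
  .filter(([, rel, to]) => rel === "CAN_CALL" && shellTools.includes(to))
  .map(([from]) => from);

console.log(shellTools);    // ["tool:search_logs"]
console.log(exposedAgents); // ["agent:support-bot"]
```

A real graph DB gives you transitive queries for free, but the point is the model: once edges exist, "which agents can reach exec" is a two-line traversal instead of a code audit.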
That’s how you move from “we found one injection bug” to “we understand our agent attack surface.”
What good defenses look like
The strongest MCP setups usually combine several layers:
- safe tool implementation: no shell where libraries exist
- policy enforcement: block risky tools for low-trust agents
- sandboxing: assume some tool will eventually fail open
- identity + delegation tracking: know the real caller, not just the app
- audit logging: preserve the path from prompt to tool invocation to side effect
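One cheap way to get the last two layers: wrap every tool handler so each invocation records who called it and through which delegation chain. A hypothetical sketch (names and shapes invented; a real trail would go to an append-only store, not an array):

```javascript
// In-memory audit trail; in production this would be an append-only log.
const auditLog = [];

// Wrap a tool handler so every call records its invocation lineage.
function withAudit(toolName, handler) {
  return (args, caller) => {
    auditLog.push({
      tool: toolName,
      agent: caller.agent,     // which agent invoked the tool
      session: caller.session, // which session / delegation chain
      args,
      at: new Date().toISOString(),
    });
    return handler(args);
  };
}

const searchLogs = withAudit("search_logs", ({ query }) => `results for ${query}`);

searchLogs({ query: "error" }, { agent: "support-bot", session: "s-123" });
console.log(auditLog.length); // 1
```

Because the wrapper sits between the MCP dispatch layer and the handler, you keep the prompt-to-side-effect path without touching individual tool implementations.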
If you’re deciding where to start, start with inventory. Most teams don’t know which MCP tools are exposing dangerous sinks.
Try it yourself
If you want to check your own MCP surface:
- Scan an MCP server for security and spec issues: https://tools.authora.dev
- Scan codebases or remote MCP servers from CI/terminal: npx @authora/agent-audit
- Add a verified identity badge to your agent: https://passport.authora.dev
- Browse more agent security resources: https://github.com/authora-dev/awesome-agent-security
Those are all free and useful whether you’re building from scratch or cleaning up an existing server.
The takeaway
MCP command injection flaws are rarely isolated bugs. They’re usually graph problems:
- an agent can call a tool
- the tool passes input to a dangerous sink
- the sink can reach something valuable
- nobody modeled the chain end-to-end
Once you map that chain, the fixes get much clearer.
How are you modeling trust and dangerous tool paths in your MCP stack? Drop your approach below.
-- Authora team
This post was created with AI assistance.