DEV Community

Ricardo Ferreira

The Seven Deadly Sins of MCP: Security Sins

This part of the series focuses on the security sins: Lust and Greed. They belong in this category because they answer the two questions that decide blast radius before anything else: what the model can reach, and how much authority it carries when it gets there.

If a model can touch something real, these are the first sins that matter. A shell command, a production write, a filesystem path, or a token with more scope than the task deserves can turn a clever demo into a security incident very quickly.

Lust and greed belong together because both are really about access boundaries. Lust is what happens when the model is given unsafe intimacy with sensitive systems or side effects; greed is what happens when it is given more authority than the task deserves. One is about dangerous surfaces, the other about excessive scope. In practice, they tend to show up together and get discovered together.

In MCP, that boundary is visible at the protocol surface itself: what capabilities are being advertised to the client, whether a tool is model-controlled, and what sits between the model and the side effect once the capability is exposed.

Lust

Lust is creating unsafe intimacy between the model and sensitive systems or side effects. This is the most dramatic sin and probably the easiest one for an audience to feel immediately. It is also the one most likely to get hand-waved as innovation. "This is where the magic happens," developers say, and it usually means the model is about to get direct access to something powerful.

How to spot it

  • A model can reach shell execution, database writes, or production mutations through a generic interface.
  • Untrusted content is flowing into tools that have real side effects.
  • Destructive actions do not require confirmation, review, or narrow intent.
  • A demo feels exciting precisely because it's a little dangerous.

Example

This is what happens when an assistant grows from read-only visibility into action. An engineer wants repo status during an incident, a quick diff before a deploy, or the latest commits after a rollback. Those are narrow, legitimate requests. But if the server exposes a generic command surface rather than those exact intents, the user who thought they were getting observability has actually been given leverage over a sensitive machine or workflow.

Before

server.tool("run_git", async ({ args }) => {
  // Free-form text from the model, interpolated straight into a shell command.
  return await exec(`git ${args}`);
});

Narrower, but still second-best

const ALLOWED_GIT_COMMANDS: Record<string, string[]> = {
  status: ["status", "--short"],
  diff: ["diff", "--stat"],
  recent_commits: ["log", "--oneline", "-n", "10"],
};

server.tool("run_git", async ({ command }) => {
  const argv = ALLOWED_GIT_COMMANDS[command];
  if (!argv) {
    throw new Error("command not allowed");
  }

  // execFile passes argv directly to the binary; no shell ever parses it.
  return await execFile("git", argv);
});

Better still: move from commands to intent

server.tool("get_repo_status", async () => {
  return await execFile("git", ["status", "--short"]);
});

server.tool("get_recent_commits", async () => {
  return await execFile("git", ["log", "--oneline", "-n", "10"]);
});

How to fix it

The first improvement is to stop passing raw command text around. If you cannot move to intent-specific tools immediately, at least map a small set of allowed intents to fixed argument arrays rather than accepting free-form command strings. But that is still a transitional state. The real fix is to narrow the relationship between the model and the system. Replace generic execution surfaces with task-specific tools whenever you can, and put confirmation or human approval in front of destructive actions. If untrusted content is flowing into something with side effects, sanitize it and constrain it before it gets anywhere near the dangerous part.
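One way to put confirmation in front of a destructive action is a two-step handshake: the first call stages the action and returns a token, and nothing executes until a second call presents that token back. A minimal sketch in Python (the tool names and the branch-delete scenario are hypothetical, not from the server above):

```python
import secrets

# Pending destructive actions, keyed by a one-time confirmation token.
_pending: dict[str, str] = {}

def request_delete_branch(branch: str) -> dict:
    """Stage the destructive action; nothing runs yet."""
    token = secrets.token_hex(8)
    _pending[token] = branch
    return {
        "status": "confirmation_required",
        "token": token,
        "summary": f"will delete branch {branch!r}",
    }

def confirm_delete_branch(token: str) -> dict:
    """Execute only when a previously issued token is presented."""
    branch = _pending.pop(token, None)
    if branch is None:
        raise ValueError("unknown or already-used confirmation token")
    # A real server would now run the equivalent of:
    #   execFile("git", ["branch", "-D", branch])
    return {"status": "deleted", "branch": branch}
```

The token forces the approval to travel back through the client, which is exactly where a human can be put in the loop; a single tool that deletes on the first call has no seam to gate.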

There is also a harder conclusion that teams sometimes avoid: some capabilities should not be exposed through MCP from the host at all. If the only safe version still depends on a highly privileged machine, broad local credentials, or an execution environment that is too dangerous to trust to model-mediated calls, the right answer may be to keep that workflow behind a human-operated CLI or move it into a narrower service boundary first.

High-risk tools also need runtime boundaries, not just good intentions. Run them in restricted environments with tight OS and network permissions, and threat-model prompt injection as a system design problem rather than just an LLM behavior problem. That shift matters because it changes where you place defenses: not only in prompts but also in the tool design, the execution environment, and the approval path.
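At the process level, those runtime boundaries can start as small as pinning the working directory, stripping the inherited environment down to a minimal PATH, and bounding execution time. A sketch with a generic binary (for the git examples above, the binary would be git and argv a fixed allowlist entry; a real deployment would layer OS-level sandboxing such as containers or read-only mounts on top):

```python
import subprocess

def run_constrained(binary: str, argv: list[str], workdir: str) -> str:
    """Run a fixed binary with a pinned cwd, minimal env, and a timeout."""
    result = subprocess.run(
        [binary, *argv],
        cwd=workdir,                    # never the caller's working directory
        env={"PATH": "/usr/bin:/bin"},  # no inherited secrets or tokens
        capture_output=True,
        text=True,
        timeout=10,                     # a hung command cannot pin the server
        check=True,                     # non-zero exit becomes an exception
    )
    return result.stdout
```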

Lessons from the trenches

This is exactly the kind of boundary failure shown in GHSA-3q26-f695-pp76, the command-injection advisory for @cyanheads/git-mcp-server, and GHSA-q66q-fx2p-7w4m, the filesystem symlink advisory for the official MCP servers repo.

Greed

Greed is granting broader access, authority, or scope than the task deserves. This one is everywhere in MCP systems because least privilege takes work, and admittedly, demos reward convenience. So teams start broad. One scope. One credential set. One tool that can read everything. The intention is temporary. The temporary choice then becomes the architecture.

How to spot it

  • A read-only workflow still asks for write-capable or user-level credentials.
  • A tool can reach far more files, tables, APIs, or repos than the task really needs.
  • Permissions are explained with "we'll narrow it later."
  • Security review gets uncomfortable long before the demo team does.

Example

This is how internal tools usually drift into overreach. A team starts with a narrow request: let the assistant read one repo's docs, inspect one support folder, or look up one customer's account details. The user and the reviewer both think they approved a bounded capability. The implementation ships with the process's full filesystem reach, or a credential that can see far more than the task requires.

Before

@server.tool("read_file")
async def read_file(path: str):
    # Any path the model supplies, with the process's full filesystem reach.
    return Path(path).read_text()

After

ROOT = Path("/srv/support-docs").resolve()

@server.tool("read_file")
async def read_file(path: str):
    if not path or not path.strip():
        raise ValueError("path is required")

    full_path = (ROOT / path).resolve()
    # Resolve ".." and symlinks before checking containment under ROOT.
    if ROOT not in full_path.parents and full_path != ROOT:
        raise ValueError("path is outside allowed directory")

    return full_path.read_text()
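On Python 3.9 and later, the containment check can also be written with Path.is_relative_to, which says the same thing more directly (ROOT here matches the example above):

```python
from pathlib import Path

ROOT = Path("/srv/support-docs").resolve()

def is_allowed(path: str) -> bool:
    # Resolve ".." and symlinks first, then ask whether the result is
    # still under ROOT (is_relative_to is also true for ROOT itself).
    full_path = (ROOT / path).resolve()
    return full_path.is_relative_to(ROOT)
```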

Better still: split by task, not by path

@server.tool("get_refund_policy")
async def get_refund_policy(plan: Literal["monthly", "annual"]):
    return (ROOT / "refunds" / f"{plan}.md").read_text()

@server.tool("get_password_reset_runbook")
async def get_password_reset_runbook():
    return (ROOT / "runbooks" / "password-reset.md").read_text()

How to fix it

The fix for greed begins with drawing a real boundary. Write down exactly which directories, APIs, databases, or repositories the model should be allowed to access, then shape the tools around that boundary rather than convenience. Read paths and write paths should not share the same credentials or the same code path if the task is only supposed to observe. The strongest version of that pattern is task-shaped access rather than open-ended browsing. A support assistant often does not need "read anything under support-docs." It needs "get the refund policy for this plan" or "fetch the password reset runbook."

From there, review scopes and tokens as you would IAM permissions or network access. In MCP, that review should cover the full capability surface, not just one handler: the tools a client can call, the resources it can browse, and any prompts that might route the model toward privileged actions. Add authorization tests, not just happy-path tool tests, and expect a little product friction along the way. Least privilege usually means more granular credentials, narrower tools, and sometimes a few more workflow steps. Still, that extra design work is what keeps temporary convenience from turning into permanent overreach.
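An authorization test asserts the unhappy path, not just the happy one. A self-contained sketch (the make_reader factory is hypothetical; its guard mirrors the "After" example earlier):

```python
import tempfile
from pathlib import Path

def make_reader(root: Path):
    """Build a read_file bound to one root directory."""
    root = root.resolve()

    def read_file(path: str) -> str:
        full_path = (root / path).resolve()
        if root not in full_path.parents and full_path != root:
            raise ValueError("path is outside allowed directory")
        return full_path.read_text()

    return read_file

# Happy path: a file inside the root is readable.
root = Path(tempfile.mkdtemp())
(root / "policy.md").write_text("refunds within 30 days")
read_file = make_reader(root)
assert read_file("policy.md") == "refunds within 30 days"

# Authorization path: traversal out of the root must be rejected.
try:
    read_file("../../etc/passwd")
    raise AssertionError("traversal should have been rejected")
except ValueError:
    pass
```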

This is also where teams get fooled by gateway comfort. A proxy can enforce authentication, rate limits, and logging around an MCP server, and those controls are valuable. But they do not redeem a backend capability that is already too broad. If the underlying API can read the whole tenant, mutate too much state, or blur read and write authority, wrapping it in a better edge does not change the sin. It only makes the sin easier to expose consistently.

Lessons from the trenches

The mcp-server-git advisory GHSA-5cgr-j3jf-jw3v clearly showed that git_init had arbitrary filesystem access. The mcp-reddit example in the MCP fault-taxonomy paper showed the same impulse in a different form: read-only operations requesting more credentials than necessary.

Why security sins are hard to fix

Security sins rarely stay contained within a single handler. Lust fixes often require app changes, platform controls, security review, and sometimes product changes to add confirmation or approval steps. Greed fixes usually spill into deployment and identity work: new service accounts, narrower filesystem mounts, tighter OAuth scope design, and review gates for high-risk tools.

That is why security cleanup so often feels slower than the original demo. You are not just rewriting a tool. You are redrawing the trust boundary around it.
