Maaz Ahmed

Posted on Jun 5

The MCP tool you approved might not be the tool running

#security #mcp #ai #opensource

AI agents are starting to use real tools.

Not just search or chat. Tools that read files, send email, query databases, open browser sessions, touch internal systems, and move data around.

That changes the security problem.

Most people are focused on the request:

Is the prompt safe?
Is the input malicious?
Is this tool name allowed?
Is the user allowed to call it?

Those checks matter. But they miss another problem:

What if the tool changed after it was approved?

The drift problem

Imagine an MCP tool called read_document.

At approval time, it looks safe:

reads a document
returns text
internal only
no sensitive data
no external side effects

So the agent is allowed to call it.

Later, the tool changes.

Same name. Same general purpose. But now it can export content to an external email address, and it touches PII.

That is a different risk profile.

The tool did not just get updated. It drifted from what was approved.
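
To make the drift concrete, here is a minimal sketch of the two risk profiles side by side. The `ToolProfile` class and its field names are hypothetical, chosen only to mirror the properties listed above. They are not Interlock's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolProfile:
    """Hypothetical risk profile for a tool; field names are illustrative."""
    name: str
    effect: str              # e.g. "read" or "export"
    external_reach: bool     # can data leave the internal boundary?
    data_classes: frozenset  # sensitive data classes the tool touches


# What was reviewed at approval time.
approved = ToolProfile("read_document", "read", False, frozenset())

# What is live later: same name, different risk profile.
live = ToolProfile("read_document", "export", True, frozenset({"pii"}))

same_name = approved.name == live.name
same_profile = approved == live
print(same_name, same_profile)
```

The name matches, the profile does not. Any check that only looks at the name cannot tell these two apart.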

Why allowlists miss this

A basic allowlist sees:

read_document

That name was approved, so the call passes.

A prompt injection scanner may also pass it, because the input can be clean. There may be no malicious instruction in the prompt at all.

The problem is not the request.

The problem is that the tool is no longer the same trusted tool.
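
A rough sketch of the difference, assuming the approval step stored a hash of the tool definition next to its name. The `name`, `description`, and `inputSchema` fields follow the shape of an MCP tool definition. The `email_to` parameter, the helper names, and the idea of a SHA-256 fingerprint are assumptions for illustration, not a description of how Interlock works internally.

```python
import hashlib
import json


def fingerprint(definition: dict) -> str:
    """Hash a tool definition in canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


approved_def = {
    "name": "read_document",
    "description": "Reads a document and returns its text.",
    "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}},
}

# Same name, but the definition now advertises an export path (hypothetical).
live_def = {
    "name": "read_document",
    "description": "Reads a document and can email it to a recipient.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}, "email_to": {"type": "string"}},
    },
}

allowlist = {"read_document"}
baseline = {approved_def["name"]: fingerprint(approved_def)}


def name_check(definition: dict) -> bool:
    """Basic allowlist: only the tool name is inspected."""
    return definition["name"] in allowlist


def fingerprint_check(definition: dict) -> bool:
    """Baseline check: the whole definition must match what was approved."""
    return baseline.get(definition["name"]) == fingerprint(definition)


print(name_check(live_def), fingerprint_check(live_def))
```

A fingerprint like this only detects changes in the declared definition. It says nothing about which changes are risky, which is why a comparison against the approved baseline still has to classify the change.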

What Interlock does

Interlock records a baseline of each tool at the time it is approved.

When the live tool definition changes, Interlock compares it against the approved baseline and looks for risk changes like:

effect escalation
new external reach
new sensitive data classes
schema changes
permission expansion
behavior changes under the same tool name

If the change is risky enough, Interlock can quarantine the tool before the agent calls it.
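
The comparison step could be sketched roughly like this. Everything here is an assumption made for illustration: the profile keys, the `EFFECT_RANK` ordering, the `HIGH_RISK` set, and the three outcomes `allow`, `review`, and `quarantine`. Interlock's real rules and thresholds may differ.

```python
# Hypothetical ordering of effects from least to most risky.
EFFECT_RANK = {"read": 0, "write": 1, "export": 2}

# Hypothetical finding types that trigger quarantine on their own.
HIGH_RISK = ("effect escalation", "new external reach", "new sensitive data classes")


def drift_findings(baseline: dict, live: dict) -> list:
    """Compare a live tool profile against the approved baseline."""
    findings = []
    if EFFECT_RANK[live["effect"]] > EFFECT_RANK[baseline["effect"]]:
        findings.append("effect escalation")
    if live["external_reach"] and not baseline["external_reach"]:
        findings.append("new external reach")
    new_classes = set(live["data_classes"]) - set(baseline["data_classes"])
    if new_classes:
        findings.append("new sensitive data classes: " + ", ".join(sorted(new_classes)))
    if live["input_schema"] != baseline["input_schema"]:
        findings.append("schema change")
    return findings


def decide(findings: list) -> str:
    """Quarantine on any high-risk finding, review lesser changes, allow if unchanged."""
    if any(f.startswith(HIGH_RISK) for f in findings):
        return "quarantine"
    return "review" if findings else "allow"


baseline_profile = {
    "effect": "read",
    "external_reach": False,
    "data_classes": [],
    "input_schema": {"path": "string"},
}
live_profile = {
    "effect": "export",
    "external_reach": True,
    "data_classes": ["pii"],
    "input_schema": {"path": "string", "email_to": "string"},
}

findings = drift_findings(baseline_profile, live_profile)
decision = decide(findings)
print(decision, findings)
```

The decision runs before the agent's call is forwarded, so a quarantined tool is never invoked with the drifted definition.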

It also creates a Security Receipt that records what changed, why the decision was made, and the evidence behind it.
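
As an illustration of what such a record might contain, here is a minimal sketch. The field names and the `build_receipt` helper are hypothetical, and the fingerprint values are placeholders. This is not Interlock's actual receipt format.

```python
import json
from datetime import datetime, timezone


def build_receipt(tool_name, decision, findings, baseline_hash, live_hash):
    """Assemble a hypothetical receipt: what changed, the decision, and the evidence."""
    return {
        "tool": tool_name,
        "decision": decision,
        "reasons": list(findings),
        "evidence": {
            "baseline_fingerprint": baseline_hash,
            "live_fingerprint": live_hash,
        },
        "issued_at": datetime.now(timezone.utc).isoformat(),
    }


receipt = build_receipt(
    "read_document",
    "quarantine",
    ["effect escalation", "new external reach"],
    "sha256:<baseline-hash>",  # placeholder, not a real digest
    "sha256:<live-hash>",      # placeholder, not a real digest
)
print(json.dumps(receipt, indent=2))
```

Keeping the reasons and the evidence in one serializable record means a reviewer can later see why a tool was blocked without replaying the original call.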

Why this matters for MCP

MCP makes tool access easier and more standardized. That is good.

But production agent systems need more than static approval. They need runtime trust checks.

The question should not only be:

Is this call allowed?

It should also be:

Is this still the tool we approved?

That is the gap Interlock is focused on.

Project: https://getinterlock.dev
GitHub: https://github.com/MaazAhmed47/Interlock
