It usually starts with something that feels harmless.
You give an AI agent access to a few tools. Maybe it can read internal tickets, check a database, and send Slack messages. You wire things up, test a few flows, and everything works.
Then someone asks a simple question:
“What stops this agent from doing something it shouldn’t?”
That’s where things get uncomfortable.
The “Lethal Trifecta” (Why This Gets Risky Fast)
There’s a concept from recent security research that’s been getting a lot of attention.
It’s sometimes called the “lethal trifecta.”
An AI agent becomes dangerous when it combines three capabilities:
- Access to private data
- Exposure to untrusted input
- Ability to take external actions
Each of these is fine on its own.
Together, they’re a problem.
Imagine this:
Your agent reads internal support tickets.
It also processes external content, like GitHub issues.
And it can send messages to Slack.
Now someone posts a malicious prompt inside a public GitHub issue.
The agent reads it, follows the instructions, and sends sensitive internal data to an external channel.
No exploit. No broken auth. Just… the system doing exactly what it was allowed to do.
This isn’t theoretical; recent security research has already demonstrated variations of this in real systems.
Where MCP Fits (and Where It Doesn’t)
To be fair, the Model Context Protocol (MCP) solves a real problem.
It standardizes how agents talk to tools.
Instead of building custom integrations for every system, you get a consistent interface. That’s a big win for developer productivity.
But MCP was never meant to be a security framework.
It’s a protocol, not a control plane.
And that distinction matters a lot in production.
This is the part most teams miss: MCP standardizes communication, but the gateway layer is what actually enforces governance and security.
What MCP Deliberately Doesn’t Handle
Once you start looking closely, the gaps become obvious.
MCP defines how communication happens. It doesn’t define what should be allowed.
Here’s what it leaves to you:
No built-in authentication
There’s no default mechanism enforcing identity between agents and tools. You’re responsible for implementing and managing that layer yourself.
No access control model
By default, any agent can discover and call any registered tool. There’s no concept of scoped visibility unless you build it.
No observability
Direct MCP connections give you very little insight into what’s actually happening. You don’t get a clear trace of agent behavior across tools.
No guardrails
Tools execute with whatever permissions they have. MCP doesn’t inspect inputs or outputs for risky behavior.
None of this is a flaw. It’s a design choice.
But it means MCP alone is not enough once you move beyond demos.
The Real Threat Model for Agent Systems
Agent systems introduce risks that don’t exist in traditional APIs.
If you treat them the same way, you miss what actually matters.
1. Prompt injection via tool responses
This one catches teams off guard.
You secure your prompts. You validate inputs. Everything looks fine.
But the attack comes from the tool output.
A Jira ticket. A web page. A GitHub issue.
If that content contains instructions, the agent may follow them as if they were part of the original task.
That’s how data gets exfiltrated without breaking any rules.
2. Tool permission creep
This usually starts with good intentions.
“Let’s just give the agent access to everything it might need.”
A few weeks later, it has access to 40 or 50 tools.
Most of them aren’t used.
But every unused tool increases your blast radius.
You don’t get breached because of what you use.
You get breached because of what you forgot was there.
3. The sequence problem
Two actions can be safe individually and dangerous together.
- Read internal data → safe
- Send data externally → safe
Combine them:
- Read internal data → send externally → not safe
Traditional systems struggle with this because they evaluate actions in isolation.
Agent systems execute sequences. That’s where the risk lives.
4. Shadow MCP servers
This one is more of an organizational issue.
Developers spin up their own MCP servers to move faster.
No review. No governance. No centralized visibility.
Now you have tools in your system that your security team doesn’t even know exist.
And agents can talk to them.
This is exactly where a gateway layer becomes necessary.
MCP defines how tools are called.
A gateway defines what is allowed, monitored, and enforced.
Without that layer, you’re relying on application logic for security, and that doesn’t scale.
What a Production-Ready Security Model Looks Like
Once you accept that MCP doesn’t handle security, the next question is:
What does a secure setup actually look like?
At a high level, you need a layer that enforces control, visibility, and policy across every tool interaction.
Let’s break down the key controls.
Least-privilege tool access
Agents shouldn’t discover tools and then get blocked.
They shouldn’t see tools they’re not allowed to use in the first place.
This is a subtle but important difference.
In practice, this means each agent interacts with a filtered view of the tool registry.
This is exactly how TrueFoundry implements least-privilege access in production, using Virtual MCP Servers to control what each agent can even see.
In production, secure agent systems usually expose a filtered tool registry instead of giving agents global visibility into every MCP server.
Per-agent RBAC
Not all agents are equal.
A compliance agent and a customer support agent should operate in completely different scopes.
That separation should be enforced at the infrastructure level, not buried inside application logic.
Otherwise, it becomes fragile and hard to audit.
In mature deployments, security policies are enforced centrally instead of being scattered across application code.
Guardrails on both paths
Most teams think about validating inputs.
Fewer think about validating outputs.
You need both.
- Inspect inputs before they reach a tool (to prevent prompt injection)
- Inspect outputs before they reach the agent (to prevent data exfiltration)
This creates a controlled boundary around every tool call.
Human-in-the-loop gates
Some actions shouldn’t be fully automated.
Deleting data. Sending external communications. Triggering financial operations.
For these, you need approval steps.
A secure system doesn’t assume agents are always right. It gives humans the ability to intervene when it matters.
Immutable audit trails
When something goes wrong, you need answers.
Not guesses.
You need to know:
- Which agent made the call
- Which tool it used
- What parameters were passed
- What the tool returned
- What happened next
Without this, debugging becomes impossible and compliance becomes a nightmare.
Deployment: Where Does Your Data Actually Go?
This is the part that security teams care about immediately.
In many setups, requests flow through third-party infrastructure.
That means your data leaves your environment.
For some teams, that’s acceptable.
For many enterprises, it isn’t.
A different approach is to run everything inside your own infrastructure.
Platforms like TrueFoundry support deployment in your VPC, on-prem, or even air-gapped environments, so data never leaves your domain.
In practice, this translates into infrastructure that’s already running at a serious scale.
TrueFoundry is recognized in the 2026 Gartner® Market Guide for AI Gateways and handles production-scale workloads, processing 10B+ requests per month while maintaining 350+ RPS on a single vCPU with sub-3ms latency.
It’s compliant with SOC 2, HIPAA, GDPR, ITAR, and the EU AI Act and is trusted by enterprises including Siemens Healthineers, NVIDIA, Resmed, and Automation Anywhere.
A Practical Security Checklist (Before You Ship)
If you’re moving agents to production, this is the checklist I’d actually use:
- [ ] Are all tool interactions going through a centralized MCP gateway?
- [ ] Does each agent only see the tools it’s allowed to use?
- [ ] Are tool inputs and outputs inspected for risky behavior?
- [ ] Do high-risk actions require human approval?
- [ ] Can you trace every agent action end-to-end?
- [ ] Is everything running inside your own infrastructure (not a third-party SaaS)?
If you answer “no” to more than one of these, you’re not production-ready yet.
The Real Takeaway
MCP is a solid foundation.
It makes tool integration cleaner, faster, and more consistent.
But it doesn’t make your system secure.
Security comes from the layer that controls everything around that interaction.
That’s the difference most teams miss.
They adopt MCP, see things working, and assume they’re done.
In reality, they’ve only solved the communication problem, not the control problem.
MCP standardizes communication.
The gateway standardizes control.
Final Thoughts
AI agents change how systems behave.
They don’t just respond to requests. They take actions, make decisions, and interact with multiple systems in sequence.
That’s powerful.
But it also means the risk model is different.
If you treat agents like simple APIs, you’ll miss the failure modes that actually matter.
The teams that get this right don’t just add tools; they add structure around how those tools are used.
If you’re starting to think seriously about security, that’s a good sign. It usually means your system is moving from demo to something real.
If you want to explore what a unified control plane for models, tools, and agents looks like in practice, you can try TrueFoundry free, no credit card required, and deploy it in your own cloud in under 10 minutes.
| Thanks for reading! 🙏🏻 I hope you found this useful ✅ Please react and follow for more 😍 Made with 💙 by Hadil Ben Abdallah |
|
|---|





Top comments (0)