DEV Community

Wassim Chegham
How MCP Turns Your Messy Agents Into Governed Systems

Imagine hiring a contractor and giving them your house keys, your credit card, and zero instructions. No scope of work. No limits on spending. No list of what they're allowed to touch.

That's what most agents do with their tools.

In post 1, we looked at how AI agents fail. In post 2, we fixed the knowledge problem with Agentic RAG. But there's another class of failure we haven't addressed yet: what happens when the agent does things in the real world (books flights, queries APIs, charges credit cards) with no structure, no boundaries, and no accountability.

Let's fix that.

The Problem: Agents With Unchecked Power

Our travel-planning agent (4-day hiking trip, budget-friendly, one fancy dinner) doesn't just think. It acts. It calls a flight search API. It hits a weather service. It queries a restaurant database. It might even book something.

And without structure, here's what actually happens:

Wrong tools, wrong parameters, no guardrails. The agent picks whatever tool seems vaguely right, passes whatever parameters it hallucinated, and hopes for the best. Maybe it calls the flight search API with a date in the wrong format. Maybe it sends your user's email to a weather endpoint that doesn't need it. Maybe it retries a booking call five times because it didn't get the response shape it expected and now you've booked five flights.

Tool calls are distributed system calls. This is the part people forget. Every tool invocation is a network call to an external system. They fail. They time out. They return partial results. They have rate limits. Treat them like simple function calls and you'll learn this the hard way.
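Treating a tool call like a network call means bounded retries, backoff, and a hard failure at the end instead of an infinite loop that books five flights. Here's a minimal sketch of that discipline; the `flaky_flight_search` tool and its arguments are hypothetical stand-ins for a real API client:

```python
import random
import time

class ToolCallError(Exception):
    """Raised when a tool call fails after exhausting its retries."""

def call_tool_with_retries(tool_fn, args, max_attempts=3, base_delay=0.01):
    """Invoke a tool like a distributed-system call: bounded retries, backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return tool_fn(**args)
        except TimeoutError:
            if attempt == max_attempts:
                raise ToolCallError(f"tool failed after {max_attempts} attempts")
            # Exponential backoff with jitter before the next attempt.
            time.sleep(base_delay * (2 ** attempt) * random.random())

# A flaky tool that times out twice, then succeeds on the third call.
calls = {"n": 0}
def flaky_flight_search(origin, destination):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError
    return {"flights": [f"{origin}->{destination}"]}

result = call_tool_with_retries(flaky_flight_search,
                                {"origin": "SEA", "destination": "DEN"})
```

Note the retry budget is explicit: after `max_attempts` the call fails loudly instead of silently hammering the downstream service. Retrying a non-idempotent call (like a booking) also needs a dedupe key, which this sketch leaves out.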

No separation between reasoning and execution. When the agent's "thinking" and the tool's "doing" live in the same process, everything is tangled. You can't test them independently. You can't scale them independently. You can't have different teams own them. And you definitely can't enforce security at the boundary, because there is no boundary.

When tools aren't isolated, you can't govern them. Who called what? How many times? With what permissions? If every tool is just a function the agent can invoke freely, you have zero visibility and zero control.

The Solution: MCP (Model Context Protocol)

MCP (the Model Context Protocol) is a protocol layer that enables safe, consistent, and structured interactions between agents and external systems. Think of it as a middleman between the agent's reasoning and the actual tools: a translator that enforces rules.

The agent doesn't call tools directly. It sends a structured request through MCP, the MCP server executes the tool, and the result comes back through the same channel. Clean, auditable, controlled.
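To make "structured request" concrete: MCP frames messages as JSON-RPC 2.0, and tool invocations use the `tools/call` method with the tool name and arguments in `params`. The `search_flights` tool and its arguments below are hypothetical; only the envelope shape comes from the protocol:

```python
import json

# A tool invocation as it travels over the wire. The envelope is JSON-RPC 2.0;
# "tools/call" is MCP's method for invoking a tool on the server.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_flights",  # hypothetical tool name
        "arguments": {
            "origin": "SEA",
            "destination": "DEN",
            "departure": "2025-03-15",
        },
    },
}

wire_message = json.dumps(request)
```

Because every call is a message with a declared name and arguments, the server can validate, authorize, and log it before anything executes.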

Here's what the architecture looks like:

MCP architecture: agent, protocol, and tools

The agent lives on one side. The tools live on the other. MCP is the contract in between.

Separation of Concerns

This is the key principle. With MCP:

  • Tool implementations run on separate MCP server processes. They're independent services with their own lifecycle, their own deployment, their own team.
  • Agent reasoning runs on the client side, in your application. The LLM thinks, plans, and decides what to do next.
  • The agent doesn't need to know HOW the tool works, just the interface. What inputs does it accept? What outputs does it return? That's the contract.

The agent sends a request: "Search flights from Seattle to Denver, departing March 15, returning March 19, budget class." The MCP server handles the rest: which API to call, how to authenticate, how to handle retries, how to format the response. The agent gets back a clean, typed result.
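The contract side of that exchange can be sketched in a few lines. This is hand-rolled validation for illustration (a real MCP server validates against JSON Schema), and the flight-search fields are hypothetical:

```python
# Hypothetical contract for a flight-search tool: field name -> expected type.
FLIGHT_SEARCH_SCHEMA = {
    "origin": str,
    "destination": str,
    "depart": str,
    "return_date": str,
    "cabin": str,
}

def validate(args, schema):
    """Reject unknown fields, wrong types, and missing fields up front."""
    errors = []
    for key, value in args.items():
        if key not in schema:
            errors.append(f"unknown field: {key}")
        elif not isinstance(value, schema[key]):
            errors.append(f"{key}: expected {schema[key].__name__}")
    for key in schema:
        if key not in args:
            errors.append(f"missing field: {key}")
    return errors

good = {"origin": "SEA", "destination": "DEN", "depart": "2025-03-15",
        "return_date": "2025-03-19", "cabin": "economy"}
bad = {"origin": "SEA", "email": "user@example.com"}

good_errors = validate(good, FLIGHT_SEARCH_SCHEMA)  # empty list
bad_errors = validate(bad, FLIGHT_SEARCH_SCHEMA)    # unknown + missing fields
```

The point: the hallucinated `email` field from earlier never reaches the weather endpoint, because the contract rejects it at the boundary.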

This is the same principle that made microservices work: define the interface, hide the implementation. Except now we're applying it to AI agents.

Modularity and Scalability

Because tools are decoupled from agent logic, you get real modularity:

  • Different teams own different tools. The flight booking team maintains the flight MCP server. The weather team maintains the weather MCP server. They develop, test, and deploy independently.
  • Plug-and-play tools. As long as a tool adheres to the MCP interface (input schemas and output schemas), you can swap it in. Want to switch from one weather provider to another? Change the MCP server implementation. The agent never knows the difference.
  • Scale independently. If your flight search tool is getting hammered during holiday season, scale up that MCP server. The weather server can stay small. You're not scaling a monolith, you're scaling the piece that needs it.
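The plug-and-play point is just interface-based design. A minimal sketch, with two hypothetical weather providers behind one contract:

```python
from typing import Protocol

class WeatherTool(Protocol):
    """The contract the agent sees. Implementations are interchangeable."""
    def forecast(self, city: str, date: str) -> dict: ...

class ProviderA:
    def forecast(self, city, date):
        return {"city": city, "date": date, "source": "provider-a"}

class ProviderB:
    def forecast(self, city, date):
        return {"city": city, "date": date, "source": "provider-b"}

def plan_hike(weather: WeatherTool):
    # Agent-side code depends only on the interface, never the provider.
    return weather.forecast("Denver", "2025-03-16")

forecast_a = plan_hike(ProviderA())
forecast_b = plan_hike(ProviderB())
```

Swapping providers changes one line at the composition root; `plan_hike` is untouched. That's the whole trick.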

Independent scaling of MCP servers

This is microservices architecture applied to agent tooling. And if you've already invested in container orchestration (Kubernetes, AKS), deploying MCP servers fits right into that world.

Security: Never Trust the Agent. Trust the Boundaries

Here's where it gets serious. When agents can call APIs, access data, or trigger side effects, they become part of your attack surface. And LLMs are not security primitives. They're probabilistic text generators that can be manipulated through prompt injection, confused by adversarial inputs, or simply make mistakes.

So you don't trust the agent. You trust the boundaries.

With MCP, security lives outside the model:

  • Tools run on servers you control.
  • Access is gated by identity, not by whether the agent "decided" it should have access.
  • Permissions are enforced before the agent acts, not after.

In our trip scenario, this means: the agent can suggest a flight booking. But only a tool running on an MCP server, authenticated with the right identity and permissions, can execute it. The agent proposes. The system disposes.

MCP security boundary between reasoning and execution

The rules are the same as everywhere else in security:

  • Least privilege. The agent gets access to exactly the tools it needs, nothing more.
  • Explicit contracts. Every tool interaction has a defined schema. No freestyle API calls.
  • Audited execution. Every tool call goes through MCP, so every call is logged.
  • No tool access through prompts. A clever prompt can't grant new capabilities. No hidden credentials baked into the context. No "the model decided to try something."
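Least privilege as code looks something like this sketch: each identity carries an explicit tool allowlist, checked at the boundary before anything runs. The identity and tool names are hypothetical:

```python
# Per-identity tool allowlists, enforced at the MCP boundary.
PERMISSIONS = {
    "trip-planner-agent": {"search_flights", "get_weather", "find_restaurants"},
    # "book_flight" is deliberately absent: booking requires a different
    # identity with elevated permissions, regardless of what the model asks for.
}

class PermissionDenied(Exception):
    pass

def authorize(identity: str, tool: str) -> None:
    """Check the allowlist BEFORE execution. A prompt can't change this dict."""
    allowed = PERMISSIONS.get(identity, set())
    if tool not in allowed:
        raise PermissionDenied(f"{identity} may not call {tool}")

authorize("trip-planner-agent", "get_weather")  # allowed: passes silently

try:
    authorize("trip-planner-agent", "book_flight")
    booking_granted = True
except PermissionDenied:
    booking_granted = False
```

No matter how persuasive the prompt injection, the agent cannot talk its way into `book_flight`: the decision lives in infrastructure, not in the model's context window.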

For identity and permissions management at scale, use something like Azure AI Gateway, which sits in front of your MCP servers and handles authentication, authorization, and rate limiting centrally.

Enterprise Governance

For organizations running agents in production, MCP gives you something critical: a single choke point for enforcement.

All tool interactions go through MCP. That means you can:

  • Throttle to prevent tool overload. If the agent starts hammering an API, MCP can rate-limit it before the downstream service even notices.
  • Enforce organizational policies centrally. Data privacy rules? Usage limits? Compliance requirements? They live at the MCP layer, not scattered across every tool implementation.
  • Monitor everything. Every tool call, every response, every failure, captured in one place.
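The throttling point is a classic token bucket at the MCP layer. A minimal sketch (refill disabled here so the demo is deterministic):

```python
import time

class TokenBucket:
    """Throttle tool calls before the downstream API ever sees the load."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, refill_per_sec=0.0)  # no refill for the demo
results = [bucket.allow() for _ in range(5)]
# The first three calls pass; the rest are throttled.
```

In production you'd keep one bucket per tool (or per identity per tool) inside the MCP layer, so a runaway agent gets rate-limited without the downstream service noticing anything.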

This is the difference between "we have an agent in production" and "we have an agent in production that we can actually govern."

Cost Control

Here's a practical one. Because each tool runs as its own MCP server, you can independently monitor and limit costs:

  • Track token usage per tool. Your flight search might consume 10x more tokens than your weather lookup. Now you can see that.
  • Monitor tool calls and retries. If a tool is failing and retrying repeatedly, you'll know and you can set limits.
  • Measure total execution time per tool, per request, per user. Budget accordingly.
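The accounting side needs nothing fancy; a per-tool meter at the MCP layer is enough to turn the guessing game into numbers. Tool names and figures below are made up for illustration:

```python
from collections import defaultdict

class ToolMeter:
    """Per-tool accounting: calls, retries, tokens, and wall-clock time."""

    def __init__(self):
        self.stats = defaultdict(lambda: {"calls": 0, "retries": 0,
                                          "tokens": 0, "seconds": 0.0})

    def record(self, tool, tokens, seconds, retries=0):
        s = self.stats[tool]
        s["calls"] += 1
        s["retries"] += retries
        s["tokens"] += tokens
        s["seconds"] += seconds

meter = ToolMeter()
meter.record("search_flights", tokens=1200, seconds=2.4)
meter.record("search_flights", tokens=1100, seconds=1.9, retries=1)
meter.record("get_weather", tokens=120, seconds=0.3)

# Now the 10x cost gap between tools is visible, per tool, not per agent.
flight_tokens = meter.stats["search_flights"]["tokens"]
weather_tokens = meter.stats["get_weather"]["tokens"]
```

Because every call already flows through MCP, recording these numbers is one hook at the boundary rather than instrumentation scattered through every tool.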

When everything runs through a single agent process, cost attribution is a guessing game. With MCP, it's accounting.

What You Should Do Next

If you're building agents that interact with external systems, and most useful agents do, here's where to start:

  1. Separate your agent's reasoning from tool execution. If your tools are functions inside your agent code, pull them out. Put them behind an MCP interface.
  2. Define explicit contracts for every tool. Input schemas, output schemas, error types. If it's not in the contract, it doesn't exist.
  3. Enforce security at the MCP boundary, not in the model. Identity, permissions, and audit logging belong in the infrastructure layer.
  4. Deploy MCP servers as independent services. Use containers, use Kubernetes, use whatever your team already knows. The point is independent lifecycle and scaling.
  5. Centralize governance through MCP. Rate limiting, policy enforcement, cost tracking: one layer, one place to look.
  6. Never let the agent self-authorize. The agent suggests. The system (with proper identity and permissions) executes.

What's Next?

MCP gives us safe, structured tool access. But how do we organize the agents themselves? When you have multi-step workflows (plan a trip, book flights, find restaurants, build an itinerary), how do you coordinate all of that without it collapsing into chaos?

In the next post, we'll cover four design patterns, ones you can start using today, that make multi-step agents actually reliable.

How are you managing tool access and governance in your agent systems? Share your thoughts in the comments below!
