As AI engineers, we often start by giving a massive LLM (like GPT-4) a giant prompt and a long list of Python functions (tools) it can call. This monolithic approach works for simple scripts, but it quickly becomes expensive, slow, and a security risk when the LLM has direct access to sensitive resources like email passwords or file systems.
To move beyond prototypes, we need to separate the "thinking" from the "doing." In this chapter of my open‑source tutorial, we rebuild an agentic assistant using two powerful technologies:
- LangGraph – for routing logic across specialized nodes.
- Model Context Protocol (MCP) – for secure, decoupled tool execution.
The result is a production‑ready, decoupled AI system that is safer, faster, and more reusable.
👉 GitHub Repository: Agentic-AI-Tutorial
🏗️ Architecture: Brain vs. Hands
Instead of a single monolithic agent, our application consists of two distinct parts communicating over Server‑Sent Events (SSE).
1. The Hands: FastMCP Server
All physical tools (math logic, Python SMTP/IMAP email operations) live in a standalone ASGI server running on port 8000. Using the Model Context Protocol (an open standard), we expose these functions securely.
```python
# McpServer/tools/weather.py
from server import mcp

@mcp.tool()
def get_temperature(city: str) -> str:
    """Get the current temperature for a given city."""
    return f"The weather in {city} is 72°F."
```
Thanks to MCP, the server automatically reads the docstrings and type hints and generates a standardized JSON schema for each tool. Crucially, the LangGraph agent never holds your passwords and never executes this code directly; it only sends requests via the protocol.
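To make the "automatic schema" claim concrete, here is a minimal, dependency-free sketch of what that kind of introspection looks like. Note that `tool_schema` is a hypothetical helper written for illustration, not part of the MCP SDK, which does this (and much more) for you:

```python
import inspect
from typing import get_type_hints

def tool_schema(fn):
    """Build a minimal JSON-schema-like description from a function's
    signature and docstring, mimicking what MCP generates automatically.
    (Illustrative only -- the real MCP schema is richer.)"""
    hints = get_type_hints(fn)
    params = {
        name: {"type": hints.get(name, str).__name__}
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": params,
    }

def get_temperature(city: str) -> str:
    """Get the current temperature for a given city."""
    return f"The weather in {city} is 72°F."

schema = tool_schema(get_temperature)
```

Because the schema is derived from the function itself, the docstring and type hints are the single source of truth: any MCP client can discover what the tool does and how to call it without seeing its implementation.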
2. The Brain: LangGraph Orchestrator
On the other side, we build a StateGraph with distinct nodes that handle user requests efficiently.
- router.py (The Fast Path): Uses a fast, cheap LLM to classify the user's intent as math, email, or conversation.
- execute.py (The Heavy Lifter): If the router chooses a tool-based intent, this node takes over. It uses LangChain's create_tool_calling_agent and dynamically binds to the remote MCP server.
- summarize.py (The Formatter): Takes the raw JSON output from the MCP server and uses an LLM to synthesize a polite, conversational response for the user.
- conversation.py (The Chitchat Fallback): If the user just says "Hello," we skip all heavy execution. This node feeds the conversation history directly to the LLM, saving tokens and time.
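The flow through these nodes can be sketched as plain functions over a shared state dict. This is a deliberately simplified, dependency-free stand-in for the real LangGraph StateGraph: the node names mirror the repo, but the keyword-based router and canned replies are illustrative only:

```python
# Toy stand-in for the LangGraph StateGraph: each node reads and
# updates a shared state dict, and the router's output decides
# which node runs next. (Real nodes call LLMs and the MCP server.)

def router(state):
    # The real router uses a fast, cheap LLM; keywords stand in here.
    text = state["user_input"].lower()
    if any(op in text for op in ("+", "-", "*", "/")):
        state["intent"] = "math"
    elif "email" in text:
        state["intent"] = "email"
    else:
        state["intent"] = "conversation"
    return state

def execute(state):
    # The real node binds tools from the remote MCP server.
    state["result"] = f"called MCP tool for intent '{state['intent']}'"
    return state

def summarize(state):
    # The real node asks an LLM to phrase the raw result politely.
    state["reply"] = f"Done! ({state['result']})"
    return state

def conversation(state):
    # Chitchat fallback: skip tool execution entirely.
    state["reply"] = "Hello! How can I help?"
    return state

def run(user_input):
    state = router({"user_input": user_input})
    if state["intent"] == "conversation":
        return conversation(state)["reply"]
    return summarize(execute(state))["reply"]
```

The payoff of this shape is that chitchat takes the short path (router → conversation) while tool requests take the long one (router → execute → summarize), which is exactly what saves tokens and latency in the real graph.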
🚀 Why This Matters
Security & Isolation
The MCP server acts as a secure boundary. You can host the LangGraph agent in the cloud while running the MCP server on your local corporate intranet to access private databases – all without exposing credentials.
Reusability
Once you build an MCP server (like our Mail/Math server), you can plug it into any MCP‑compatible client. The same tools work in LangChain, Claude Desktop, Cursor, or your own custom UIs – no rewriting needed.
Run It 100% Offline with Ollama
Don’t want to pay for an OpenAI API key? Because LangChain and MCP standardize the model and tool interfaces, swapping LLMs is trivial. You can run the entire workflow locally, for free, using Ollama!
Simply update the agent nodes (e.g., router.py, execute.py) from:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")
```
to:
```python
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=0)
```
Because LangChain provides a unified chat-model interface, the node routing and tool calling keep working seamlessly with your local Llama 3.1 model.
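Why is the swap a two-line change? Because the nodes only ever call a shared method on the model object, never a provider-specific API. Here is a toy illustration of that principle (FakeOpenAI and FakeOllama are illustrative stand-ins, not real LangChain classes):

```python
# Two "backends" that satisfy the same minimal interface: a single
# invoke() method. Nodes written against this interface work with either.

class FakeOpenAI:
    def invoke(self, prompt: str) -> str:
        return f"[openai] {prompt}"

class FakeOllama:
    def invoke(self, prompt: str) -> str:
        return f"[ollama] {prompt}"

def router_node(llm, user_input: str) -> str:
    # The node only calls .invoke(); it never cares which backend it got.
    return llm.invoke(f"Classify intent: {user_input}")
```

LangChain's real ChatOpenAI and ChatOllama classes are far richer, but they share the same runnable interface, which is what makes the one-import swap possible.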
💻 Try It Yourself!
All code, diagrams, and step‑by‑step instructions are available in Chapter 5 of my open‑source tutorial repository:
Clone it, spin up the FastMCP server, and watch the LangGraph nodes gracefully orchestrate tool execution!
📸 Demo: From User Request to Delivered Email
Here’s a real example of the agent in action, showing the entire flow from a user asking to send an email to the confirmation that it actually arrived.
1. The User Request & Agent's Execution
In the screenshot below, you can see the user submitting a request: "send email to Gamal saying hello from the agent". The LangGraph router correctly classifies this as an email intent, and the execute node invokes the remote MCP server's email tool. The agent then reports back that the email was sent successfully.
2. Proof of Execution
To verify that the MCP server actually performed the task—and that the agent wasn't just "hallucinating" success—we can check the real destination. The following screenshot shows the actual "Gamal" inbox with the email delivered exactly as requested. This confirms the secure, end-to-end functionality of the decoupled architecture.
This walkthrough visually confirms that the brain (LangGraph agent) correctly interprets the request and delegates to the hands (MCP server), which securely performs the action without exposing any credentials or internal logic to the agent itself.
🧠 What’s Next?
What tools are you planning to build for your MCP servers? Let me know in the comments below – I’d love to hear about your ideas and see what you create!
Follow me for more tutorials on production‑ready AI systems.