zkaria gamal
Building a Production-Ready Agentic AI System with LangGraph and MCP

As AI engineers, we often start by giving a massive LLM (like GPT-4) a giant prompt and a long list of Python functions (tools) it can call. This monolithic approach works for simple scripts, but it quickly becomes expensive, slow, and a security risk when the LLM has direct access to sensitive resources like email passwords or file systems.

To move beyond prototypes, we need to separate the "thinking" from the "doing." In this chapter of my open‑source tutorial, we rebuild an agentic assistant using two powerful technologies:

  • LangGraph – for routing logic across specialized nodes.
  • Model Context Protocol (MCP) – for secure, decoupled tool execution.

The result is a production‑ready, decoupled AI system that is safer, faster, and more reusable.

👉 GitHub Repository: Agentic-AI-Tutorial


🏗️ Architecture: Brain vs. Hands

Instead of a single monolithic agent, our application consists of two distinct parts communicating over Server‑Sent Events (SSE).

1. The Hands: FastMCP Server

All concrete tools (math functions, SMTP/IMAP email operations in Python) live in a standalone ASGI server running on port 8000. Using the Model Context Protocol (an open standard), we expose these functions securely.

```python
# McpServer/tools/weather.py
from server import mcp

@mcp.tool()
def get_temperature(city: str) -> str:
    """Get the current temperature for a given city."""
    # Stubbed response; a real implementation would call a weather API.
    return f"The weather in {city} is 72°F."
```

Thanks to MCP, the server automatically reads the docstrings and type hints and generates a standardized JSON schema for each tool. Crucially, the LangGraph agent never holds your passwords and never executes this code directly; it only sends requests via the protocol.
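For the `get_temperature` tool above, the generated schema looks roughly like this (field names follow the MCP tool specification; the exact output can vary by library version):

```json
{
  "name": "get_temperature",
  "description": "Get the current temperature for a given city.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string" }
    },
    "required": ["city"]
  }
}
```

This schema is all the client ever sees: enough to call the tool correctly, and nothing about how it is implemented.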

2. The Brain: LangGraph Orchestrator

On the other side, we build a StateGraph with distinct nodes to handle user requests efficiently.

  • router.py (The Fast Path)

    Uses a fast, cheap LLM to classify the user's intent: math, email, or conversation.

  • execute.py (The Heavy Lifter)

    If the router chooses a tool‑based intent, this node takes over. It uses LangChain’s create_tool_calling_agent and dynamically binds to the remote MCP server.

  • summarize.py (The Formatter)

    Takes the raw JSON output from the MCP server and uses an LLM to synthesize a polite, conversational response for the user.

  • conversation.py (The Chitchat Fallback)

    If the user just says “Hello,” we skip all heavy execution. This node feeds the conversation history directly to the LLM, saving tokens and time.


🚀 Why This Matters

Security & Isolation

The MCP server acts as a secure boundary. You can host the LangGraph agent in the cloud while running the MCP server on your local corporate intranet to access private databases – all without exposing credentials.

Reusability

Once you build an MCP server (like our Mail/Math server), you can plug it into any MCP‑compatible client. The same tools work in LangChain, Claude Desktop, Cursor, or your own custom UIs – no rewriting needed.

Run It 100% Offline with Ollama

Don’t want to pay for an OpenAI API key? Standardizing on LangChain and MCP makes swapping LLMs trivial: you can run the entire workflow locally, for free, using Ollama.

Simply update the agent nodes (e.g., router.py, execute.py) from:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")
```

to:

```python
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=0)
```

Because LangChain provides unified chat-model interfaces, the node routing and tool calling continue to work seamlessly with your local Llama 3.1 model.


💻 Try It Yourself!

All code, diagrams, and step‑by‑step instructions are available in Chapter 5 of my open‑source tutorial repository:

👉 GitHub: Agentic-AI-Tutorial

Clone it, spin up the FastMCP server, and watch the LangGraph nodes gracefully orchestrate tool execution!


📸 Demo: From User Request to Delivered Email

Here’s a real example of the agent in action, showing the entire flow from a user asking to send an email to the confirmation that it actually arrived.

1. The User Request & Agent's Execution

In the screenshot below, you can see the user submitting a request: "send email to Gamal saying hello from the agent". The LangGraph router correctly classifies this as an email intent, and the execute node invokes the remote MCP server's email tool. The agent then reports back that the email was sent successfully.

*(Screenshot: the user requesting an email and the agent confirming it was sent.)*

2. Proof of Execution

To verify that the MCP server actually performed the task, and that the agent wasn't just "hallucinating" success, we can check the real destination. The following screenshot shows the actual "Gamal" inbox with the email delivered exactly as requested. This confirms the secure, end-to-end functionality of the decoupled architecture.

*(Screenshot: the actual email appearing in the recipient's inbox.)*

This walkthrough visually confirms that the brain (LangGraph agent) correctly interprets the request and delegates to the hands (MCP server), which securely performs the action without exposing any credentials or internal logic to the agent itself.


🧠 What’s Next?

What tools are you planning to build for your MCP servers? Let me know in the comments below – I’d love to hear about your ideas and see what you create!


Follow me for more tutorials on production‑ready AI systems.
