Jangwook Kim

Posted on May 7 • Originally published at effloow.com

LangGraph + MCP: Build a Supervisor Multi-Agent System

#langgraph #mcp #multiagent #python

Why This Pattern Matters

Most LangGraph tutorials stop at single agents. A single agent that does research, writes code, and formats a report is juggling three jobs — and as the task list grows, the prompt grows with it. The supervisor pattern solves this: one orchestrator LLM does nothing but decide which specialist gets the next task, while each specialist operates with a focused, minimal prompt.

Add MCP (Model Context Protocol) into that picture and you get a second level of separation. Instead of hardcoding tools into each agent, you serve them from a live HTTP/SSE endpoint. Update, version, or swap a tool server without touching your agent code. That's the architecture this guide builds.

Effloow Lab verified the full package chain on Python 3.12 macOS: langchain-mcp-adapters==0.2.2, langgraph-supervisor==0.0.31, mcp==1.27.0, and langgraph==1.1.10 install cleanly together. API surface checks confirmed all import paths and constructor signatures documented here. See data/lab-runs/langgraph-mcp-poc.md for full PoC notes.

Core Concepts

The Three Layers

LangGraph provides the state graph runtime. Every agent is a StateGraph compiled with a checkpointer — state persists across steps via a MemorySaver (in-memory) or a production store like PostgreSQL.

langchain-mcp-adapters is the bridge package. It converts MCP tool schemas into langchain_core-compatible BaseTool objects that any LangGraph node can call. MultiServerMCPClient manages connections to one or many MCP servers over stdio, sse, streamable_http, or websocket transports.

langgraph-supervisor provides create_supervisor() — a factory that wraps specialist agents into a routing graph. The supervisor LLM receives a task, picks which agent to call using a handoff tool, and that agent runs until it returns control. The cycle repeats until the supervisor decides the job is done.

Why MCP Over Hardcoded Tools?

With hardcoded tools, every tool change means redeploying your agent. With an MCP server, you deploy a tool update once and every connected agent picks it up immediately. The langchain-mcp-adapters package converts MCP's JSON-RPC tool schema into LangChain's BaseTool format transparently — your agent code never knows the difference.

FreeCodeCamp's 2026 full book on LangGraph+MCP+A2A pinned mcp==1.26.0 and langgraph==1.1.0, signaling these version families are stable for production builds. The current latest (mcp==1.27.0, langgraph==1.1.10) are API-compatible.

Prerequisites

Python 3.11+
pip install langchain-mcp-adapters==0.2.2 langgraph-supervisor==0.0.31
pip install langchain-openai  # or langchain-anthropic for your LLM

The mcp SDK (1.27.0) installs automatically as a dependency of langchain-mcp-adapters. The langgraph package (1.1.10) may already be present if you have langchain installed.

You'll also need an API key from OpenAI or Anthropic for the LLM calls.

Step 1: Build an MCP Tool Server with FastMCP

The MCP SDK ships FastMCP, a decorator-based server builder. Create tool_server.py:

from mcp.server.fastmcp import FastMCP

mcp_app = FastMCP(
    "research-tools",
    host="127.0.0.1",
    port=8001,
    streamable_http_path="/mcp",
    sse_path="/sse",
)

@mcp_app.tool()
def web_search(query: str) -> str:
    """Search the web and return a summary of results."""
    # In production: call a real search API (Serper, Tavily, Brave, etc.)
    return f"Search results for '{query}': [wire up your search API here]"

@mcp_app.tool()
def read_file(path: str) -> str:
    """Read the contents of a local file by absolute path."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return f"File not found: {path}"

if __name__ == "__main__":
    mcp_app.run(transport="streamable-http")

Start it with:

python tool_server.py
# Serving at http://127.0.0.1:8001/mcp

FastMCP automatically generates the MCP JSON-RPC schema from your function signatures and docstrings. No boilerplate required.

Step 2: Create Specialist Agents With MCP Tools

Specialist agents connect to the tool server via MultiServerMCPClient. Because MCP connections are async context managers, use async with:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4.1-mini")

async def build_research_agent():
    """Return a compiled ReAct agent backed by MCP research tools."""
    async with MultiServerMCPClient(
        {
            "research": {
                "transport": "streamable_http",
                "url": "http://127.0.0.1:8001/mcp",
            }
        }
    ) as client:
        tools = client.get_tools()
        agent = create_react_agent(
            model=model,
            tools=tools,
            name="research_agent",
            prompt=(
                "You are a research specialist. Use web_search and read_file "
                "to gather information. Return a concise factual summary."
            ),
        )
    return agent

The name parameter matters: create_supervisor() uses it to build the handoff tools that route tasks between agents.

For a second specialist — a code writer that generates (but does not run) Python — pass LangChain tools directly:

from langchain_core.tools import tool

@tool
def write_python_module(name: str, description: str, spec: str) -> str:
    """Generate a Python module stub given a name, description, and specification."""
    return (
        f'"""\n{description}\n"""\n\n'
        f"# Specification:\n# {spec}\n\n"
        f"def {name}():\n    raise NotImplementedError"
    )

code_agent = create_react_agent(
    model=model,
    tools=[write_python_module],
    name="code_agent",
    prompt=(
        "You are a coding specialist. Write Python module stubs and code "
        "structure based on requirements from the research agent."
    ),
)

Step 3: Wire Up the Supervisor

create_supervisor() returns an uncompiled StateGraph. Call .compile() with a checkpointer to enable state persistence across turns:

from langgraph_supervisor import create_supervisor
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI

supervisor_model = ChatOpenAI(model="gpt-4.1")
checkpointer = MemorySaver()

workflow = create_supervisor(
    agents=[research_agent, code_agent],
    model=supervisor_model,
    prompt=(
        "You are a task coordinator. For information retrieval tasks, delegate to "
        "research_agent. For code generation tasks, delegate to code_agent. "
        "Once both have reported back, synthesize the result."
    ),
    output_mode="last_message",
    supervisor_name="coordinator",
)

app = workflow.compile(checkpointer=checkpointer)

output_mode="last_message" returns only the final assistant message rather than the full turn history — cleaner for API consumers. Switch to "full_history" if you need the complete trace.

Step 4: Run a Multi-Step Task

LangGraph uses thread_id in the config dict to key checkpoint state, enabling multi-turn conversations where each agent remembers its prior context:

async def run_task(task: str, thread_id: str = "session-1"):
    config = {"configurable": {"thread_id": thread_id}}
    result = await app.ainvoke(
        {"messages": [{"role": "user", "content": task}]},
        config=config,
    )
    return result["messages"][-1].content

async def main():
    task = (
        "1. Search for Python asyncio best practices from 2026. "
        "2. Generate a Python module stub that demonstrates async patterns. "
        "3. Summarize what you found and what the generated code does."
    )
    answer = await run_task(task, thread_id="demo-001")
    print(answer)

asyncio.run(main())

The supervisor breaks this into subtasks automatically: it routes step 1 to research_agent, step 2 to code_agent, then synthesizes the result itself for step 3. Because state is checkpointed, you can pause, resume, or inspect intermediate state at any point.

Step 5: Connect to Multiple MCP Servers

MultiServerMCPClient accepts a dictionary of named server connections. Each key becomes a namespace prefix for the tools it provides:

async with MultiServerMCPClient(
    {
        "research": {
            "transport": "streamable_http",
            "url": "http://127.0.0.1:8001/mcp",
        },
        "database": {
            "transport": "sse",
            "url": "http://127.0.0.1:8002/sse",
        },
        "filesystem": {
            "transport": "stdio",
            "command": "python",
            "args": ["filesystem_server.py"],
            "env": {"FS_ROOT": "/tmp/sandbox"},
        },
    }
) as client:
    all_tools = client.get_tools()
    # all_tools contains BaseTool objects from all three servers

stdio transport is ideal for local servers that shouldn't expose a network port. streamable_http is preferred for production deployments — it supports stateless servers and works in serverless environments where SSE's long-lived connection model is problematic.

Checkpointing in Production

MemorySaver stores state in process memory — useful for development but lost on restart. For production, replace it with a persistent backend:

# PostgreSQL (pip install langgraph-checkpoint-postgres)
from langgraph.checkpoint.postgres import PostgresSaver

with PostgresSaver.from_conn_string("postgresql://user:pass@localhost/db") as cp:
    app = workflow.compile(checkpointer=cp)

# Redis (pip install langgraph-checkpoint-redis)
from langgraph.checkpoint.redis import RedisSaver

cp = RedisSaver.from_conn_string("redis://localhost:6379")
app = workflow.compile(checkpointer=cp)

With either backend, conversation state survives process restarts. The thread_id in your config dict is the only key you need to resume exactly where a session left off.

Common Mistakes

Forgetting async with for MultiServerMCPClient. The client opens connections on entry and closes them on exit. Skipping the context manager leaves connections open and causes tool calls to fail silently.

Giving agents the same name. create_supervisor() uses agent.name to generate handoff tools. Duplicate names cause a silent collision — one agent never gets routed to.

Using stdio transport in a server process. stdio forks a subprocess and communicates over stdin/stdout, which conflicts with the parent process's I/O in production deployments. Use sse or streamable_http for any server-side agent.

Not compiling with a checkpointer. create_supervisor() returns an uncompiled StateGraph. If you call ainvoke on it directly, LangGraph raises a runtime error. Always call .compile(checkpointer=...) before invoking.

Giving the supervisor too much work. The supervisor LLM should decide routing and synthesize final answers — nothing else. If you give it tools that produce side effects, it may try to run them before routing, breaking the clean separation the pattern depends on.

FAQ

Q: Does `langchain-mcp-adapters` support authentication for remote MCP servers?

Yes. SSEConnection and StreamableHttpConnection accept a headers dict. Pass an Authorization: Bearer <token> header the same way you would for any HTTP client. The stdio transport uses environment variables for credentials instead.

Q: How is this different from the older LangGraph tutorial on Effloow?

The earlier LangGraph article predates the langchain-mcp-adapters package. It covers single-agent ReAct loops with hardcoded tools. This guide adds the supervisor layer for multi-agent routing and replaces hardcoded tools with live MCP server connections — both significant architectural shifts that change how you deploy and maintain agents in production.

Q: Can I use Anthropic Claude instead of OpenAI for the agents?

Yes. Replace ChatOpenAI with ChatAnthropic from langchain-anthropic. The create_react_agent and create_supervisor APIs are model-agnostic — any LangChain chat model works.

from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(model="claude-sonnet-4-6")

Q: What happens when a specialist agent fails mid-task?

With a persistent checkpointer, the graph state at the last successful checkpoint is preserved. You can re-invoke the same thread_id to resume from the failure point rather than restarting the full task. This is the primary reason to prefer PostgreSQL or Redis over MemorySaver in production.

Q: Is there a JavaScript/TypeScript equivalent?

Yes. @langchain/mcp-adapters on npm mirrors the Python API, including MultiServerMCPClient. The @langchain/langgraph-supervisor npm package provides createSupervisor(). The pattern and transport options are the same across both ecosystems.

Key Takeaways

Bottom Line

The LangGraph + MCP supervisor stack is production-ready in 2026. The package chain — langchain-mcp-adapters==0.2.2, langgraph-supervisor==0.0.31, mcp==1.27.0 — installs cleanly, the API is stable, and the separation of routing (supervisor), execution (specialist agents), and tool serving (MCP servers) keeps each layer independently deployable. Start with MemorySaver and streamable_http transport, then graduate to a persistent checkpointer when you need multi-session state.

The pattern scales naturally: add a third specialist by passing another agent to create_supervisor(). Add a new capability by deploying a new MCP tool server and connecting it to the relevant agent's MultiServerMCPClient. Neither change touches the supervisor or any other specialist agent.

For teams already using LangGraph, the upgrade path is additive — langchain-mcp-adapters wraps your existing tool functions without breaking anything. For teams starting fresh, the FreeCodeCamp full book on LangGraph+MCP+A2A is the most comprehensive public resource, covering this pattern extended all the way to cross-framework A2A delegation.

DEV Community

LangGraph + MCP: Build a Supervisor Multi-Agent System

Why This Pattern Matters

Core Concepts

The Three Layers

Why MCP Over Hardcoded Tools?

Prerequisites

Step 1: Build an MCP Tool Server with FastMCP

Step 2: Create Specialist Agents With MCP Tools

Step 3: Wire Up the Supervisor

Step 4: Run a Multi-Step Task

Step 5: Connect to Multiple MCP Servers

Checkpointing in Production

Common Mistakes

FAQ

Q: Does `langchain-mcp-adapters` support authentication for remote MCP servers?

Q: How is this different from the older LangGraph tutorial on Effloow?

Q: Can I use Anthropic Claude instead of OpenAI for the agents?

Q: What happens when a specialist agent fails mid-task?

Q: Is there a JavaScript/TypeScript equivalent?

Key Takeaways

Top comments (0)

Why This Pattern Matters

Core Concepts

The Three Layers

Why MCP Over Hardcoded Tools?

Prerequisites

Step 1: Build an MCP Tool Server with FastMCP

Step 2: Create Specialist Agents With MCP Tools

Step 3: Wire Up the Supervisor

Step 4: Run a Multi-Step Task

Step 5: Connect to Multiple MCP Servers

Checkpointing in Production

Common Mistakes

FAQ

Q: Does langchain-mcp-adapters support authentication for remote MCP servers?

Q: How is this different from the older LangGraph tutorial on Effloow?

Q: Can I use Anthropic Claude instead of OpenAI for the agents?

Q: What happens when a specialist agent fails mid-task?

Q: Is there a JavaScript/TypeScript equivalent?

Key Takeaways

Q: Does `langchain-mcp-adapters` support authentication for remote MCP servers?