DEV Community

Tech_Nuggets
Tech_Nuggets

Posted on

The Model Context Protocol (MCP): what it is and how to build a server

The Model Context Protocol (MCP): what it is and how to build a server

Your team's LLM-powered application talks to a search index through one custom integration, a code repository through another, a Postgres database through a chain of LangChain tools, and a file system through raw Python I/O calls. Every new data source means writing a new integration. Every integration uses a different authentication model and returns data in a different shape. The LLM application is tightly coupled to every backend it touches, and swapping one out requires changing the application code directly.

The Model Context Protocol (MCP) exists to replace this bespoke plumbing with a single, standardized interface. Think of it as a USB-C port for LLM applications: one connector shape, one protocol, and any compatible server can plug into any compatible client without custom wiring.

Why a standard protocol matters

LLM-powered tools have exploded in capability over the past two years, but the integration story has not kept up. Each AI application (IDE assistant, chat client, agent framework) historically built its own connectors for databases, APIs, document stores, and code repositories. There was no shared contract. If you wanted to use a specific code search tool with two different AI assistants, you needed two separate integrations.

MCP borrows its design philosophy from the Language Server Protocol (LSP), which standardized how code editors talk to language analyzers. Before LSP, each editor had its own plugin for each language. After LSP, one language server worked with every editor. MCP aims to do the same for AI tools and the data sources they need.

The protocol is an open standard, originally created at Anthropic and published under the MIT license. The specification reached stable at version 2025-11-25, and the Python SDK (mcp on PyPI) is at 1.27.2 as of May 2026. A 2.0.0 alpha was published in June 2026 with an updated transport layer.

How MCP works

MCP uses JSON-RPC 2.0 as its message format. A client (the AI application) connects to a server (a service that provides context) over one of three transport types:

  • stdio: the client spawns the server as a child process and communicates over stdin/stdout. Best for local, single-user setups.
  • SSE (Server-Sent Events): the server runs as an HTTP endpoint, the client connects over HTTP. Works across machines.
  • Streamable HTTP: a newer transport that allows bidirectional streaming over HTTP. Added in the 2025-11-25 spec.

Here is the conceptual architecture:

flowchart LR
    subgraph Client["Client (AI App)"]
        A["Host<br/>IDE / Chat / Agent"]
        B["MCP Client<br/>Protocol handler"]
    end
    subgraph Server["MCP Server"]
        C["MCP Server<br/>Protocol handler"]
        D["Resources<br/>context data"]
        E["Tools<br/>executable functions"]
        F["Prompts<br/>templated workflows"]
    end
    A <--> B
    B <-->|JSON-RPC 2.0<br/>stdio / SSE / HTTP| C
    C --> D
    C --> E
    C --> F
Enter fullscreen mode Exit fullscreen mode

Every MCP session begins with a capability negotiation handshake. The client announces what features it supports (sampling, roots, elicitation). The server announces what features it offers (resources, tools, prompts). Both sides agree on a feature set before any data exchange happens.

Server primitives

Servers offer three main categories of functionality:

Resources expose data to the LLM. Think of them as GET endpoints in a REST API. A resource has a URI and returns content in a structured format. Example: file:///logs/2026-06-01.txt returns the content of that log file. Resources are how the LLM loads context.

Tools are functions the LLM can invoke. Think of them as POST endpoints. A tool has a name, a description, and an input schema (JSON Schema). The LLM can call a tool to execute code, query a database, or trigger an external action. Unlike resources, tools are invoked on demand.

Prompts are reusable templates for LLM interactions. A prompt defines a message template with parameter slots. The client can populate the template and present the result to the user as a pre-built interaction.

Client primitives

Clients can also offer features to servers:

  • Sampling: the server can request the client to generate an LLM response, enabling agentic loops where one model delegates to another.
  • Roots: the server can request information about filesystem or URI boundaries, so it knows where it is allowed to operate.
  • Elicitation: the server can request additional information from the user through the client's UI.

Building an MCP server in Python

The mcp package (v1.27.2) provides a high-level API called FastMCP that makes building a server straightforward. Here is a complete server that exposes a weather tool and a greeting resource:

from mcp.server.fastmcp import FastMCP

# Create an MCP server
mcp = FastMCP("Weather Demo")

# Add a tool: get weather for a city
@mcp.tool()
def get_weather(city: str, units: str = "celsius") -> str:
    """Get the current weather for a city."""
    # In production, call a real weather API here
    return f"Weather in {city}: 22 degrees {units}, partly cloudy"

# Add a resource: city data by URI
@mcp.resource("city://{name}")
def city_info(name: str) -> str:
    """Get information about a city."""
    cities = {
        "dubai": "Dubai, UAE. Population: 3.6M. Timezone: UTC+4.",
        "london": "London, UK. Population: 8.9M. Timezone: UTC+0.",
        "tokyo": "Tokyo, Japan. Population: 14M. Timezone: UTC+9.",
    }
    return cities.get(name.lower(), f"City '{name}' not found.")

# Add a prompt template
@mcp.prompt()
def travel_planning(city: str) -> str:
    """Generate a travel planning prompt for a destination."""
    return (
        f"You are a travel assistant helping someone plan a trip to {city}. "
        f"Provide practical advice on weather, transportation, and attractions."
    )

# Run with stdio transport (default)
if __name__ == "__main__":
    mcp.run()
Enter fullscreen mode Exit fullscreen mode

Install it and run:

pip install "mcp[cli]"
python weather_server.py
Enter fullscreen mode Exit fullscreen mode

The server starts on stdio by default. For HTTP transport, change the last line:

mcp.run(transport="streamable-http")
Enter fullscreen mode Exit fullscreen mode

Testing with the MCP Inspector

The official MCP Inspector is a browser-based tool for testing servers:

npx -y @modelcontextprotocol/inspector
Enter fullscreen mode Exit fullscreen mode

Point it at your server endpoint (or stdio command) and you can browse resources, invoke tools, and inspect messages without writing a client.

MCP vs the alternatives

Feature MCP Custom API / REST LangChain Tools OpenAI function calling
Standardized protocol Yes No No (framework-specific) No (API-specific)
Primitive types Resources, Tools, Prompts Endpoints only Tools only Functions only
Transport options stdio, SSE, Streamable HTTP HTTP only In-process only HTTP only
Bidirectional Yes (sampling, roots) Request-response only Request-response only Request-response only
Auth model OAuth 2.1 (spec), pluggable Custom per API Custom per integration API key
Client independence Any MCP client One client per API LangChain only OpenAI only

The main differentiator is client independence. A server written for MCP works with any MCP-compatible client: Claude Code, Claude Desktop, the Continue.dev VS Code extension, or a custom agent framework. Custom APIs and framework-specific tools lock you into one ecosystem.

Common pitfalls

Thinking tools are free. Tools execute arbitrary code on your server. Every tool invocation consumes compute and may have side effects. The LLM cannot distinguish between a cheap operation (reading a config file) and an expensive one (running a 100-row batch query). Set usage limits or implement a permission layer for destructive operations.

Resource URIs must be meaningful. A resource URI is not just a label -- it is the identifier the LLM uses to request data. Using opaque URIs (resource://abc123) makes it impossible for the LLM to discover resources. Use hierarchical, descriptive URIs that hint at the content structure, like docs://project/api/reference or db://customers/orders?status=pending.

Forgetting the capability handshake. If you add a new tool to an existing server and your client does not re-negotiate capabilities, the client will not know the tool exists. The capability exchange happens at connection time. Restart both sides after changing what a server offers.

Overloading a server. An MCP server that exposes 50 tools and 200 resources becomes as hard to navigate as a REST API with 50 endpoints. Group related functionality into separate servers and let the client connect to multiple servers. Claude Desktop and other hosts already support multi-server setups.

Assuming tools are always available to the LLM. Tool invocation requires user consent in most host applications. The user must approve each tool call. Design your tools to be meaningful in a single invocation, because multi-step approval flows create a poor user experience.

When NOT to use it

MCP is the wrong choice if:

  • You are building a single-purpose script. If your Python script calls one API and prints the result, MCP adds unnecessary complexity. Just use requests directly.
  • You need sub-millisecond latency. The JSON-RPC serialization and transport overhead adds a few milliseconds per call. For latency-critical, high-frequency operations (real-time streaming inference, hardware control), use a direct connection.
  • Your data source has no LLM interaction. MCP is designed to serve context to LLMs. If you are building a regular web application backend with no AI component, use a standard REST or gRPC API.
  • Your users are all on one framework. If every consumer of your service uses LangChain and will only ever use LangChain, writing a LangChain tool directly is simpler than writing an MCP server plus a LangChain MCP adapter. MCP pays off when you have multiple client types.

TL;DR

  • MCP standardizes how LLM applications connect to data sources. One server works with any MCP-compatible client.
  • The protocol uses JSON-RPC 2.0 over stdio, SSE, or Streamable HTTP transport. Features are negotiated at connection time.
  • Servers expose Resources (data), Tools (executable functions), and Prompts (templates). Clients can offer Sampling, Roots, and Elicitation.
  • The Python SDK mcp (v1.27.2) provides FastMCP, a decorator-based API for building servers in a few lines of code.
  • MCP pays off when you have multiple client types consuming the same data sources. For single-purpose scripts or single-framework setups, a direct integration is simpler.
  • Use the MCP Inspector (npx @modelcontextprotocol/inspector) to test servers without writing a client.

Next post: building a multi-server MCP setup that connects a code search service, a documentation index, and a database gateway into a single AI assistant, with practical trade-offs on transport selection and auth.

Top comments (0)