galian

Posted on Jun 18

Model Context Protocol Explained: Build Your First MCP Server in Python

#ai #mcp #webdev #programming

If you've integrated an LLM with a database, a ticketing system, and an internal API, you've written the same glue three times — and you'll write it again for the next model and the next tool. That M×N integration problem is exactly what the Model Context Protocol (MCP) was built to kill. Instead of every application hand-rolling a bespoke connector for every tool, MCP defines one open standard that any model and any tool can speak.

The analogy its authors at Anthropic use is deliberately mundane: MCP is "a USB-C port for AI applications." You don't wire each device to each laptop with a custom cable; you agree on one connector and everything interoperates. That framing is the whole point, and it's why MCP went from an Anthropic open-source release in late 2024 to something adopted across the industry — including by OpenAI and Google — by 2026.

This is a practical guide. We'll cover what MCP actually is, the three-part architecture, the primitives you'll use every day, and then build a real, working MCP server in Python that a host like Claude Code or an IDE can call. No hand-waving — by the end you'll have code that runs.

What problem MCP actually solves

Before MCP, "give the model access to our systems" meant writing function-calling glue specific to one provider's SDK, one tool's API, and one application's plumbing. Swap the model and you rewrote the tool layer. Add a tool and you touched every app that needed it. With M applications and N tools, you were on the hook for roughly M×N integrations.

MCP turns that into M+N. Tool authors write one MCP server. Application authors add one MCP client. Any host that speaks MCP can use any server that speaks MCP — no per-pair glue. The server you write for your company's CRM works in Claude Code, in your custom agent, and in whatever host ships next year, without changes.
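To make the arithmetic concrete, here is a minimal sketch that counts integrations both ways. The function names are made up for illustration, and it assumes the simplest case in which every application needs every tool.

```python
def pairwise_integrations(apps: int, tools: int) -> int:
    """Bespoke glue: one connector for every (application, tool) pair."""
    return apps * tools


def mcp_integrations(apps: int, tools: int) -> int:
    """With a shared protocol: one client per application, one server per tool."""
    return apps + tools


if __name__ == "__main__":
    for apps, tools in [(3, 5), (10, 20)]:
        print(
            f"{apps} apps x {tools} tools: "
            f"{pairwise_integrations(apps, tools)} bespoke connectors vs "
            f"{mcp_integrations(apps, tools)} MCP components"
        )
```

The gap widens with scale: adding one more tool costs one new server, not one new connector per application.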

That's the strategic shift: tools and models become decoupled, and the integration surface grows additively instead of multiplicatively. Everything below is just the mechanics of how that's achieved.

The architecture: host, client, server

MCP has exactly three roles. Getting these straight makes everything else click.

Host — the LLM application the user interacts with. Claude Code, an AI-enabled IDE, a desktop assistant, or your own agent. The host orchestrates the model and decides which servers to connect to.
Client — a connector that lives inside the host. The host spins up one client per server, and each client keeps a dedicated 1:1 connection to its server. You rarely write this yourself; the host's framework provides it.
Server — a lightweight program that exposes capabilities (tools, data, prompt templates) over the protocol. This is what you build. A server can wrap a local SQLite file, a SaaS API, a filesystem, or anything you can reach with code.

Under the hood, client and server exchange JSON-RPC 2.0 messages over a transport. There are two you care about:

stdio — the server runs as a local subprocess and communicates over standard input/output. Perfect for local tools, dev work, and anything that touches the user's own machine.
Streamable HTTP — the server runs as a remote service reachable over HTTP, with streaming for long-running responses. This is the modern remote transport (it superseded the older HTTP+SSE approach) and it's what you deploy when the server lives somewhere central.

You write your server logic once; choosing stdio vs. HTTP is mostly a deployment decision, not a rewrite.
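For a feel of what travels over either transport, the sketch below builds a JSON-RPC 2.0 request and a matching response by hand. The helper names are hypothetical, and the `tools/call` method and result shape reflect my reading of the MCP specification, so check the current spec before relying on the details. The tool name `word_count` is the demo tool built later in this article.

```python
import json


def make_request(request_id: int, method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def make_result(request_id: int, result: dict) -> str:
    """Serialize the JSON-RPC 2.0 response that answers a given request id."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


if __name__ == "__main__":
    # What a client might send to invoke a tool.
    print(make_request(
        1, "tools/call",
        {"name": "word_count", "arguments": {"text": "hello world"}},
    ))
    # What a server might send back; the id ties the reply to the request.
    print(make_result(
        1, {"content": [{"type": "text", "text": "2"}], "isError": False}
    ))
```

In practice the SDK writes and parses these messages for you; seeing them once mostly helps when reading logs or inspector output.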

The three primitives you'll actually use

MCP servers expose capability through three primitives. The distinction between them isn't bureaucratic — it encodes who is in control, which matters enormously for safety and UX.

Tools — model-controlled

Tools are functions the model decides to call: query a database, send an email, hit an API, run a calculation. They can have side effects, so a well-behaved host asks for user approval before executing one. If you've used function calling, tools are the MCP-native, portable version of it. This is the primitive you'll reach for most.

Resources — application-controlled

Resources are read-only data the application pulls into context: a file's contents, a database row, a config blob, a documentation page. They're identified by URI (for example file:///logs/today.log or db://customers/42) and they don't do anything — they inform. The host decides when and whether to load them, which keeps the context window under deliberate control rather than at the model's whim.
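Because resources are addressed by URI, a server usually has to split an incoming URI into parts before deciding what to return. A minimal sketch using only the standard library follows; the function name and the returned keys are illustrative choices, not part of MCP.

```python
from urllib.parse import urlparse


def describe_resource_uri(uri: str) -> dict:
    """Split a resource URI into the pieces a server would route on."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,      # which kind of resource, e.g. "db" or "file"
        "authority": parts.netloc,   # the part right after "//", may be empty
        "path": parts.path,          # the rest of the identifier
    }


if __name__ == "__main__":
    for uri in ("file:///logs/today.log", "db://customers/42"):
        print(uri, "->", describe_resource_uri(uri))
```

Note that for `db://customers/42` the table name lands in the authority slot and the row ID in the path, which is worth knowing before you design a custom scheme.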

Prompts — user-controlled

Prompts are reusable templates the user invokes intentionally — think a slash-command like "summarize this PR" or "draft a release note." They standardize the high-value interactions your server enables so users don't have to re-type elaborate instructions.

The mental model: tools are for the model, resources are for the app, prompts are for the user. Designing on the correct side of that line is the difference between an integration that feels safe and predictable and one that surprises people. That separation of control is also at the heart of building reliable agents, which is why a structured course on designing autonomous AI agents spends real time on it rather than treating every capability as "just a tool."

Build your first MCP server in Python

Enough theory. Let's build a server that exposes a tool, a resource, and a prompt — and actually runs.

The official Python SDK ships a high-level helper, FastMCP, that handles the JSON-RPC plumbing, schema generation, and transport for you. You describe capabilities with decorators; the SDK infers the input schema from your type hints and the description from your docstring.

Setup

The modern toolchain uses uv, but plain pip works too:

# with uv (recommended)
uv init mcp-demo && cd mcp-demo
uv add "mcp[cli]"

# or with pip
pip install "mcp[cli]"

The server

Create server.py:

from datetime import date

from mcp.server.fastmcp import FastMCP

# Name your server; hosts show this to the user.
mcp = FastMCP("demo-tools")


@mcp.tool()
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())


@mcp.tool()
def days_between(start: str, end: str) -> int:
    """Return the number of days between two ISO dates (YYYY-MM-DD)."""
    s = date.fromisoformat(start)
    e = date.fromisoformat(end)
    return abs((e - s).days)


@mcp.resource("notes://team")
def team_notes() -> str:
    """Expose the team's shared notes as read-only context."""
    # In real life this would read a file, a DB row, or an API.
    return "Release freeze starts Friday. Owner: platform team."


@mcp.prompt()
def code_review(language: str, code: str) -> str:
    """A reusable prompt template for reviewing a code snippet."""
    return (
        f"You are a senior {language} engineer. Review the code below for "
        f"correctness, security, and readability. Be specific.\n\n{code}"
    )


if __name__ == "__main__":
    # Default transport is stdio — ideal for local hosts.
    mcp.run()

That's a complete, valid MCP server. Notice what you did not write: no JSON-RPC handling, no schema definitions, no transport code. The type hints on word_count(text: str) -> int become the tool's input/output schema automatically, and the docstring becomes the description the model reads to decide when to call it. That docstring is not decoration — it's the model's only instruction manual for the tool, so write it like an API contract.
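To demystify the schema inference, here is a rough standard-library sketch of how type hints and a docstring could be turned into a tool description. This is not the SDK's actual implementation (which handles far more types and edge cases); `sketch_input_schema` and the type mapping are assumptions made for illustration, and only the `inputSchema` field name is taken from the MCP tool definition.

```python
import inspect
from typing import get_type_hints

# Illustrative mapping from a few Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(fn) -> dict:
    """Build a tool description from a function's signature and docstring."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the input schema
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        json_type = _JSON_TYPES.get(hints.get(name))
        # Unknown or missing hints are left unconstrained in this sketch.
        properties[name] = {"type": json_type} if json_type else {}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())


if __name__ == "__main__":
    print(sketch_input_schema(word_count))
```

The takeaway is practical: a missing type hint or an empty docstring leaves the model with a vaguer contract, so both are worth writing carefully.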

Inspect it before wiring it to a model

The SDK includes a dev inspector so you can poke at your server without an LLM in the loop:

uv run mcp dev server.py

This launches the MCP Inspector, a local UI where you can list the server's tools, resources, and prompts, call them with hand-entered arguments, and see exactly what comes back. Debugging here, before a model is involved, is the single biggest time-saver in MCP development. If a tool misbehaves in the inspector, the problem is your server, not the model.

Connect it to a host

To use the server from Claude Code or another MCP-aware host, you register it. For Claude Code, that's a one-liner:

claude mcp add demo-tools -- uv run server.py

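For hosts configured by file, you add an entry pointing at the command that launches your server. The relative server.py below only resolves if the host starts the command from your project directory, so use an absolute path if it does not: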

{
  "mcpServers": {
    "demo-tools": {
      "command": "uv",
      "args": ["run", "server.py"]
    }
  }
}

Restart the host, and your tools, resource, and prompt show up — the model can now call word_count, the app can pull in notes://team, and the user can invoke the code_review prompt. The same server.py, unchanged, works in every one of them. That portability is the entire payoff, and pushing a server like this from a local toy to something production-grade — auth, logging, error handling, deployment over Streamable HTTP — is exactly the jump covered in this hands-on course on building real AI applications in Python.
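One practical wrinkle with file-based configuration is that a host may launch the command from a working directory other than your project. A hedged sketch of generating the entry with an absolute project path is below. It assumes the host accepts the `mcpServers` shape shown earlier and relies on uv's `--directory` option; `build_host_config` is a hypothetical helper, not part of any SDK.

```python
import json
from pathlib import Path


def build_host_config(server_name: str, project_dir: Path,
                      script: str = "server.py") -> dict:
    """Build an mcpServers entry that does not depend on the host's cwd."""
    return {
        "mcpServers": {
            server_name: {
                "command": "uv",
                # --directory tells uv to run as if started inside project_dir.
                "args": ["--directory", str(project_dir.resolve()), "run", script],
            }
        }
    }


if __name__ == "__main__":
    # Run this from the project folder and paste the output into the host's config.
    print(json.dumps(build_host_config("demo-tools", Path.cwd()), indent=2))
```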

From toy to production: what the quickstart doesn't tell you

The server above works, but shipping MCP to real users surfaces concerns the happy path hides. These are the ones that bite teams:

Authentication and authorization. A remote MCP server is a service on the internet. Streamable HTTP servers support OAuth-based auth, and you need it — an unauthenticated tool that can query your database or send email is an incident waiting to happen. Treat the server's tool surface as your real attack surface.

The model can be tricked into calling tools. Because tools are model-controlled, a prompt-injection payload hidden in a document or web page the model reads can try to coax it into calling a destructive tool. The mitigations are concrete: keep destructive tools behind user approval, scope each server's permissions narrowly, validate every argument server-side, and never assume the model's call is benign just because it's well-formed. This intersection of capability and risk is precisely why agentic systems need a security mindset, not just a features mindset — the subject of a dedicated course on AI security and ethical engineering.
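As one illustration of "validate every argument server-side", the sketch below checks the arguments of a hypothetical record-lookup tool against an allowlist and a strict ID pattern before anything touches a database. The table names, the ID format, and the function itself are assumptions for the example.

```python
import re

# Hypothetical allowlist: the only tables this tool is permitted to read.
ALLOWED_TABLES = {"customers", "orders"}

# IDs must be 1 to 12 digits; anything else is rejected outright.
_RECORD_ID = re.compile(r"[0-9]{1,12}")


def validate_lookup_args(table: str, record_id: str) -> tuple[str, int]:
    """Return cleaned arguments, or raise ValueError with a readable reason."""
    if table not in ALLOWED_TABLES:
        raise ValueError(
            f"Table '{table}' is not available; choose one of {sorted(ALLOWED_TABLES)}."
        )
    if not _RECORD_ID.fullmatch(record_id):
        raise ValueError("record_id must contain digits only, for example '42'.")
    return table, int(record_id)


if __name__ == "__main__":
    print(validate_lookup_args("customers", "42"))
    try:
        validate_lookup_args("customers", "42; DROP TABLE customers")
    except ValueError as exc:
        print("rejected:", exc)
```

Allowlisting is the deliberate choice here: a well-formed call that names an unexpected table is refused by default, no matter how the model was persuaded to make it.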

Tool descriptions are part of your context budget. Every tool's name, description, and schema get loaded into the model's context. Twenty sprawling tools with verbose docstrings quietly eat thousands of tokens and degrade the model's ability to choose well. Curate your tool surface like an API you have to maintain: fewer, sharper tools beat a kitchen sink. Managing what occupies the context window — tools included — is its own discipline, which a course on context engineering for AI agents treats as a first-class skill rather than an afterthought.
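A quick way to keep an eye on this budget is to estimate how many tokens your tool definitions occupy. The sketch below uses a crude four-characters-per-token heuristic, which is an assumption and not how any particular tokenizer works, so treat the result as an order-of-magnitude check. The tool dictionaries are made-up examples.

```python
import json


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def tool_surface_cost(tools: list[dict]) -> int:
    """Estimate the context cost of a list of tool definitions."""
    return sum(estimate_tokens(json.dumps(tool)) for tool in tools)


if __name__ == "__main__":
    terse = [{"name": "word_count", "description": "Count words in text."}]
    verbose = [{
        "name": "word_count",
        "description": "This tool counts words. " * 40,  # padded on purpose
    }]
    print("terse:", tool_surface_cost(terse))
    print("verbose:", tool_surface_cost(verbose))
```

For real numbers, swap the heuristic for the tokenizer of the model you actually target.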

Errors must be legible to a model, not just a human. When a tool fails, return a structured, descriptive error the model can reason about and recover from — not a raw stack trace. "Customer 42 not found; verify the ID" lets the model self-correct; a 500 with a Python traceback does not.
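Here is a minimal sketch of the idea, using a hypothetical in-memory customer table. The `ok` and `error` keys are my own convention for the example; MCP tool results also carry an error flag of their own, so check how your SDK surfaces failures before settling on a shape.

```python
# Hypothetical data store standing in for a real database.
CUSTOMERS = {7: "Ada Lovelace"}


def get_customer(customer_id: int) -> dict:
    """Look up a customer, returning an error the model can act on."""
    try:
        name = CUSTOMERS[customer_id]
    except KeyError:
        # Say what failed and what to try next; never leak a traceback.
        return {
            "ok": False,
            "error": f"Customer {customer_id} not found; verify the ID.",
        }
    return {"ok": True, "customer_id": customer_id, "name": name}


if __name__ == "__main__":
    print(get_customer(7))
    print(get_customer(42))
```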

Stateful vs. stateless. stdio servers are naturally per-session and local; HTTP servers may serve many clients and need you to think about concurrency and isolation. Decide early, because retrofitting state handling is painful.
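If an HTTP server does need state, one simple approach is to key everything by session and guard it with a lock so concurrent requests cannot see or corrupt each other's data. The `SessionStore` class below is a hypothetical sketch; how you obtain a session identifier depends on your SDK and transport, and a real deployment with several processes would need an external store instead.

```python
import threading


class SessionStore:
    """In-memory, per-session key/value state safe for concurrent threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict[str, dict] = {}

    def set(self, session_id: str, key: str, value) -> None:
        with self._lock:
            self._data.setdefault(session_id, {})[key] = value

    def get(self, session_id: str, key: str, default=None):
        with self._lock:
            # Reads never create entries, so unknown sessions stay absent.
            return self._data.get(session_id, {}).get(key, default)

    def drop(self, session_id: str) -> None:
        """Forget a session's state, for example when the client disconnects."""
        with self._lock:
            self._data.pop(session_id, None)


if __name__ == "__main__":
    store = SessionStore()
    store.set("session-a", "last_query", "open tickets")
    print(store.get("session-a", "last_query"))
    print(store.get("session-b", "last_query"))  # isolated from session-a
```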

None of these are reasons to avoid MCP — they're the normal engineering of turning a protocol demo into a dependable system, and the same skills you'd apply to any service boundary apply here.

Why this matters for how you build in 2026

MCP's quiet significance is that it makes tools a portable asset instead of a per-app liability. Write a great server for your internal systems once, and it appreciates: every new host, every new model, every teammate's agent can use it without you lifting a finger. That's the opposite of the function-calling glue we used to throw away every time a model changed.

It also pushes good architecture by default. The host/client/server split forces a clean seam between "the model and the app" and "the capability," which is exactly the boundary you want when models get swapped, upgraded, or — as 2026 has reminded everyone — occasionally yanked. Building agents on top of well-designed MCP servers, with the right model routed to the right step, is where a lot of the real engineering leverage lives now; if you want that end-to-end picture, there's a focused course on the Model Context Protocol and building enterprise integrations that goes far deeper than a single tutorial can.

Conclusion

The Model Context Protocol is not hype; it's plumbing, and good plumbing is what lets a field scale. It replaces M×N bespoke integrations with M+N reusable ones, gives you three clear primitives with sane control boundaries, and lets you ship a server in a few dozen lines of Python that works across every MCP-aware host.

Start small: build the demo-tools server above, poke it with the inspector, wire it into a host you already use. Then point it at something real in your own stack — a read-only resource over your logs, a single well-scoped tool over an internal API. The first time you watch a model use a capability you exposed once and never re-integrated, the M+N promise stops being abstract.

Write the server once. Let every model use it.

Sources & further reading:

Anthropic — Introducing the Model Context Protocol
Model Context Protocol — Official specification and documentation
Model Context Protocol — Python SDK

This article is educational content. APIs and SDK details evolve; check the official MCP documentation for the current specification before building production systems.
