Himanshu Agarwal

Posted on Jul 1

MCP Server — Scratch to Production

#python #ai #mcp #tutorial

Introduction

This is a build guide, not a lecture. By the end you will have a Model Context Protocol (MCP) server that starts as a 20-line script and ends as a Dockerized, authenticated, tested, monitored service wired into Claude Desktop, VS Code, and Cursor — plus a CI/CD pipeline that ships it.

Every command here runs. Every code block is executable or explicitly marked partial. If you are an SDET, automation engineer, backend developer, or AI engineer who wants MCP fluency without wading through theory, this is written for you.

We build in Python using the official MCP SDK (mcp) and its high-level FastMCP API. The same concepts map cleanly to the TypeScript SDK, and I flag the differences where they matter.

What You Will Build

A production MCP server called toolhub that exposes:

Tools — an order-lookup tool and a SQL-safe query tool that hit a real backend.
Resources — read-only data endpoints (config, catalog) addressed by URI.
Prompts — reusable templated prompts your host app can pull on demand.

Layered on top: structured logging, token authentication, environment-based config, Docker packaging, unit and integration tests, a GitHub Actions pipeline, health checks, and a Prometheus metrics endpoint.

Final Architecture

                     ┌──────────────────────────────────────┐
                     │            MCP HOSTS                  │
                     │  Claude Desktop │ VS Code │ Cursor    │
                     └───────┬───────────┬───────────┬──────┘
                             │ stdio     │ HTTP      │ HTTP
                             │           │           │
                     ┌───────▼───────────▼───────────▼──────┐
                     │        Transport Layer               │
                     │   stdio  |  Streamable HTTP (+OAuth)  │
                     └──────────────────┬───────────────────┘
                                        │ JSON-RPC 2.0
                     ┌──────────────────▼───────────────────┐
                     │        toolhub  MCP SERVER            │
                     │                                       │
                     │  ┌─────────┐ ┌──────────┐ ┌────────┐  │
                     │  │  Tools  │ │Resources │ │Prompts │  │
                     │  └────┬────┘ └────┬─────┘ └───┬────┘  │
                     │       │           │           │       │
                     │  ┌────▼───────────▼───────────▼────┐  │
                     │  │  Auth · Logging · Error Handler  │  │
                     │  └────────────────┬─────────────────┘ │
                     └───────────────────┼───────────────────┘
                                         │
                     ┌───────────────────▼───────────────────┐
                     │  Backends: Postgres · REST API · Cache │
                     └────────────────────────────────────────┘

1. What Is an MCP Server (the 2-minute version)

MCP is an open protocol that standardizes how AI applications connect to external tools and data. Think of it as a USB-C port for LLMs: one connector, many devices. It is built on JSON-RPC 2.0 and was open-sourced by Anthropic in late 2024. By 2026 it is supported by every major vendor — Anthropic, OpenAI, Google, Microsoft, AWS.

There are three roles:

Role	What it is	Example
Host	The AI app the user talks to	Claude Desktop, Cursor, VS Code
Client	A session inside the host, one per server	Managed automatically
Server	Your code exposing capabilities	`toolhub` (what we build)

A server exposes exactly three primitives:

Tools — functions the model can call (side effects allowed). Model-controlled.
Resources — read-only data the host can load into context, addressed by URI. App-controlled.
Prompts — reusable prompt templates the user can invoke. User-controlled.

That distinction matters. Tools are for actions, resources are for context, prompts are for shortcuts. Mixing them up is the most common design mistake. Move on.

2. Architecture

MCP is client-server over JSON-RPC 2.0. The host spawns or connects to your server, negotiates capabilities during an initialize handshake, then exchanges typed messages.

Transports decide how bytes move:

Transport	Use case	Notes
stdio	Local server, same machine	Host spawns your process, talks over stdin/stdout. Fast, simple, no network.
Streamable HTTP	Remote / networked server	HTTP with chunked streaming. Preferred for production remote deployments. Pairs with OAuth 2.1.
SSE	Legacy remote	Deprecated by the spec; still supported but being phased out. Do not build new servers on it.

Rule of thumb: stdio for local desktop integrations, Streamable HTTP for anything remote or shared. We build both, because a real server should support each depending on where it runs.

3. Prerequisites

You need:

Python 3.10+ (3.12 recommended)
uv — the fast Python package/project manager (pip works too, uv is smoother)
Docker — for packaging
Node.js 18+ — only for the MCP Inspector and TypeScript examples
An MCP host to test against: Claude Desktop, VS Code, or Cursor

Verify what you have:

python3 --version    # 3.10 or higher
docker --version
node --version

4. Installing Everything

Install uv:

# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Install the MCP Inspector (a browser tool for poking your server — invaluable):

npx @modelcontextprotocol/inspector --help

We install the SDK itself inside the project in the next step, so it lands in an isolated environment.

5. Project Setup

Create and initialize the project with uv:

uv init toolhub
cd toolhub

Add dependencies. Pin the SDK below v2 — v1.x is the stable line recommended for production, and pip/uv won't auto-select the v2 pre-release, but pinning protects you when v2 goes stable:

uv add "mcp[cli]>=1.27,<2"
uv add pydantic python-dotenv structlog httpx
uv add --dev pytest pytest-asyncio ruff mypy

Quick sanity check:

uv run python -c "import mcp; print('mcp ok')"

Common mistake: installing mcp globally with plain pip. Different hosts spawn your server with different interpreters; a global install means "works on my machine" and nowhere else. Always keep it project-local.

6. Folder Structure

Here is the layout we grow into. Set it up now so nothing feels bolted-on later:

toolhub/
├── pyproject.toml
├── uv.lock
├── .env.example
├── .gitignore
├── Dockerfile
├── .dockerignore
├── README.md
├── src/
│   └── toolhub/
│       ├── __init__.py
│       ├── server.py          # FastMCP instance + wiring
│       ├── config.py          # env-driven settings
│       ├── logging_conf.py    # structured logging
│       ├── auth.py            # token / OAuth verification
│       ├── errors.py          # custom exceptions
│       ├── backends.py        # DB / REST clients
│       ├── tools/
│       │   ├── __init__.py
│       │   └── orders.py
│       ├── resources/
│       │   ├── __init__.py
│       │   └── catalog.py
│       └── prompts/
│           ├── __init__.py
│           └── templates.py
├── tests/
│   ├── test_tools.py
│   ├── test_resources.py
│   └── test_integration.py
└── .github/
    └── workflows/
        └── ci.yml

Create the skeleton:

mkdir -p src/toolhub/{tools,resources,prompts} tests .github/workflows
touch src/toolhub/{__init__.py,server.py,config.py,logging_conf.py,auth.py,errors.py,backends.py}
touch src/toolhub/tools/{__init__.py,orders.py}
touch src/toolhub/resources/{__init__.py,catalog.py}
touch src/toolhub/prompts/{__init__.py,templates.py}

7. Writing the First MCP Server

Start minimal. Put this in src/toolhub/server.py:

# src/toolhub/server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("toolhub")

@mcp.tool()
def ping() -> str:
    """Health check. Returns 'pong'."""
    return "pong"

if __name__ == "__main__":
    mcp.run()  # defaults to stdio

Run it:

uv run python -m toolhub.server

It sits waiting on stdin — that's correct for stdio. Kill it with Ctrl+C. To actually interact, use the Inspector:

npx @modelcontextprotocol/inspector uv run python -m toolhub.server

The Inspector opens a browser UI. Under Tools, call ping, and you get pong. Notice what you did not write: no JSON Schema, no request parsing, no validation. The type hints (-> str) are the schema, and the docstring is the tool description that the model reads.

8. Registering Tools

Tools are the model-callable actions. Keep them in their own module and register them against the shared mcp instance.

src/toolhub/tools/orders.py:

# src/toolhub/tools/orders.py
from pydantic import BaseModel, Field

class Order(BaseModel):
    order_id: str
    status: str
    total: float
    customer: str

# A tiny fake store so the example runs end to end.
_ORDERS = {
    "A100": Order(order_id="A100", status="shipped",   total=249.0, customer="Ravi"),
    "A101": Order(order_id="A101", status="processing", total=89.5,  customer="Meera"),
}

def register(mcp):
    @mcp.tool()
    def get_order(order_id: str) -> Order:
        """Look up an order by its ID and return status, total, and customer."""
        order = _ORDERS.get(order_id.upper())
        if order is None:
            raise ValueError(f"Order '{order_id}' not found")
        return order

    @mcp.tool()
    def list_orders(status: str | None = Field(
        default=None, description="Filter by status, e.g. 'shipped'"
    )) -> list[Order]:
        """List all orders, optionally filtered by status."""
        values = list(_ORDERS.values())
        if status:
            values = [o for o in values if o.status == status]
        return values

Wire it into server.py:

# src/toolhub/server.py
from mcp.server.fastmcp import FastMCP
from toolhub.tools import orders

mcp = FastMCP("toolhub")

@mcp.tool()
def ping() -> str:
    """Health check. Returns 'pong'."""
    return "pong"

orders.register(mcp)

if __name__ == "__main__":
    mcp.run()

Tool design best practices:

Write descriptions for the model, not for humans. The model picks tools based on the docstring. "Look up an order by its ID" beats "order fn."
Return typed objects (Pydantic models). The SDK emits structured output automatically, so clients get clean JSON, not stringified blobs.
One tool, one job. Don't build a do_everything(action, payload) dispatcher — the model can't reason about it.
Name with verbs: get_order, create_ticket, cancel_shipment.

Common mistake: returning giant payloads. Every tool result is fed back into the model's context and costs tokens. Return the fields the model needs, paginate the rest.

9. Resource APIs

Resources are read-only data addressed by URI. The host loads them into context; the model does not "call" them the way it calls tools. Use resources for config, catalogs, docs — stable reference data.

src/toolhub/resources/catalog.py:

# src/toolhub/resources/catalog.py
_CATALOG = {
    "SKU-1": {"name": "Wireless Mouse", "price": 999,  "stock": 42},
    "SKU-2": {"name": "Mechanical Keyboard", "price": 4999, "stock": 7},
}

def register(mcp):
    @mcp.resource("catalog://all")
    def all_products() -> dict:
        """The full product catalog."""
        return _CATALOG

    @mcp.resource("catalog://item/{sku}")
    def product(sku: str) -> dict:
        """A single product by SKU. Templated URI."""
        item = _CATALOG.get(sku.upper())
        if item is None:
            raise ValueError(f"SKU '{sku}' not found")
        return item

Two patterns are shown: a static resource (catalog://all) and a templated resource (catalog://item/{sku}) where {sku} is bound from the URI. Register it in server.py with catalog.register(mcp).

Tools vs Resources — the decision:

Question	Tool	Resource
Does it have side effects?	Yes	Never
Who decides to invoke it?	The model	The host/user
Is it an action or data?	Action	Data

10. Prompt Templates

Prompts are reusable, parameterized prompt snippets your users invoke by name. Great for standardizing team workflows ("summarize this order dispute", "generate a test plan").

src/toolhub/prompts/templates.py:

# src/toolhub/prompts/templates.py
def register(mcp):
    @mcp.prompt()
    def order_summary(order_id: str, tone: str = "concise") -> str:
        """Ask the model to summarize an order for a support agent."""
        return (
            f"Summarize order {order_id} for a support agent. "
            f"Use a {tone} tone. Call the get_order tool if you need details, "
            f"then state status, total, and any action the agent should take."
        )

    @mcp.prompt()
    def test_plan(feature: str) -> str:
        """Generate a QA test plan for a feature."""
        return (
            f"Write a test plan for the feature: '{feature}'. "
            f"Include positive cases, negative cases, boundary cases, "
            f"and one security consideration. Output as a Markdown table."
        )

Register with templates.register(mcp). In Claude Desktop these show up as slash-command-style prompts the user can pick.

Learn MCP Faster

If you'd like complete production-ready guides, interview questions, testing strategies, and hands-on MCP resources that go deeper than any single article can, they're worth a look:

HimanshuAI Playbook Store — https://himanshuai.gumroad.com/

Recommended: MCP Mastery Pack — https://himanshuai.gumroad.com/l/MCP-Mastery-Pack

The Mastery Pack bundles the patterns in this article into a working reference project plus an interview-prep set, so you can skip the trial-and-error and ship faster. Handy if you're preparing for a GenAI or SDET role where MCP now shows up in the loop.

11. Error Handling

Never let a raw stack trace cross the protocol boundary. Define your own exceptions and translate them into clean tool errors.

src/toolhub/errors.py:

# src/toolhub/errors.py
class ToolHubError(Exception):
    """Base class for known, user-safe errors."""

class NotFound(ToolHubError):
    pass

class Unauthorized(ToolHubError):
    pass

class BackendUnavailable(ToolHubError):
    pass

FastMCP catches exceptions raised inside a tool and returns them as an error result to the client. The important part is that you control the message. Raise your own typed errors with safe text:

# inside a tool
from toolhub.errors import NotFound

@mcp.tool()
def get_order(order_id: str) -> Order:
    """Look up an order by its ID."""
    order = _ORDERS.get(order_id.upper())
    if order is None:
        raise NotFound(f"No order with id {order_id}")
    return order

Rules:

Never leak secrets, SQL, file paths, or internal hostnames in error text.
Distinguish user errors (bad input → tell the model what to fix) from system errors (backend down → generic "temporarily unavailable").
Validate input at the boundary with Pydantic types; let the SDK reject malformed calls before your code runs.

Common mistake: except Exception: return "error". You lose all diagnostics. Log the full exception server-side, return a safe summary to the client.

12. Logging

There is one hard rule for stdio servers: never write logs to stdout. stdout is the JSON-RPC channel. A stray print() corrupts the protocol and the host disconnects. Log to stderr (or a file).

src/toolhub/logging_conf.py:

# src/toolhub/logging_conf.py
import logging
import sys
import structlog

def configure_logging(level: str = "INFO"):
    logging.basicConfig(
        format="%(message)s",
        stream=sys.stderr,          # critical: stderr, not stdout
        level=getattr(logging, level.upper(), logging.INFO),
    )
    structlog.configure(
        processors=[
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        logger_factory=structlog.stdlib.LoggerFactory(),
    )
    return structlog.get_logger()

Use it:

log = configure_logging()
log.info("tool.called", tool="get_order", order_id=order_id)

Structured JSON logs mean your monitoring stack (Loki, ELK, CloudWatch) can filter by tool, order_id, or level without regex gymnastics.

13. Authentication

stdio servers inherit the host's OS permissions, so auth there is mostly about what the server itself connects to (protect the DB creds). Remote Streamable HTTP servers are exposed on the network and must authenticate callers. The spec standardizes on OAuth 2.1 for production remote servers.

For a self-hosted service, the pragmatic path is a bearer token verified on every request, with OAuth 2.1 as the upgrade when you integrate an identity provider.

src/toolhub/auth.py:

# src/toolhub/auth.py
import hmac
from toolhub.errors import Unauthorized

def verify_token(provided: str | None, expected: str) -> None:
    """Constant-time bearer-token check for HTTP transport."""
    if not provided:
        raise Unauthorized("Missing bearer token")
    scheme, _, token = provided.partition(" ")
    if scheme.lower() != "bearer" or not hmac.compare_digest(token, expected):
        raise Unauthorized("Invalid bearer token")

Wire it as middleware when you run Streamable HTTP. FastMCP exposes a Starlette/ASGI app you can wrap:

# partial — shows the pattern, plug into your ASGI runner
from starlette.middleware.base import BaseHTTPMiddleware
from toolhub.auth import verify_token
from toolhub.config import settings

class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        if request.url.path.startswith("/mcp"):
            try:
                verify_token(request.headers.get("authorization"), settings.api_token)
            except Exception:
                from starlette.responses import JSONResponse
                return JSONResponse({"error": "unauthorized"}, status_code=401)
        return await call_next(request)

Auth best practices:

Use constant-time comparison (hmac.compare_digest) — never == on secrets.
Rotate tokens; keep them in a secrets manager, never in code or the repo.
For multi-tenant or public servers, graduate to OAuth 2.1 with short-lived access tokens.
Review any third-party stdio server's source before adding it — it runs with your permissions.

14. Environment Variables

Config comes from the environment, never hardcoded. Use a typed settings object.

src/toolhub/config.py:

# src/toolhub/config.py
import os
from dataclasses import dataclass
from dotenv import load_dotenv

load_dotenv()  # loads .env in development; no-op in prod if absent

@dataclass(frozen=True)
class Settings:
    api_token: str = os.getenv("TOOLHUB_API_TOKEN", "")
    database_url: str = os.getenv("DATABASE_URL", "")
    log_level: str = os.getenv("LOG_LEVEL", "INFO")
    http_host: str = os.getenv("HTTP_HOST", "127.0.0.1")
    http_port: int = int(os.getenv("HTTP_PORT", "8000"))

settings = Settings()

.env.example (commit this; never commit the real .env):

# .env.example
TOOLHUB_API_TOKEN=replace-me
DATABASE_URL=postgresql://user:pass@localhost:5432/toolhub
LOG_LEVEL=INFO
HTTP_HOST=0.0.0.0
HTTP_PORT=8000

.gitignore must include:

.env
.venv/
__pycache__/
*.pyc
.uv/

Common mistake: committing .env. Add it to .gitignore before your first commit. If it's already in history, rotate every secret it contained.

15. Dockerizing

A production server ships as an image. Use uv inside a slim base for reproducible builds.

Dockerfile:

# Dockerfile
FROM python:3.12-slim AS base

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app

# Dependency layer (cached unless lockfile changes)
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# App layer
COPY src ./src

# Run as non-root
RUN useradd -m appuser
USER appuser

EXPOSE 8000
CMD ["uv", "run", "python", "-m", "toolhub.server", "--http"]

.dockerignore:

.env
.venv/
.uv/
__pycache__/
tests/
.github/
*.md

Build and run:

docker build -t toolhub:latest .
docker run --rm -p 8000:8000 --env-file .env toolhub:latest

Best practices: pin the base image, run as non-root, split dependency and app layers for cache hits, and keep tests/ out of the image.

16. Local Testing

Add an entrypoint that switches transports, so the same image runs stdio locally and HTTP in production.

Update server.py:

# src/toolhub/server.py
import sys
from mcp.server.fastmcp import FastMCP
from toolhub.config import settings
from toolhub.logging_conf import configure_logging
from toolhub.tools import orders
from toolhub.resources import catalog
from toolhub.prompts import templates

log = configure_logging(settings.log_level)

mcp = FastMCP("toolhub", host=settings.http_host, port=settings.http_port)

@mcp.tool()
def ping() -> str:
    """Health check. Returns 'pong'."""
    return "pong"

orders.register(mcp)
catalog.register(mcp)
templates.register(mcp)

def main():
    if "--http" in sys.argv:
        log.info("server.start", transport="streamable-http",
                 port=settings.http_port)
        mcp.run(transport="streamable-http")
    else:
        log.info("server.start", transport="stdio")
        mcp.run()

if __name__ == "__main__":
    main()

Test with the Inspector against both transports:

# stdio
npx @modelcontextprotocol/inspector uv run python -m toolhub.server

# streamable http
uv run python -m toolhub.server --http
# then in another terminal:
npx @modelcontextprotocol/inspector
# connect to http://localhost:8000/mcp

Click through: list tools, call get_order with A100, read catalog://all, invoke the order_summary prompt. If all four work, the server is sound.

17. Claude Desktop Integration

Claude Desktop reads a JSON config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Add your server (stdio):

{
  "mcpServers": {
    "toolhub": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/toolhub", "run", "python", "-m", "toolhub.server"],
      "env": {
        "LOG_LEVEL": "INFO",
        "DATABASE_URL": "postgresql://user:pass@localhost:5432/toolhub"
      }
    }
  }
}

Use an absolute path for --directory. Fully quit and reopen Claude Desktop (not just close the window). Your tools appear under the tools icon. Ask "look up order A100" and Claude calls get_order.

Troubleshooting: if the server doesn't appear, check the logs at ~/Library/Logs/Claude/mcp*.log (macOS). The two usual culprits are a wrong path and a print() polluting stdout.

18. VS Code Integration

VS Code supports MCP servers through its agent tooling. Add a .vscode/mcp.json in your workspace:

{
  "servers": {
    "toolhub": {
      "type": "stdio",
      "command": "uv",
      "args": ["--directory", "${workspaceFolder}", "run", "python", "-m", "toolhub.server"],
      "env": { "LOG_LEVEL": "INFO" }
    }
  }
}

For a remote HTTP server:

{
  "servers": {
    "toolhub-remote": {
      "type": "http",
      "url": "https://toolhub.internal.example.com/mcp",
      "headers": { "Authorization": "Bearer ${env:TOOLHUB_API_TOKEN}" }
    }
  }
}

Reload the window (Cmd/Ctrl+Shift+P → Developer: Reload Window), open the agent view, and your tools are available. The ${workspaceFolder} and ${env:...} substitutions keep the config portable across machines.

19. Cursor Integration

Cursor uses a similar config. Create .cursor/mcp.json in the project (or ~/.cursor/mcp.json for global):

{
  "mcpServers": {
    "toolhub": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/toolhub", "run", "python", "-m", "toolhub.server"],
      "env": { "LOG_LEVEL": "INFO" }
    }
  }
}

Open Cursor Settings → MCP, confirm toolhub shows a green status, and the tools become available in the Composer/agent. Remote servers use a "url" field the same way VS Code does.

Cross-host tip: the stdio config is nearly identical across Claude Desktop, VS Code, and Cursor — command, args, env. Keep one canonical snippet in your README and adapt the wrapper key (mcpServers vs servers).

20. Debugging

Your debugging toolkit, in order of usefulness:

MCP Inspector — the fastest feedback loop. Lists tools/resources/prompts, shows raw JSON-RPC, and surfaces schema errors.
stderr logs — since stdout is reserved, your structured logs on stderr are the source of truth.
Host logs — Claude Desktop (~/Library/Logs/Claude/), VS Code output panel, Cursor MCP panel.

Common failure modes and fixes:

Symptom	Likely cause	Fix
Server won't connect	`print()` on stdout	Route all logging to stderr
"spawn ENOENT"	`command` not on PATH	Use absolute path to `uv`/`python`
Tools missing	Registration not called	Confirm `register(mcp)` runs at import
Schema error	Untyped/loose params	Add type hints; use Pydantic models
Works local, fails in host	Relative `--directory`	Use an absolute path
Auth 401 on HTTP	Token mismatch	Check header format `Bearer <token>`

To debug the raw protocol, run the server manually and watch stderr while the Inspector drives it.

21. Unit Testing

Because tools are plain typed functions, they're trivially unit-testable — no protocol needed. This is the biggest win for SDETs: your MCP logic is just Python.

tests/test_tools.py:

# tests/test_tools.py
import pytest
from toolhub.tools import orders
from toolhub.errors import NotFound
from mcp.server.fastmcp import FastMCP

@pytest.fixture
def mcp():
    m = FastMCP("test")
    orders.register(m)
    return m

def test_get_order_direct():
    # Call the underlying logic through the fake store
    order = orders._ORDERS["A100"]
    assert order.status == "shipped"
    assert order.total == 249.0

@pytest.mark.asyncio
async def test_get_order_via_tool(mcp):
    result = await mcp.call_tool("get_order", {"order_id": "A100"})
    # result is a tuple: (content_blocks, structured_output)
    _, structured = result
    assert structured["order_id"] == "A100"

@pytest.mark.asyncio
async def test_get_order_not_found(mcp):
    with pytest.raises(Exception):
        await mcp.call_tool("get_order", {"order_id": "ZZZ"})

Run:

uv run pytest -v

Test the error paths as hard as the happy paths — that's where production breaks. Assert on structured output, not on log text.

22. Production Deployment

For remote deployment, run the Streamable HTTP transport behind a reverse proxy (nginx, Caddy, or a cloud load balancer) that terminates TLS. Never expose the raw MCP port to the internet without TLS and auth.

A minimal production runtime with health checks:

docker run -d \
  --name toolhub \
  --restart unless-stopped \
  -p 8000:8000 \
  --env-file /etc/toolhub/.env \
  --memory 512m --cpus 1.0 \
  toolhub:latest

An nginx snippet fronting it (partial):

# /etc/nginx/conf.d/toolhub.conf
server {
    listen 443 ssl;
    server_name toolhub.internal.example.com;
    # ssl_certificate / ssl_certificate_key ...

    location /mcp {
        proxy_pass http://127.0.0.1:8000/mcp;
        proxy_http_version 1.1;
        proxy_set_header Connection "";      # keep streaming alive
        proxy_buffering off;                 # important for SSE-style streams
        proxy_read_timeout 300s;
    }
}

Deployment checklist: TLS everywhere, auth enforced, non-root container, resource limits set, --restart policy, and secrets from a manager — not the image.

23. CI/CD

Automate lint, type-check, test, and image build with GitHub Actions.

.github/workflows/ci.yml:

name: ci
on:
  push: { branches: [main] }
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v5
      - name: Sync deps
        run: uv sync --frozen
      - name: Lint
        run: uv run ruff check .
      - name: Type check
        run: uv run mypy src
      - name: Test
        run: uv run pytest -v

  docker:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t toolhub:${{ github.sha }} .
      # - name: Push to registry
      #   run: |  # add registry login + push here

Gate merges on this workflow. The docker job depends on test, so a red test never produces an image.

24. Monitoring

Expose health and metrics endpoints so your ops stack can watch the server. With the HTTP transport you can mount extra routes on the underlying ASGI app.

# partial — mounting health + metrics on the FastMCP ASGI app
from starlette.responses import PlainTextResponse, JSONResponse

app = mcp.streamable_http_app()  # underlying ASGI application

async def health(request):
    return JSONResponse({"status": "ok"})

_metrics = {"tool_calls": 0, "errors": 0}

async def metrics(request):
    lines = [f"toolhub_{k} {v}" for k, v in _metrics.items()]
    return PlainTextResponse("\n".join(lines) + "\n")

app.add_route("/healthz", health)
app.add_route("/metrics", metrics)   # Prometheus scrape target

Increment _metrics["tool_calls"] inside a shared wrapper and _metrics["errors"] in your error handler. Scrape /metrics with Prometheus, alert on error rate and latency, and hook /healthz into your load balancer.

Watch these signals: tool call rate, error rate, p95 latency per tool, and backend connection failures.

25. Performance Optimization

MCP servers are usually I/O bound (DB, HTTP), so async and pooling matter more than raw CPU.

Use async tools for I/O. Declare async def tools and await your backend calls so one slow query doesn't block others.
Pool connections. Reuse a single httpx.AsyncClient and a DB pool across calls instead of creating them per request.
Cache stable resources. Catalog and config rarely change — cache with a short TTL.
Keep results lean. Smaller tool outputs mean fewer tokens and faster round-trips.
Paginate. Never return unbounded lists; accept limit/offset.

Async backend pattern:

# src/toolhub/backends.py
import httpx

class RestBackend:
    def __init__(self, base_url: str):
        self._client = httpx.AsyncClient(base_url=base_url, timeout=10.0)

    async def get_json(self, path: str) -> dict:
        resp = await self._client.get(path)
        resp.raise_for_status()
        return resp.json()

    async def aclose(self):
        await self._client.aclose()

A well-tuned FastMCP server on modest hardware handles well over a thousand concurrent connections — but only if you don't block the event loop with synchronous I/O.

26. Security Checklist

[ ] All secrets from environment/secrets manager, never in code or git history
[ ] .env in .gitignore before the first commit
[ ] Constant-time token comparison; OAuth 2.1 for public remote servers
[ ] TLS on every remote endpoint
[ ] Input validated with Pydantic types at the boundary
[ ] SQL via parameterized queries only — never string interpolation
[ ] Error messages leak nothing sensitive (paths, SQL, hostnames)
[ ] Container runs as non-root with resource limits
[ ] Third-party stdio servers audited before install (they run with your permissions)
[ ] Least privilege: the server's DB/API creds can only do what its tools need
[ ] Rate limiting on public endpoints
[ ] Dependencies pinned and scanned (uv, Dependabot)

27. Common Interview Questions

Short, correct answers to what shows up in GenAI/SDET loops:

Q: What is MCP in one line?
An open JSON-RPC 2.0 protocol that standardizes how AI apps connect to external tools and data.

Q: Tools vs Resources vs Prompts?
Tools are model-controlled actions (side effects allowed); resources are app-controlled read-only data by URI; prompts are user-invoked templates.

Q: What transports exist?
stdio (local), Streamable HTTP (remote, preferred), and legacy SSE (deprecated).

Q: Why can't you print() in a stdio server?
stdout is the JSON-RPC channel; printing corrupts the protocol. Log to stderr.

Q: How does the model know which tool to use?
From the tool name, its typed schema, and the docstring description — write them for the model.

Q: How do you secure a remote MCP server?
TLS + auth (bearer token or OAuth 2.1), input validation, least-privilege backend creds, rate limiting.

Q: How do you test an MCP tool?
As a plain Python function — unit test the logic directly and via call_tool; no live host needed.

Q: What is the initialize handshake?
The capability negotiation the host and server perform on connection before exchanging tool/resource messages.

28. Real-World Production Use Cases

Support automation — order lookup, refunds, ticket creation exposed as tools so an agent resolves cases end to end.
Internal data access — a governed SQL tool that runs parameterized, allow-listed queries against a warehouse, giving analysts natural-language access without raw DB credentials.
Test generation & execution (SDET) — tools that generate test plans, trigger CI runs, and read results; resources that expose the current test suite.
DevOps copilots — tools to read logs, check deploy status, and roll back, with every destructive action gated behind confirmation.
Document/knowledge access — resources exposing internal docs and runbooks the model loads on demand instead of stuffing into a static prompt.

The pattern is constant: wrap an existing system in tools + resources, add auth and logging, ship it.

29. Complete GitHub Project Structure

toolhub/
├── pyproject.toml            # deps, pinned mcp>=1.27,<2
├── uv.lock                   # reproducible builds
├── .env.example              # documented, committed
├── .gitignore                # .env, .venv, __pycache__
├── Dockerfile                # non-root, layered, slim
├── .dockerignore
├── README.md                 # setup + host configs
├── src/toolhub/
│   ├── server.py             # FastMCP wiring, transport switch
│   ├── config.py             # typed env settings
│   ├── logging_conf.py       # structured logs → stderr
│   ├── auth.py               # token / OAuth verification
│   ├── errors.py             # typed, user-safe exceptions
│   ├── backends.py           # pooled async clients
│   ├── tools/orders.py       # model-callable actions
│   ├── resources/catalog.py  # URI-addressed read-only data
│   └── prompts/templates.py  # reusable prompt templates
├── tests/
│   ├── test_tools.py
│   ├── test_resources.py
│   └── test_integration.py
└── .github/workflows/ci.yml  # lint · type · test · build

pyproject.toml essentials (partial):

[project]
name = "toolhub"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "mcp[cli]>=1.27,<2",
    "pydantic",
    "python-dotenv",
    "structlog",
    "httpx",
]

[project.scripts]
toolhub = "toolhub.server:main"

30. Final Production Checklist

[ ] Server runs on both stdio and Streamable HTTP
[ ] Tools return typed Pydantic models with model-facing docstrings
[ ] Resources are read-only and URI-addressed
[ ] Prompts registered and invocable from hosts
[ ] All logging goes to stderr; stdout is protocol-only
[ ] Typed exceptions; no sensitive data in error messages
[ ] Config fully env-driven; .env git-ignored; .env.example committed
[ ] Auth enforced on HTTP (bearer/OAuth 2.1), TLS terminated upstream
[ ] Dockerized: non-root, pinned base, layered, resource-limited
[ ] Unit + integration tests green; error paths covered
[ ] CI runs lint + type-check + tests and gates the image build
[ ] /healthz and /metrics exposed and scraped
[ ] Async I/O with pooled backend clients; results paginated
[ ] Security checklist complete
[ ] Verified live in Claude Desktop, VS Code, and Cursor

Conclusion & Next Steps

You started with a 20-line ping and ended with a Dockerized, authenticated, tested, monitored MCP server wired into three hosts with a CI/CD pipeline behind it. The core loop never changed: type your functions, describe them for the model, wrap your backends, add auth and logging, ship.

Your next steps:

Replace the fake stores in orders.py and catalog.py with your real Postgres/REST backend using the async pattern from Section 25.
Turn on OAuth 2.1 once you have an identity provider, graduating off the static bearer token.
Add a metrics wrapper around every tool so /metrics reflects real traffic, then wire Prometheus alerts.
Publish the image to your registry from the CI docker job and deploy behind TLS.
Write one more tool a week — the fastest way to internalize MCP is to keep wrapping systems your team already uses.

MCP is becoming the default integration layer for agentic AI. A server you can build, test, and operate end to end is a genuinely marketable skill in 2026 — now you have one.

Top comments (4)

Alex Shev • Jul 1

The useful production step for MCP servers is usually not the first tool definition, it is the boring contract around it: input schema, auth, timeout behavior, error text, and a verification command. Once those are explicit, agents call the tool much more predictably.

Himanshu Agarwal • Jul 2

Thanks, Alex! You're absolutely right. It's easy to focus on getting the first tool working, but the production-ready pieces are what really matter. I'll definitely cover those topics in the upcoming parts of the series. Appreciate you taking the time to share this. @alexshev

Alex Shev • Jul 2

Exactly. The first working call is a milestone, but production starts when failures are boring: typed errors, auth boundaries, retries, timeouts, and logs a teammate can actually debug later.

Some comments may only be visible to logged-in visitors. Sign in to view all comments.