Introduction
This is a build guide, not a lecture. By the end you will have a Model Context Protocol (MCP) server that starts as a 20-line script and ends as a Dockerized, authenticated, tested, monitored service wired into Claude Desktop, VS Code, and Cursor — plus a CI/CD pipeline that ships it.
Every command here runs. Every code block is executable or explicitly marked partial. If you are an SDET, automation engineer, backend developer, or AI engineer who wants MCP fluency without wading through theory, this is written for you.
We build in Python using the official MCP SDK (mcp) and its high-level FastMCP API. The same concepts map cleanly to the TypeScript SDK, and I flag the differences where they matter.
What You Will Build
A production MCP server called toolhub that exposes:
- Tools — an order-lookup tool and a SQL-safe query tool that hit a real backend.
- Resources — read-only data endpoints (config, catalog) addressed by URI.
- Prompts — reusable templated prompts your host app can pull on demand.
Layered on top: structured logging, token authentication, environment-based config, Docker packaging, unit and integration tests, a GitHub Actions pipeline, health checks, and a Prometheus metrics endpoint.
Final Architecture
┌──────────────────────────────────────┐
│              MCP HOSTS               │
│  Claude Desktop │ VS Code │ Cursor   │
└───────┬───────────┬───────────┬──────┘
        │ stdio     │ HTTP      │ HTTP
        │           │           │
┌───────▼───────────▼───────────▼──────┐
│           Transport Layer            │
│   stdio | Streamable HTTP (+OAuth)   │
└──────────────────┬───────────────────┘
                   │ JSON-RPC 2.0
┌──────────────────▼───────────────────┐
│          toolhub MCP SERVER          │
│                                      │
│  ┌─────────┐ ┌──────────┐ ┌────────┐ │
│  │  Tools  │ │Resources │ │Prompts │ │
│  └────┬────┘ └────┬─────┘ └───┬────┘ │
│       │           │           │      │
│  ┌────▼───────────▼───────────▼────┐ │
│  │ Auth · Logging · Error Handler  │ │
│  └────────────────┬────────────────┘ │
└───────────────────┼──────────────────┘
                    │
┌───────────────────▼───────────────────┐
│ Backends: Postgres · REST API · Cache │
└───────────────────────────────────────┘
1. What Is an MCP Server (the 2-minute version)
MCP is an open protocol that standardizes how AI applications connect to external tools and data. Think of it as a USB-C port for LLMs: one connector, many devices. It is built on JSON-RPC 2.0 and was open-sourced by Anthropic in November 2024. It has since been adopted well beyond Anthropic, with support from OpenAI, Google, Microsoft, and AWS.
There are three roles:
| Role | What it is | Example |
|---|---|---|
| Host | The AI app the user talks to | Claude Desktop, Cursor, VS Code |
| Client | A session inside the host, one per server | Managed automatically |
| Server | Your code exposing capabilities | toolhub (what we build) |
A server exposes exactly three primitives:
- Tools — functions the model can call (side effects allowed). Model-controlled.
- Resources — read-only data the host can load into context, addressed by URI. App-controlled.
- Prompts — reusable prompt templates the user can invoke. User-controlled.
That distinction matters. Tools are for actions, resources are for context, prompts are for shortcuts. Mixing them up is the most common design mistake. Move on.
2. Architecture
MCP is client-server over JSON-RPC 2.0. The host spawns or connects to your server, negotiates capabilities during an initialize handshake, then exchanges typed messages.
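To make the handshake concrete, here is a minimal sketch of the message shapes as plain Python dictionaries. The envelope follows JSON-RPC 2.0 and the field names follow the MCP initialize request as I understand it; the protocol version string and the client and server names are illustrative placeholders, so check the spec revision your SDK targets.

```python
import json

# Illustrative values only: the protocol version and names are placeholders.
PROTOCOL_VERSION = "2025-03-26"


def make_initialize_request(request_id: int) -> dict:
    """Build the first message a client sends after connecting."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": "example-host", "version": "0.0.1"},
        },
    }


def make_initialize_result(request: dict) -> dict:
    """Build the server's reply; the id must echo the request id."""
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {
            "protocolVersion": request["params"]["protocolVersion"],
            "capabilities": {"tools": {}, "resources": {}, "prompts": {}},
            "serverInfo": {"name": "toolhub", "version": "0.1.0"},
        },
    }


request = make_initialize_request(1)
response = make_initialize_result(request)
# On the stdio transport each message travels as a single line of JSON.
wire_line = json.dumps(request)
```

The SDK builds and parses these messages for you; the sketch is only meant to show what you will see in the Inspector's raw JSON-RPC view.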
Transports decide how bytes move:
| Transport | Use case | Notes |
|---|---|---|
| stdio | Local server, same machine | Host spawns your process, talks over stdin/stdout. Fast, simple, no network. |
| Streamable HTTP | Remote / networked server | Plain HTTP POST requests, with optional Server-Sent Events streaming for responses. Preferred for production remote deployments. Pairs with OAuth 2.1. |
| SSE | Legacy remote | Deprecated by the spec; still supported but being phased out. Do not build new servers on it. |
Rule of thumb: stdio for local desktop integrations, Streamable HTTP for anything remote or shared. We build both, because a real server should support each depending on where it runs.
3. Prerequisites
You need:
- Python 3.10+ (3.12 recommended)
- uv — the fast Python package/project manager (pip works too, uv is smoother)
- Docker — for packaging
- Node.js 18+ — only for the MCP Inspector and TypeScript examples
- An MCP host to test against: Claude Desktop, VS Code, or Cursor
Verify what you have:
python3 --version # 3.10 or higher
docker --version
node --version
4. Installing Everything
Install uv:
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
Fetch the MCP Inspector (a browser tool for poking your server; invaluable). npx downloads it on first use, so there is nothing to install permanently:
npx @modelcontextprotocol/inspector --help
We install the SDK itself inside the project in the next step, so it lands in an isolated environment.
5. Project Setup
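Create and initialize the project with uv. The --package flag sets up a src/ layout and an installable package, which the python -m toolhub.server commands in later steps rely on:
uv init --package toolhub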
cd toolhub
Add dependencies. Pin the SDK below v2 — v1.x is the stable line recommended for production, and pip/uv won't auto-select the v2 pre-release, but pinning protects you when v2 goes stable:
uv add "mcp[cli]>=1.27,<2"
uv add pydantic python-dotenv structlog httpx
uv add --dev pytest pytest-asyncio ruff mypy
Quick sanity check:
uv run python -c "import mcp; print('mcp ok')"
Common mistake: installing mcp globally with plain pip. Different hosts spawn your server with different interpreters; a global install means "works on my machine" and nowhere else. Always keep it project-local.
6. Folder Structure
Here is the layout we grow into. Set it up now so nothing feels bolted-on later:
toolhub/
├── pyproject.toml
├── uv.lock
├── .env.example
├── .gitignore
├── Dockerfile
├── .dockerignore
├── README.md
├── src/
│   └── toolhub/
│       ├── __init__.py
│       ├── server.py          # FastMCP instance + wiring
│       ├── config.py          # env-driven settings
│       ├── logging_conf.py    # structured logging
│       ├── auth.py            # token / OAuth verification
│       ├── errors.py          # custom exceptions
│       ├── backends.py        # DB / REST clients
│       ├── tools/
│       │   ├── __init__.py
│       │   └── orders.py
│       ├── resources/
│       │   ├── __init__.py
│       │   └── catalog.py
│       └── prompts/
│           ├── __init__.py
│           └── templates.py
├── tests/
│   ├── test_tools.py
│   ├── test_resources.py
│   └── test_integration.py
└── .github/
    └── workflows/
        └── ci.yml
Create the skeleton:
mkdir -p src/toolhub/{tools,resources,prompts} tests .github/workflows
touch src/toolhub/{__init__.py,server.py,config.py,logging_conf.py,auth.py,errors.py,backends.py}
touch src/toolhub/tools/{__init__.py,orders.py}
touch src/toolhub/resources/{__init__.py,catalog.py}
touch src/toolhub/prompts/{__init__.py,templates.py}
7. Writing the First MCP Server
Start minimal. Put this in src/toolhub/server.py:
# src/toolhub/server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("toolhub")


@mcp.tool()
def ping() -> str:
    """Health check. Returns 'pong'."""
    return "pong"


if __name__ == "__main__":
    mcp.run()  # defaults to stdio
Run it:
uv run python -m toolhub.server
It sits waiting on stdin — that's correct for stdio. Kill it with Ctrl+C. To actually interact, use the Inspector:
npx @modelcontextprotocol/inspector uv run python -m toolhub.server
The Inspector opens a browser UI. Under Tools, call ping, and you get pong. Notice what you did not write: no JSON Schema, no request parsing, no validation. The type hints (-> str) are the schema, and the docstring is the tool description that the model reads.
8. Registering Tools
Tools are the model-callable actions. Keep them in their own module and register them against the shared mcp instance.
src/toolhub/tools/orders.py:
# src/toolhub/tools/orders.py
from pydantic import BaseModel, Field


class Order(BaseModel):
    order_id: str
    status: str
    total: float
    customer: str


# A tiny fake store so the example runs end to end.
_ORDERS = {
    "A100": Order(order_id="A100", status="shipped", total=249.0, customer="Ravi"),
    "A101": Order(order_id="A101", status="processing", total=89.5, customer="Meera"),
}


def register(mcp):
    @mcp.tool()
    def get_order(order_id: str) -> Order:
        """Look up an order by its ID and return status, total, and customer."""
        order = _ORDERS.get(order_id.upper())
        if order is None:
            raise ValueError(f"Order '{order_id}' not found")
        return order

    @mcp.tool()
    def list_orders(status: str | None = Field(
        default=None, description="Filter by status, e.g. 'shipped'"
    )) -> list[Order]:
        """List all orders, optionally filtered by status."""
        values = list(_ORDERS.values())
        if status:
            values = [o for o in values if o.status == status]
        return values
Wire it into server.py:
# src/toolhub/server.py
from mcp.server.fastmcp import FastMCP

from toolhub.tools import orders

mcp = FastMCP("toolhub")


@mcp.tool()
def ping() -> str:
    """Health check. Returns 'pong'."""
    return "pong"


orders.register(mcp)

if __name__ == "__main__":
    mcp.run()
Tool design best practices:
- Write descriptions for the model, not for humans. The model picks tools based on the docstring. "Look up an order by its ID" beats "order fn."
- Return typed objects (Pydantic models). The SDK emits structured output automatically, so clients get clean JSON, not stringified blobs.
- One tool, one job. Don't build a do_everything(action, payload) dispatcher; the model can't reason about it.
- Name with verbs: get_order, create_ticket, cancel_shipment.
Common mistake: returning giant payloads. Every tool result is fed back into the model's context and costs tokens. Return the fields the model needs, paginate the rest.
9. Resource APIs
Resources are read-only data addressed by URI. The host loads them into context; the model does not "call" them the way it calls tools. Use resources for config, catalogs, docs — stable reference data.
src/toolhub/resources/catalog.py:
# src/toolhub/resources/catalog.py
_CATALOG = {
    "SKU-1": {"name": "Wireless Mouse", "price": 999, "stock": 42},
    "SKU-2": {"name": "Mechanical Keyboard", "price": 4999, "stock": 7},
}


def register(mcp):
    @mcp.resource("catalog://all")
    def all_products() -> dict:
        """The full product catalog."""
        return _CATALOG

    @mcp.resource("catalog://item/{sku}")
    def product(sku: str) -> dict:
        """A single product by SKU. Templated URI."""
        item = _CATALOG.get(sku.upper())
        if item is None:
            raise ValueError(f"SKU '{sku}' not found")
        return item
Two patterns are shown: a static resource (catalog://all) and a templated resource (catalog://item/{sku}) where {sku} is bound from the URI. Register it in server.py with catalog.register(mcp).
Tools vs Resources — the decision:
| Question | Tool | Resource |
|---|---|---|
| Does it have side effects? | Yes | Never |
| Who decides to invoke it? | The model | The host/user |
| Is it an action or data? | Action | Data |
10. Prompt Templates
Prompts are reusable, parameterized prompt snippets your users invoke by name. Great for standardizing team workflows ("summarize this order dispute", "generate a test plan").
src/toolhub/prompts/templates.py:
# src/toolhub/prompts/templates.py
def register(mcp):
    @mcp.prompt()
    def order_summary(order_id: str, tone: str = "concise") -> str:
        """Ask the model to summarize an order for a support agent."""
        return (
            f"Summarize order {order_id} for a support agent. "
            f"Use a {tone} tone. Call the get_order tool if you need details, "
            f"then state status, total, and any action the agent should take."
        )

    @mcp.prompt()
    def test_plan(feature: str) -> str:
        """Generate a QA test plan for a feature."""
        return (
            f"Write a test plan for the feature: '{feature}'. "
            f"Include positive cases, negative cases, boundary cases, "
            f"and one security consideration. Output as a Markdown table."
        )
Register with templates.register(mcp). In Claude Desktop these show up as slash-command-style prompts the user can pick.
11. Error Handling
Never let a raw stack trace cross the protocol boundary. Define your own exceptions and translate them into clean tool errors.
src/toolhub/errors.py:
# src/toolhub/errors.py
class ToolHubError(Exception):
    """Base class for known, user-safe errors."""


class NotFound(ToolHubError):
    pass


class Unauthorized(ToolHubError):
    pass


class BackendUnavailable(ToolHubError):
    pass
FastMCP catches exceptions raised inside a tool and returns them as an error result to the client. The important part is that you control the message. Raise your own typed errors with safe text:
# inside a tool
from toolhub.errors import NotFound


@mcp.tool()
def get_order(order_id: str) -> Order:
    """Look up an order by its ID."""
    order = _ORDERS.get(order_id.upper())
    if order is None:
        raise NotFound(f"No order with id {order_id}")
    return order
Rules:
- Never leak secrets, SQL, file paths, or internal hostnames in error text.
- Distinguish user errors (bad input → tell the model what to fix) from system errors (backend down → generic "temporarily unavailable").
- Validate input at the boundary with Pydantic types; let the SDK reject malformed calls before your code runs.
Common mistake: except Exception: return "error". You lose all diagnostics. Log the full exception server-side, return a safe summary to the client.
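The "log everything, return a safe summary" rule can be sketched as a decorator. This is one possible shape under stated assumptions: safe_errors and message_for are hypothetical names, the exception classes mirror errors.py, and the wrapper only handles synchronous functions.

```python
import functools
import logging
import sys

logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("toolhub.errors")


class ToolHubError(Exception):
    """Known, user-safe error (mirrors errors.py)."""


class NotFound(ToolHubError):
    pass


def safe_errors(func):
    """Hypothetical decorator: pass safe errors through, mask everything else."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ToolHubError:
            raise  # this message was written to be shown to the model
        except Exception:
            # The full traceback stays server-side, on stderr.
            log.exception("tool.failed name=%s", func.__name__)
            raise ToolHubError("Temporarily unavailable, please retry later") from None

    return wrapper


@safe_errors
def lookup(order_id: str) -> str:
    if order_id == "boom":
        raise ConnectionError("db-internal-01:5432 refused connection")
    if order_id != "A100":
        raise NotFound(f"No order with id {order_id}")
    return "shipped"


def message_for(order_id: str) -> str:
    """Return the text a client would see for a given input."""
    try:
        return lookup(order_id)
    except ToolHubError as exc:
        return str(exc)
```

The user error keeps its specific message so the model can correct its input, while the backend failure collapses to a generic message and the internal hostname never leaves the server.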
12. Logging
There is one hard rule for stdio servers: never write logs to stdout. stdout is the JSON-RPC channel. A stray print() corrupts the protocol and the host disconnects. Log to stderr (or a file).
src/toolhub/logging_conf.py:
# src/toolhub/logging_conf.py
import logging
import sys

import structlog


def configure_logging(level: str = "INFO"):
    logging.basicConfig(
        format="%(message)s",
        stream=sys.stderr,  # critical: stderr, not stdout
        level=getattr(logging, level.upper(), logging.INFO),
    )
    structlog.configure(
        processors=[
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        logger_factory=structlog.stdlib.LoggerFactory(),
    )
    return structlog.get_logger()
Use it:
log = configure_logging()
log.info("tool.called", tool="get_order", order_id=order_id)
Structured JSON logs mean your monitoring stack (Loki, ELK, CloudWatch) can filter by tool, order_id, or level without regex gymnastics.
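Because a single stray print() is enough to break a stdio server, it can be worth guarding the rule with a test. The sketch below uses only the standard library; stdout_written_by is a hypothetical helper, and the two demo functions stand in for real tools.

```python
import contextlib
import io
import logging
import sys


def stdout_written_by(func) -> str:
    """Run func and return whatever it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


def well_behaved_tool() -> None:
    # A handler bound to stderr keeps diagnostics off the protocol channel.
    logger = logging.getLogger("toolhub.demo")
    logger.propagate = False  # do not forward to any root handlers
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stderr))
    logger.warning("tool.called")


def noisy_tool() -> None:
    print("debugging...")  # this is the bug the rule warns about


clean = stdout_written_by(well_behaved_tool)
dirty = stdout_written_by(noisy_tool)
```

In a real suite you would wrap each tool's underlying function the same way and assert that the captured string is empty.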
13. Authentication
stdio servers inherit the host's OS permissions, so auth there is mostly about what the server itself connects to (protect the DB creds). Remote Streamable HTTP servers are exposed on the network and must authenticate callers. The spec standardizes on OAuth 2.1 for production remote servers.
For a self-hosted service, the pragmatic path is a bearer token verified on every request, with OAuth 2.1 as the upgrade when you integrate an identity provider.
src/toolhub/auth.py:
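# src/toolhub/auth.py
import hmac

from toolhub.errors import Unauthorized


def verify_token(provided: str | None, expected: str) -> None:
    """Constant-time bearer-token check for HTTP transport."""
    if not expected:
        # An unset TOOLHUB_API_TOKEN must never match an empty bearer token.
        raise Unauthorized("Server has no API token configured")
    if not provided:
        raise Unauthorized("Missing bearer token")
    scheme, _, token = provided.partition(" ")
    # Compare bytes: compare_digest rejects non-ASCII str arguments.
    token_ok = hmac.compare_digest(token.encode(), expected.encode())
    if scheme.lower() != "bearer" or not token_ok:
        raise Unauthorized("Invalid bearer token")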
Wire it as middleware when you run Streamable HTTP. FastMCP exposes a Starlette/ASGI app you can wrap:
# partial: shows the pattern, plug into your ASGI runner
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse

from toolhub.auth import verify_token
from toolhub.config import settings
from toolhub.errors import Unauthorized


class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Only the MCP endpoint is protected; other routes pass through.
        if request.url.path.startswith("/mcp"):
            try:
                verify_token(request.headers.get("authorization"), settings.api_token)
            except Unauthorized:
                return JSONResponse({"error": "unauthorized"}, status_code=401)
        return await call_next(request)
Auth best practices:
- Use constant-time comparison (hmac.compare_digest), never == on secrets.
- Rotate tokens; keep them in a secrets manager, never in code or the repo.
- For multi-tenant or public servers, graduate to OAuth 2.1 with short-lived access tokens.
- Review any third-party stdio server's source before adding it — it runs with your permissions.
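Rotation is easier when the server can accept the old and the new token for a short overlap window. The following is a minimal sketch of that idea, assuming the accepted tokens arrive as a list from your secrets manager; token_is_valid and ACCEPTED are hypothetical names.

```python
import hmac


def token_is_valid(presented: str, accepted: list[str]) -> bool:
    """Check a token against every accepted secret without short-circuiting."""
    presented_bytes = presented.encode("utf-8")
    ok = False
    for secret in accepted:
        if not secret:
            continue  # an empty configured secret must never match
        # compare_digest runs for every secret, so a match does not end the loop early.
        ok = hmac.compare_digest(presented_bytes, secret.encode("utf-8")) or ok
    return ok


# Hypothetical rotation window: the new token first, the old one until clients switch.
ACCEPTED = ["new-token-2", "old-token-1"]
```

Once every client has moved to the new token, drop the old entry from the list and the overlap window closes.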
14. Environment Variables
Config comes from the environment, never hardcoded. Use a typed settings object.
src/toolhub/config.py:
# src/toolhub/config.py
import os
from dataclasses import dataclass

from dotenv import load_dotenv

load_dotenv()  # loads .env in development; no-op in prod if absent


@dataclass(frozen=True)
class Settings:
    api_token: str = os.getenv("TOOLHUB_API_TOKEN", "")
    database_url: str = os.getenv("DATABASE_URL", "")
    log_level: str = os.getenv("LOG_LEVEL", "INFO")
    http_host: str = os.getenv("HTTP_HOST", "127.0.0.1")
    http_port: int = int(os.getenv("HTTP_PORT", "8000"))


settings = Settings()
.env.example (commit this; never commit the real .env):
# .env.example
TOOLHUB_API_TOKEN=replace-me
DATABASE_URL=postgresql://user:pass@localhost:5432/toolhub
LOG_LEVEL=INFO
HTTP_HOST=0.0.0.0
HTTP_PORT=8000
.gitignore must include:
.env
.venv/
__pycache__/
*.pyc
.uv/
Common mistake: committing .env. Add it to .gitignore before your first commit. If it's already in history, rotate every secret it contained.
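The Settings class in this section falls back to empty strings, which means a missing token only surfaces at request time. A fail-fast loader is one way to catch that at startup instead. The sketch below is an assumption about how you might add it; ConfigError, HttpSettings, and load_http_settings are hypothetical names, and the loader takes a plain mapping so it can be tested without touching the real environment.

```python
from dataclasses import dataclass


class ConfigError(RuntimeError):
    """Raised at startup when required configuration is missing or malformed."""


@dataclass(frozen=True)
class HttpSettings:
    api_token: str
    http_host: str
    http_port: int


def load_http_settings(env: dict[str, str]) -> HttpSettings:
    """Build settings from a mapping such as os.environ, failing fast on gaps."""
    token = env.get("TOOLHUB_API_TOKEN", "")
    if not token or token == "replace-me":
        raise ConfigError("TOOLHUB_API_TOKEN must be set to a real secret")
    raw_port = env.get("HTTP_PORT", "8000")
    if not raw_port.isdecimal() or not 0 < int(raw_port) < 65536:
        raise ConfigError(f"HTTP_PORT must be 1-65535, got {raw_port!r}")
    return HttpSettings(
        api_token=token,
        http_host=env.get("HTTP_HOST", "127.0.0.1"),
        http_port=int(raw_port),
    )


good = load_http_settings({"TOOLHUB_API_TOKEN": "s3cret", "HTTP_PORT": "9000"})
```

In the server you would call load_http_settings(os.environ) only on the HTTP path, since a stdio server has no use for the API token.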
15. Dockerizing
A production server ships as an image. Use uv inside a slim base for reproducible builds.
Dockerfile:
# Dockerfile
FROM python:3.12-slim AS base
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
# Dependency layer (cached unless lockfile changes)
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev
# App layer
COPY src ./src
# Run as non-root
RUN useradd -m appuser
USER appuser
EXPOSE 8000
CMD ["uv", "run", "python", "-m", "toolhub.server", "--http"]
.dockerignore:
.env
.venv/
.uv/
__pycache__/
tests/
.github/
*.md
Build and run:
docker build -t toolhub:latest .
docker run --rm -p 8000:8000 --env-file .env toolhub:latest
Best practices: pin the base image (and ideally the uv image too, rather than :latest), run as non-root, split dependency and app layers for cache hits, and keep tests/ out of the image.
16. Local Testing
Add an entrypoint that switches transports, so the same image runs stdio locally and HTTP in production.
Update server.py:
# src/toolhub/server.py
import sys

from mcp.server.fastmcp import FastMCP

from toolhub.config import settings
from toolhub.logging_conf import configure_logging
from toolhub.tools import orders
from toolhub.resources import catalog
from toolhub.prompts import templates

log = configure_logging(settings.log_level)
mcp = FastMCP("toolhub", host=settings.http_host, port=settings.http_port)


@mcp.tool()
def ping() -> str:
    """Health check. Returns 'pong'."""
    return "pong"


orders.register(mcp)
catalog.register(mcp)
templates.register(mcp)


def main():
    if "--http" in sys.argv:
        log.info("server.start", transport="streamable-http",
                 port=settings.http_port)
        mcp.run(transport="streamable-http")
    else:
        log.info("server.start", transport="stdio")
        mcp.run()


if __name__ == "__main__":
    main()
Test with the Inspector against both transports:
# stdio
npx @modelcontextprotocol/inspector uv run python -m toolhub.server
# streamable http
uv run python -m toolhub.server --http
# then in another terminal:
npx @modelcontextprotocol/inspector
# connect to http://localhost:8000/mcp
Click through: list tools, call get_order with A100, read catalog://all, invoke the order_summary prompt. If all four work, the server is sound.
17. Claude Desktop Integration
Claude Desktop reads a JSON config file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
Add your server (stdio):
{
"mcpServers": {
"toolhub": {
"command": "uv",
"args": ["--directory", "/absolute/path/to/toolhub", "run", "python", "-m", "toolhub.server"],
"env": {
"LOG_LEVEL": "INFO",
"DATABASE_URL": "postgresql://user:pass@localhost:5432/toolhub"
}
}
}
}
Use an absolute path for --directory. Fully quit and reopen Claude Desktop (not just close the window). Your tools appear under the tools icon. Ask "look up order A100" and Claude calls get_order.
Troubleshooting: if the server doesn't appear, check the logs at ~/Library/Logs/Claude/mcp*.log (macOS). The two usual culprits are a wrong path and a print() polluting stdout.
18. VS Code Integration
VS Code supports MCP servers through its agent tooling. Add a .vscode/mcp.json in your workspace:
{
"servers": {
"toolhub": {
"type": "stdio",
"command": "uv",
"args": ["--directory", "${workspaceFolder}", "run", "python", "-m", "toolhub.server"],
"env": { "LOG_LEVEL": "INFO" }
}
}
}
For a remote HTTP server:
{
"servers": {
"toolhub-remote": {
"type": "http",
"url": "https://toolhub.internal.example.com/mcp",
"headers": { "Authorization": "Bearer ${env:TOOLHUB_API_TOKEN}" }
}
}
}
Reload the window (Cmd/Ctrl+Shift+P → Developer: Reload Window), open the agent view, and your tools are available. The ${workspaceFolder} and ${env:...} substitutions keep the config portable across machines.
19. Cursor Integration
Cursor uses a similar config. Create .cursor/mcp.json in the project (or ~/.cursor/mcp.json for global):
{
"mcpServers": {
"toolhub": {
"command": "uv",
"args": ["--directory", "/absolute/path/to/toolhub", "run", "python", "-m", "toolhub.server"],
"env": { "LOG_LEVEL": "INFO" }
}
}
}
Open Cursor Settings → MCP, confirm toolhub shows a green status, and the tools become available in the Composer/agent. Remote servers use a "url" field the same way VS Code does.
Cross-host tip: the stdio config is nearly identical across Claude Desktop, VS Code, and Cursor: command, args, env. Keep one canonical snippet in your README and adapt the wrapper key (mcpServers vs servers).
20. Debugging
Your debugging toolkit, in order of usefulness:
- MCP Inspector: the fastest feedback loop. Lists tools/resources/prompts, shows raw JSON-RPC, and surfaces schema errors.
- stderr logs: since stdout is reserved, your structured logs on stderr are the source of truth.
- Host logs: Claude Desktop (~/Library/Logs/Claude/), VS Code output panel, Cursor MCP panel.
Common failure modes and fixes:
| Symptom | Likely cause | Fix |
|---|---|---|
| Server won't connect | print() on stdout | Route all logging to stderr |
| "spawn ENOENT" | command not on PATH | Use absolute path to uv/python |
| Tools missing | Registration not called | Confirm register(mcp) runs at import |
| Schema error | Untyped/loose params | Add type hints; use Pydantic models |
| Works local, fails in host | Relative --directory | Use an absolute path |
| Auth 401 on HTTP | Token mismatch | Check header format Bearer <token> |
To debug the raw protocol, run the server manually and watch stderr while the Inspector drives it.
21. Unit Testing
Because tools are plain typed functions, they're trivially unit-testable — no protocol needed. This is the biggest win for SDETs: your MCP logic is just Python.
tests/test_tools.py:
# tests/test_tools.py
import pytest
from mcp.server.fastmcp import FastMCP
from mcp.server.fastmcp.exceptions import ToolError

from toolhub.tools import orders


@pytest.fixture
def mcp():
    m = FastMCP("test")
    orders.register(m)
    return m


def test_get_order_direct():
    # Checks the fake store directly, without going through the tool
    order = orders._ORDERS["A100"]
    assert order.status == "shipped"
    assert order.total == 249.0


@pytest.mark.asyncio
async def test_get_order_via_tool(mcp):
    result = await mcp.call_tool("get_order", {"order_id": "A100"})
    # In recent 1.x releases the result is a tuple: (content_blocks, structured_output)
    _, structured = result
    assert structured["order_id"] == "A100"


@pytest.mark.asyncio
async def test_get_order_not_found(mcp):
    # FastMCP wraps exceptions raised inside a tool in ToolError
    with pytest.raises(ToolError):
        await mcp.call_tool("get_order", {"order_id": "ZZZ"})
Run:
uv run pytest -v
Test the error paths as hard as the happy paths — that's where production breaks. Assert on structured output, not on log text.
22. Production Deployment
For remote deployment, run the Streamable HTTP transport behind a reverse proxy (nginx, Caddy, or a cloud load balancer) that terminates TLS. Never expose the raw MCP port to the internet without TLS and auth.
A minimal production runtime with health checks:
docker run -d \
--name toolhub \
--restart unless-stopped \
-p 8000:8000 \
--env-file /etc/toolhub/.env \
--memory 512m --cpus 1.0 \
toolhub:latest
An nginx snippet fronting it (partial):
# /etc/nginx/conf.d/toolhub.conf
server {
listen 443 ssl;
server_name toolhub.internal.example.com;
# ssl_certificate / ssl_certificate_key ...
location /mcp {
proxy_pass http://127.0.0.1:8000/mcp;
proxy_http_version 1.1;
proxy_set_header Connection ""; # keep streaming alive
proxy_buffering off; # important for SSE-style streams
proxy_read_timeout 300s;
}
}
Deployment checklist: TLS everywhere, auth enforced, non-root container, resource limits set, --restart policy, and secrets from a manager — not the image.
23. CI/CD
Automate lint, type-check, test, and image build with GitHub Actions.
.github/workflows/ci.yml:
name: ci
on:
  push: { branches: [main] }
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v5
      - name: Sync deps
        run: uv sync --frozen
      - name: Lint
        run: uv run ruff check .
      - name: Type check
        run: uv run mypy src
      - name: Test
        run: uv run pytest -v
  docker:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t toolhub:${{ github.sha }} .
      # - name: Push to registry
      #   run: |  # add registry login + push here
Gate merges on this workflow. The docker job depends on test, so a red test never produces an image.
24. Monitoring
Expose health and metrics endpoints so your ops stack can watch the server. With the HTTP transport you can mount extra routes on the underlying ASGI app.
# partial: mounting health + metrics on the FastMCP ASGI app
from starlette.responses import PlainTextResponse, JSONResponse

app = mcp.streamable_http_app()  # underlying ASGI application


async def health(request):
    return JSONResponse({"status": "ok"})


_metrics = {"tool_calls": 0, "errors": 0}


async def metrics(request):
    lines = [f"toolhub_{k} {v}" for k, v in _metrics.items()]
    return PlainTextResponse("\n".join(lines) + "\n")


app.add_route("/healthz", health)
app.add_route("/metrics", metrics)  # Prometheus scrape target
Increment _metrics["tool_calls"] inside a shared wrapper and _metrics["errors"] in your error handler. Scrape /metrics with Prometheus, alert on error rate and latency, and hook /healthz into your load balancer.
Watch these signals: tool call rate, error rate, p95 latency per tool, and backend connection failures.
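The "shared wrapper" mentioned in this section could look something like the sketch below, which counts calls, errors, and elapsed time per tool and renders them in the Prometheus text format. The names instrumented and render_metrics and the metric names are hypothetical. A true p95 needs histogram buckets, which this sketch does not attempt; a metrics client library is the more likely choice for that in practice.

```python
import functools
import time
from collections import defaultdict

CALLS: dict[str, int] = defaultdict(int)
ERRORS: dict[str, int] = defaultdict(int)
SECONDS: dict[str, float] = defaultdict(float)


def instrumented(func):
    """Hypothetical wrapper: count calls, errors, and elapsed time per tool."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        CALLS[func.__name__] += 1
        try:
            return func(*args, **kwargs)
        except Exception:
            ERRORS[func.__name__] += 1
            raise
        finally:
            SECONDS[func.__name__] += time.perf_counter() - started

    return wrapper


def render_metrics() -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = ["# TYPE toolhub_tool_calls_total counter"]
    lines += [f'toolhub_tool_calls_total{{tool="{t}"}} {n}' for t, n in sorted(CALLS.items())]
    lines.append("# TYPE toolhub_tool_errors_total counter")
    lines += [f'toolhub_tool_errors_total{{tool="{t}"}} {n}' for t, n in sorted(ERRORS.items())]
    lines.append("# TYPE toolhub_tool_seconds_total counter")
    lines += [f'toolhub_tool_seconds_total{{tool="{t}"}} {s:.6f}' for t, s in sorted(SECONDS.items())]
    return "\n".join(lines) + "\n"


@instrumented
def get_order(order_id: str) -> str:
    if order_id != "A100":
        raise ValueError("not found")
    return "shipped"
```

Dividing the seconds counter by the calls counter in a Prometheus query gives average latency per tool, which is a reasonable first alert signal before you add histograms.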
25. Performance Optimization
MCP servers are usually I/O bound (DB, HTTP), so async and pooling matter more than raw CPU.
- Use async tools for I/O. Declare async def tools and await your backend calls so one slow query doesn't block others.
- Pool connections. Reuse a single httpx.AsyncClient and a DB pool across calls instead of creating them per request.
- Cache stable resources. Catalog and config rarely change, so cache them with a short TTL.
- Keep results lean. Smaller tool outputs mean fewer tokens and faster round-trips.
- Paginate. Never return unbounded lists; accept limit/offset.
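For the caching bullet, a tiny time-based cache is often enough. The sketch below is one possible shape, not part of the SDK: TTLCache is a hypothetical class, and the clock is injectable so the behaviour can be tested without sleeping.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny single-value cache: recompute only after ttl_seconds have passed."""

    def __init__(self, loader: Callable[[], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._value: Any = None
        self._expires_at = float("-inf")

    def get(self) -> Any:
        now = self._clock()
        if now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value


# Demo with a fake clock and a loader that records how often it runs.
fake_now = [0.0]
loads = []


def load_catalog() -> dict:
    loads.append(1)
    return {"SKU-1": {"name": "Wireless Mouse"}}


cache = TTLCache(load_catalog, ttl_seconds=30, clock=lambda: fake_now[0])
cache.get()
cache.get()          # served from cache, loader not called again
fake_now[0] = 31.0
cache.get()          # TTL elapsed, loader runs a second time
```

A resource function would then return cache.get() instead of hitting the backend on every read.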
Async backend pattern:
# src/toolhub/backends.py
import httpx


class RestBackend:
    def __init__(self, base_url: str):
        self._client = httpx.AsyncClient(base_url=base_url, timeout=10.0)

    async def get_json(self, path: str) -> dict:
        resp = await self._client.get(path)
        resp.raise_for_status()
        return resp.json()

    async def aclose(self):
        await self._client.aclose()
An async FastMCP server on modest hardware can hold many concurrent connections, but only if you don't block the event loop with synchronous I/O.
26. Security Checklist
- [ ] All secrets from environment/secrets manager, never in code or git history
- [ ] .env in .gitignore before the first commit
- [ ] Constant-time token comparison; OAuth 2.1 for public remote servers
- [ ] TLS on every remote endpoint
- [ ] Input validated with Pydantic types at the boundary
- [ ] SQL via parameterized queries only — never string interpolation
- [ ] Error messages leak nothing sensitive (paths, SQL, hostnames)
- [ ] Container runs as non-root with resource limits
- [ ] Third-party stdio servers audited before install (they run with your permissions)
- [ ] Least privilege: the server's DB/API creds can only do what its tools need
- [ ] Rate limiting on public endpoints
- [ ] Dependencies pinned and scanned (uv, Dependabot)
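The parameterized-query item is the one most often gotten wrong, so here is a minimal sketch of what "allow-listed and parameterized" can mean in practice. It uses the standard library's sqlite3 as a stand-in for Postgres (placeholder syntax differs between drivers); find_orders and ALLOWED_FILTERS are hypothetical names.

```python
import sqlite3

# Hypothetical allow-list: the only columns a caller may filter on.
ALLOWED_FILTERS = {"status", "customer"}


def find_orders(conn: sqlite3.Connection, column: str, value: str, limit: int = 50) -> list[tuple]:
    """Query orders with the value bound as a parameter, never interpolated."""
    if column not in ALLOWED_FILTERS:
        raise ValueError(f"Filtering on '{column}' is not allowed")
    # The column name comes from the allow-list above; the value is a bound parameter.
    sql = f"SELECT order_id, status FROM orders WHERE {column} = ? ORDER BY order_id LIMIT ?"
    return conn.execute(sql, (value, min(limit, 100))).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, status TEXT, customer TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("A100", "shipped", "Ravi"), ("A101", "processing", "Meera")],
)

shipped = find_orders(conn, "status", "shipped")
injected = find_orders(conn, "status", "shipped' OR '1'='1")
```

The injection attempt is treated as a literal string that matches no row, and any column outside the allow-list is rejected before SQL is built.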
27. Common Interview Questions
Short, correct answers to what shows up in GenAI/SDET loops:
Q: What is MCP in one line?
An open JSON-RPC 2.0 protocol that standardizes how AI apps connect to external tools and data.
Q: Tools vs Resources vs Prompts?
Tools are model-controlled actions (side effects allowed); resources are app-controlled read-only data by URI; prompts are user-invoked templates.
Q: What transports exist?
stdio (local), Streamable HTTP (remote, preferred), and legacy SSE (deprecated).
Q: Why can't you print() in a stdio server?
stdout is the JSON-RPC channel; printing corrupts the protocol. Log to stderr.
Q: How does the model know which tool to use?
From the tool name, its typed schema, and the docstring description — write them for the model.
Q: How do you secure a remote MCP server?
TLS + auth (bearer token or OAuth 2.1), input validation, least-privilege backend creds, rate limiting.
Q: How do you test an MCP tool?
As a plain Python function — unit test the logic directly and via call_tool; no live host needed.
Q: What is the initialize handshake?
The capability negotiation the host and server perform on connection before exchanging tool/resource messages.
28. Real-World Production Use Cases
- Support automation — order lookup, refunds, ticket creation exposed as tools so an agent resolves cases end to end.
- Internal data access — a governed SQL tool that runs parameterized, allow-listed queries against a warehouse, giving analysts natural-language access without raw DB credentials.
- Test generation & execution (SDET) — tools that generate test plans, trigger CI runs, and read results; resources that expose the current test suite.
- DevOps copilots — tools to read logs, check deploy status, and roll back, with every destructive action gated behind confirmation.
- Document/knowledge access — resources exposing internal docs and runbooks the model loads on demand instead of stuffing into a static prompt.
The pattern is constant: wrap an existing system in tools + resources, add auth and logging, ship it.
29. Complete GitHub Project Structure
toolhub/
├── pyproject.toml # deps, pinned mcp>=1.27,<2
├── uv.lock # reproducible builds
├── .env.example # documented, committed
├── .gitignore # .env, .venv, __pycache__
├── Dockerfile # non-root, layered, slim
├── .dockerignore
├── README.md # setup + host configs
├── src/toolhub/
│ ├── server.py # FastMCP wiring, transport switch
│ ├── config.py # typed env settings
│ ├── logging_conf.py # structured logs → stderr
│ ├── auth.py # token / OAuth verification
│ ├── errors.py # typed, user-safe exceptions
│ ├── backends.py # pooled async clients
│ ├── tools/orders.py # model-callable actions
│ ├── resources/catalog.py # URI-addressed read-only data
│ └── prompts/templates.py # reusable prompt templates
├── tests/
│ ├── test_tools.py
│ ├── test_resources.py
│ └── test_integration.py
└── .github/workflows/ci.yml # lint · type · test · build
pyproject.toml essentials (partial):
[project]
name = "toolhub"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
"mcp[cli]>=1.27,<2",
"pydantic",
"python-dotenv",
"structlog",
"httpx",
]
[project.scripts]
toolhub = "toolhub.server:main"
30. Final Production Checklist
- [ ] Server runs on both stdio and Streamable HTTP
- [ ] Tools return typed Pydantic models with model-facing docstrings
- [ ] Resources are read-only and URI-addressed
- [ ] Prompts registered and invocable from hosts
- [ ] All logging goes to stderr; stdout is protocol-only
- [ ] Typed exceptions; no sensitive data in error messages
- [ ] Config fully env-driven; .env git-ignored; .env.example committed
- [ ] Auth enforced on HTTP (bearer/OAuth 2.1), TLS terminated upstream
- [ ] Dockerized: non-root, pinned base, layered, resource-limited
- [ ] Unit + integration tests green; error paths covered
- [ ] CI runs lint + type-check + tests and gates the image build
- [ ] /healthz and /metrics exposed and scraped
- [ ] Async I/O with pooled backend clients; results paginated
- [ ] Security checklist complete
- [ ] Verified live in Claude Desktop, VS Code, and Cursor
Conclusion & Next Steps
You started with a 20-line ping and ended with a Dockerized, authenticated, tested, monitored MCP server wired into three hosts with a CI/CD pipeline behind it. The core loop never changed: type your functions, describe them for the model, wrap your backends, add auth and logging, ship.
Your next steps:
- Replace the fake stores in orders.py and catalog.py with your real Postgres/REST backend using the async pattern from Section 25.
- Turn on OAuth 2.1 once you have an identity provider, graduating off the static bearer token.
- Add a metrics wrapper around every tool so /metrics reflects real traffic, then wire Prometheus alerts.
- Publish the image to your registry from the CI docker job and deploy behind TLS.
- Write one more tool a week; the fastest way to internalize MCP is to keep wrapping systems your team already uses.
MCP is becoming the default integration layer for agentic AI. A server you can build, test, and operate end to end is a genuinely marketable skill in 2026 — now you have one.