Engineering a Multi-Capability MCP Server in Python

The era of chatbot isolation is ending. For a long time, Large Language Models (LLMs) lived in a walled garden—brilliant reasoning engines trapped behind a chat interface, unable to touch your local files, access your database, or execute code on your machine. The Model Context Protocol (MCP) changes that paradigm entirely. It provides a standardized way to connect AI assistants to the systems where actual work happens.

However, many developers stop at the "Hello World" of MCP: exposing a simple API tool. While useful, this barely scratches the surface of the protocol’s potential. A truly robust, accessible server doesn't just offer tools; it provides context through Resources and structured workflows through Prompts.

In this guide, we will engineer a comprehensive Python-based MCP server from scratch. We will move beyond simple scripts to build a multi-tool architecture that includes mathematical capabilities, documentation access, and dynamic prompt templates for meeting analysis. We will also adopt a modern "vibe coding" workflow—using LLMs to build LLM tools—and cover the critical, often-overlooked art of debugging with the MCP Inspector.

When Should You Actually Build a Server?

Before we open the terminal, we must ask a critical architectural question: Do I need to build this?

Senior engineering isn't just about writing code; it's about knowing when not to. If you are looking to integrate standard services—like Google Drive or Slack—it makes little sense to develop a server that likely already exists in the community ecosystem. Redundancy is the enemy of efficiency.

Furthermore, for simple automation, low-code solutions (like n8n) might offer a faster path to value. However, building a custom Python MCP server becomes non-negotiable when:

  1. Complexity: You need specific, complex logic (like custom calculations or data transformation) that standard APIs don't offer.
  2. Context: You need to feed local, proprietary documentation (Resources) into the model's context window.
  3. Standardization: You want to enforce specific interaction patterns (Prompts) for your team, ensuring everyone uses the same "Meeting Summary" structure rather than reinventing the wheel.

If your use case fits these criteria, it’s time to code.

The Modern Stack: Python, uv, and "Vibe Coding"

Gone are the days of manually writing every line of boilerplate. It is not 1999. To build this server efficiently, we will utilize a "vibe coding" methodology—leveraging an IDE like Cursor with Claude integration to generate the scaffolding based on high-quality documentation.

The Prerequisites:

  • Python: Version 3.12 or higher.
  • Package Manager: uv (for fast, reliable dependency management).
  • The SDK: mcp (the Python SDK).
  • The Framework: FastMCP (a high-level wrapper to simplify server creation).

Setting the Context for AI Assistance
To make the AI assist you effectively, you cannot rely on its training data alone, as MCP is a rapidly evolving protocol. You must "prime" your environment:

  1. Index the Documentation: Create an llms.txt file or similar documentation dump containing the core MCP specifications, the Python SDK README, and the FastMCP usage guide.
  2. Pass Context to Cursor: Index these files in your IDE so the model understands the specific version of the SDK you are using.

This preparation allows you to prompt the model with high-level architectural requests ("Create a server with a calculator tool") rather than fighting with syntax errors.

Layer 1: The Executable Logic (Tools)

Our server's foundation is Tools. These are functions the LLM can call to perform actions. We will build a "Calculator Server," but structure it to handle various logical operations.

The Implementation Strategy
Using the FastMCP library makes defining tools deceptively simple. However, the nuance lies in the descriptions.

from mcp.server.fastmcp import FastMCP
import math

# Initialize the server
mcp = FastMCP("calculator-server")

@mcp.tool()
def add(a: float, b: float) -> float:
    """Add two numbers together."""
    return a + b

@mcp.tool()
def divide(a: float, b: float) -> float:
    """Divide the first number by the second number. Includes zero checks."""
    if b == 0:
        # Raising keeps the declared return type honest; the error is reported back to the client.
        raise ValueError("Cannot divide by zero")
    return a / b

Key Insight: The docstring inside the function ("""Add two numbers together.""") is not for you; it is for the LLM. This description is the API documentation the model reads to decide when to call this tool. If you are vague here, the model will hallucinate capabilities or fail to invoke the tool when needed.

We can expand this to include subtraction, multiplication, power, square root, and even percentage calculations. By wrapping these in the @mcp.tool() decorator, we automatically handle the JSON-RPC communication required by the protocol.
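For illustration, a few of those extra operations might look like the sketch below. It reuses the math import and the mcp instance from the block above; the tool names and error messages are just examples.

@mcp.tool()
def power(base: float, exponent: float) -> float:
    """Raise a base to the given exponent and return the result."""
    return base ** exponent

@mcp.tool()
def square_root(x: float) -> float:
    """Return the square root of a non-negative number."""
    if x < 0:
        raise ValueError("Cannot take the square root of a negative number")
    return math.sqrt(x)

@mcp.tool()
def percentage(part: float, whole: float) -> float:
    """Return what percentage 'part' is of 'whole'."""
    if whole == 0:
        raise ValueError("Whole cannot be zero")
    return (part / whole) * 100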

The Feedback Loop: Mastering the MCP Inspector

Programming an MCP server without a debugging tool is like flying blind. You define a tool, restart Claude Desktop, try to use it, fail, and repeat. That is a painfully slow development cycle.

Enter the MCP Inspector.

The Inspector allows you to test your server in isolation, decoupling the backend logic from the frontend client (Claude).

Running the Inspector
You launch the Inspector directly against your file using the SDK's mcp CLI (installed via the mcp[cli] extra):
uv run mcp dev server.py

This spins up a local web interface (usually on localhost:xxxx) where you can:

  1. List Tools: Verify that your server is actually exposing the functions you wrote.
  2. Test Execution: Manually input arguments (e.g., a=10, b=2) and see the raw output or error traces.
  3. Check Connections: Verify transport protocols (stdio vs. HTTP).

The Security Gotcha: When the inspector launches, it might generate a URL with a security token. If you try to connect via a generic localhost URL without this token, the connection will be rejected. Always follow the specific link provided in your terminal logs to ensure authenticated access.

Use the inspector to rigorously test edge cases—like dividing by zero—before you ever attempt to connect the server to a real client.

Layer 2: Contextual Grounding (Resources)

Tools allow the model to act. Resources allow the model to read.

A common mistake is treating the LLM as a static knowledge base. By integrating Resources, you give the model direct, read-only access to specific data on your machine—logs, API documentation, or codebases.

Implementing a File-Based Resource
Let’s imagine we want our server to be an expert on the MCP TypeScript SDK. We can download the SDK’s documentation as a Markdown file and expose it as a resource.

# Reuses the `mcp` FastMCP instance defined earlier; no additional imports are required.

# Define the path to your knowledge base
resource_path = "./docs/typescript_sdk.md"

@mcp.resource("mcp://docs/typescript-sdk")
def get_typescript_sdk() -> str:
    """Provides access to the TypeScript SDK documentation."""
    with open(resource_path, "r", encoding="utf-8") as f:
        return f.read()

When you add this, the MCP Inspector will show a new "Resources" tab. In a real-world scenario (like Claude Desktop), the user can now "attach" this resource to a chat. The model instantly gains the context of that file without you needing to copy-paste thousands of words into the prompt window.

Strategic Value: This turns your server into a dynamic library. You update the local file, and the model’s knowledge updates instantly.
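If you maintain more than one document, the same idea scales: FastMCP supports templated resource URIs, so a single function can serve an entire folder. A minimal sketch, assuming your Markdown files live under ./docs/:

@mcp.resource("mcp://docs/{name}")
def get_doc(name: str) -> str:
    """Return the Markdown documentation file with the given name from ./docs/."""
    with open(f"./docs/{name}.md", "r", encoding="utf-8") as f:
        return f.read()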

Layer 3: User-Controlled Workflows (Prompts)

The final and perhaps most powerful layer is Prompts. While Tools are reactive (the model decides to use them), Prompts are proactive—they are standardized, user-initiated templates designed to enforce a specific workflow.

A perfect use case is a "Meeting Summary." Instead of typing "Please summarize this transcript with these specific headers..." every time, you bake that structure into the server.

Creating a Dynamic Prompt Template
We can use a Markdown file with dynamic placeholders (e.g., {{date}}, {{transcript}}) as our template.

The Template (prompt.md):

You are an executive assistant. Analyze the following meeting transcript.

**Date:** {{date}}
**Title:** {{title}}

**Transcript:**
{{transcript}}

Please provide a summary with:
1. Overview
2. Key Decisions
3. Action Items

The Implementation:
We expose this via the @mcp.prompt() decorator. Critically, we must define the arguments that the UI should request from the user.

@mcp.prompt()
def meeting_summary(date: str, title: str, transcript: str) -> str:
    """Generates a meeting summary based on a transcript."""
    # Read the template, substitute the {{...}} placeholders,
    # and return the formatted prompt text to the client.
    with open("prompt.md", "r", encoding="utf-8") as f:
        template = f.read()
    return (
        template.replace("{{date}}", date)
        .replace("{{title}}", title)
        .replace("{{transcript}}", transcript)
    )

The Debugging Journey with Prompts
When implementing prompts via "vibe coding" (using LLMs to write the code), a common error occurs: the LLM might try to implement list_prompts and get_prompt capabilities manually. With the modern FastMCP framework, this is redundant and often causes conflicts.

Additionally, LLM-generated code often tries to include arguments like model or temperature inside the prompt logic. The Insight: Configuration (which model to use) belongs to the client settings, not the prompt template. Clean code requires removing these hallucinations.

When tested in the Inspector (and eventually Claude Desktop), this feature creates a form-like UI. The user selects "Meeting Summary," fills in the fields, and the LLM receives a perfectly engineered context package.

The Connection Layer: Configuration and Transport

Once the server possesses Tools, Resources, and Prompts, it must be bridged to the client. This is handled via the claude_desktop_config.json file.

Transport Protocols: stdio vs. HTTP
The default communication method is stdio (Standard Input/Output). The client spawns the server process and talks to it via the terminal streams. This is secure, fast, and perfect for local development.
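For stdio, the server file just needs an entry point that starts FastMCP's run loop (stdio is the default transport):

if __name__ == "__main__":
    # The client spawns this process and communicates over stdin/stdout.
    mcp.run()

On the client side, a configuration entry along these lines registers the server (the command and paths are illustrative and depend on your project layout):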

{
  "mcpServers": {
    "my-python-server": {
      "command": "uv",
      "args": ["run", "server.py"],
      "env": {
        "PYTHONPATH": "."
      }
    }
  }
}

However, as you scale, you may want to expose your server over a network. This is where SSE (Server-Sent Events) over HTTP comes in.

Note on Evolution: The protocol distinguishes between transport types. While SSE was once a standalone transport, the modern specification favors Streamable HTTP. In either case, the setup runs a web server (like Starlette or FastAPI) that exposes an endpoint. The Inspector and clients can connect to that URL (for example, http://localhost:8000/sse for the legacy SSE transport), decoupling the server's lifecycle from the client's lifecycle. This allows multiple clients to connect to the same persistent server instance.
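With the Python SDK's FastMCP, switching transports is a one-line change (a sketch; the available transport names depend on your SDK version):

# Run over the network instead of stdio.
# Older SDK versions expose "sse"; newer ones also support "streamable-http".
mcp.run(transport="streamable-http")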

Final Thoughts

Building an MCP server is a transition from being a consumer of AI to being an architect of AI workflows.

By following this structure, we didn't just build a script; we built a capability layer:

  1. Tools gave the model hands to perform calculations.
  2. Resources gave the model eyes to read our local documentation.
  3. Prompts gave the model a playbook to follow standard operating procedures.

Don't let the "vibe coding" ease fool you—the rigor comes in the description of your tools, the structure of your prompts, and the validation of your resources. Start with stdio for simplicity, use the Inspector religiously to validate your logic, and only then integrate into your daily workflow.

The code is the easy part. The engineering lies in defining the context.
