Emma Schmidt

Stop Writing Custom AI Integrations: Build Python AI Agents with MCP in 2026

Picture this: you wire up an LLM to query your database. It works great. Then your product team asks you to also pull data from Slack. Another custom connector. Then GitHub. Another. Then Notion. Another. By the time you have five data sources connected, you are maintaining five bespoke integration layers, each with its own authentication model, error handling, retry logic, and schema negotiation. None of them is reusable in a different AI app. This is the integration tax silently choking agentic AI development in 2026, and if you are thinking about scaling this work or looking to hire Python developers who already understand the new standard, this article will show you exactly what that standard looks like in practice.

That standard is MCP: the Model Context Protocol, and it has changed everything.


What is MCP and Why Should You Care Right Now

The Model Context Protocol is an open JSON-RPC 2.0 specification that Anthropic released in November 2024 to standardize how large language models discover and invoke external capabilities. It crossed 97 million monthly SDK downloads in March 2026 and now has over 13,000 public servers on GitHub.

OpenAI added MCP support to its Agents SDK and to the Responses API, the successor to its now-deprecated Assistants API. Google, Microsoft, LangGraph, CrewAI, and AWS all adopted it. The Linux Foundation now governs it through the Agentic AI Foundation.

Before MCP, every AI integration was a custom function-calling bridge written against one vendor's API. After MCP, a single server you write once works across Claude Desktop, ChatGPT, Microsoft Copilot Studio, VS Code, Cursor, and any agent framework that speaks the protocol.

If your agents do not speak MCP in 2026, they are speaking a dead dialect.


The Real Problem This Solves

Here is a concrete scenario that most backend engineers building AI systems have already lived through.

You build a customer support AI agent. It needs to:

  • Query customer records from your PostgreSQL database
  • Check open tickets in Jira
  • Read conversation history from Slack
  • Look up order status from your internal billing API

Without MCP, you write four separate tool implementations. Each one has custom auth logic, custom error handling, custom retry behavior. When you build a second agent (say, a sales assistant), you rewrite all of it again.

With MCP, you write four MCP servers once. Every agent you ever build simply connects to them. The integration tax drops to near zero.


Setting Up Your Python MCP Environment

Let us get hands-on. You need Python 3.10 or later.

# Create a clean virtual environment
python -m venv mcp-agent-env
source mcp-agent-env/bin/activate  # Windows: mcp-agent-env\Scripts\activate

# Install the official MCP SDK, FastMCP, httpx, and the Anthropic SDK (used by the agent later on)
pip install "mcp[cli]>=1.27.0" fastmcp httpx anthropic

Verify your install:

python -c "from importlib.metadata import version; print(version('mcp'))"
# Should print 1.27.x or later

Building Your First MCP Server in Python

Here is a minimal but real MCP server that exposes two tools: one that queries a database and one that reads from a REST API. This is the pattern you will use in production.

# server.py
import os
import sqlite3
from contextlib import closing
from typing import Any

import httpx
from fastmcp import FastMCP

mcp = FastMCP("customer-data-server")

# Read the ticketing API token from the environment instead of hardcoding it.
# TICKETING_API_TOKEN is an illustrative variable name.
TICKETING_API_TOKEN = os.environ.get("TICKETING_API_TOKEN", "")

@mcp.tool()
def get_customer(customer_id: str) -> dict[str, Any]:
    """
    Retrieve customer information by ID from the database.
    Returns customer name, email, plan, and account status.
    """
    # closing() releases the connection even if the query raises
    with closing(sqlite3.connect("customers.db")) as conn:
        row = conn.execute(
            "SELECT id, name, email, plan, status FROM customers WHERE id = ?",
            (customer_id,),
        ).fetchone()

    if not row:
        return {"error": f"Customer {customer_id} not found"}

    return {
        "id": row[0],
        "name": row[1],
        "email": row[2],
        "plan": row[3],
        "status": row[4],
    }

@mcp.tool()
async def get_open_tickets(customer_id: str) -> list[dict[str, Any]]:
    """
    Fetch all open support tickets for a given customer from the ticketing API.
    Returns a list of tickets with id, title, priority, and created_at.
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://your-ticketing-api.com/tickets",
            params={"customer_id": customer_id, "status": "open"},
            headers={"Authorization": f"Bearer {TICKETING_API_TOKEN}"},
            timeout=10.0,
        )
        response.raise_for_status()
        return response.json().get("tickets", [])

if __name__ == "__main__":
    # With no arguments, run() uses the stdio transport
    mcp.run()

Run the server:

python server.py
# The server starts on the stdio transport and waits for a client to connect over stdin/stdout

Two real tools, typed schemas, async support, zero boilerplate. FastMCP handles the JSON-RPC layer, tool discovery, schema generation, and error marshaling for you.
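If you are curious what "schema generation" means here, the idea is that type hints on your function become the JSON Schema the agent sees. The following is a deliberately simplified sketch of that idea, not FastMCP's actual implementation; `sketch_input_schema` and the `limit` parameter are made up for illustration.

```python
import inspect
from typing import Any, get_type_hints

# Simplified mapping; real schema generators cover far more types.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def sketch_input_schema(func) -> dict[str, Any]:
    """Derive a JSON-Schema-like input schema from a function signature."""
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "object")}
        # Parameters without a default value are mandatory for the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}

def get_open_tickets(customer_id: str, limit: int = 20) -> list[dict[str, Any]]:
    """Hypothetical signature used only for this illustration."""
    return []

schema = sketch_input_schema(get_open_tickets)
print(schema)
```

The practical takeaway: precise type hints and default values are part of your tool's public contract, so treat them with the same care as the docstring.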


Choosing the Right Transport: stdio vs Streamable HTTP

This is one of the most important decisions you will make in production.

stdio is for local workflows:

  • Client spawns the MCP server as a child process
  • Communication through stdin and stdout
  • Zero infrastructure, zero network config, perfect isolation
  • Use for CLI tools, desktop applications, and local dev
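On the stdio transport, each JSON-RPC message travels as a single line of JSON terminated by a newline. Here is a minimal sketch of that framing, with `io.StringIO` standing in for the real pipe and two hypothetical helpers, `write_message` and `read_messages`.

```python
import io
import json

def write_message(stream, message: dict) -> None:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    stream.write(json.dumps(message) + "\n")

def read_messages(stream):
    """Yield each JSON-RPC message from a newline-delimited stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# io.StringIO stands in for the child process's stdin/stdout pipe.
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
write_message(pipe, {"jsonrpc": "2.0", "method": "notifications/initialized"})
pipe.seek(0)

received = list(read_messages(pipe))
print([message["method"] for message in received])
```

One practical consequence: a stdio server should send its logs to stderr, because any stray text on stdout would corrupt the message stream.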

Streamable HTTP is for everything else:

  • Client sends JSON-RPC messages via HTTP POST
  • Server responds with SSE stream or JSON object
  • Works through firewalls, load balancers, and CDNs
  • Required for multi-client production deployments

Note: The old HTTP+SSE transport was deprecated in the March 2025 spec revision (2025-03-26), which introduced Streamable HTTP. Start new projects on Streamable HTTP.
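Because a Streamable HTTP server may answer a POST with either a plain JSON body or an SSE stream, a client has to accept both. The sketch below shows the request headers and a simplified parser for the two response styles; `parse_response` is a hypothetical helper, and real clients in the SDK also handle session IDs, reconnection, and streaming reads.

```python
import json

# Headers a client sends with each POST; it must accept both response styles.
REQUEST_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}

def parse_response(content_type: str, body: str) -> list[dict]:
    """Return the JSON-RPC messages carried by either response style."""
    if content_type.startswith("application/json"):
        return [json.loads(body)]
    messages, data_lines = [], []
    # The extra "" guarantees the final event is flushed.
    for line in body.splitlines() + [""]:
        if line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            # A blank line ends one SSE event.
            messages.append(json.loads("\n".join(data_lines)))
            data_lines = []
    return messages

json_body = '{"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}'
sse_body = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}\n\n'

plain = parse_response("application/json", json_body)
streamed = parse_response("text/event-stream", sse_body)
print(plain == streamed)
```

Both styles carry the same JSON-RPC payload, which is why your tool code never needs to know which one the server picked.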

To run your server over HTTP instead of stdio:

if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8080)

Connecting an AI Agent to Your MCP Server

Now the magic part. Here is how you wire up a Python AI agent that uses your MCP server as its toolbox.

# agent.py
import asyncio
import sys

from anthropic import AsyncAnthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Reads ANTHROPIC_API_KEY from the environment; the async client
# keeps model calls from blocking the event loop
client = AsyncAnthropic()

async def run_agent(user_query: str) -> None:
    server_params = StdioServerParameters(
        command=sys.executable,  # same interpreter (and virtualenv) as the agent
        args=["server.py"],
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools_result = await session.list_tools()

            # Translate MCP tool definitions into the Anthropic tool format
            tools = [
                {
                    "name": tool.name,
                    "description": tool.description or "",
                    "input_schema": tool.inputSchema,
                }
                for tool in tools_result.tools
            ]

            messages = [{"role": "user", "content": user_query}]

            while True:
                response = await client.messages.create(
                    model="claude-sonnet-4-20250514",
                    max_tokens=2048,
                    tools=tools,
                    messages=messages,
                )

                # Any stop reason other than "tool_use" (end_turn, max_tokens, ...)
                # ends the loop, so it cannot spin with empty tool results
                if response.stop_reason != "tool_use":
                    for block in response.content:
                        if block.type == "text":
                            print(f"\nAgent: {block.text}")
                    break

                messages.append({"role": "assistant", "content": response.content})
                tool_results = []

                for block in response.content:
                    if block.type == "tool_use":
                        print(f"  [Using tool: {block.name}({block.input})]")
                        result = await session.call_tool(block.name, block.input)
                        # Pass the model the text parts of the result, not their repr
                        text = "\n".join(
                            part.text for part in result.content if part.type == "text"
                        )
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": text,
                            "is_error": bool(result.isError),
                        })

                messages.append({"role": "user", "content": tool_results})

if __name__ == "__main__":
    asyncio.run(run_agent(
        "What is the current status of customer C-1042 and do they have any open tickets?"
    ))

Sample output:
[Using tool: get_customer({'customer_id': 'C-1042'})]
[Using tool: get_open_tickets({'customer_id': 'C-1042'})]
Agent: Customer Sarah Chen is on the Pro plan and her account is active.
She has 2 open tickets: a high-priority billing discrepancy (3 days old)
and a medium-priority API rate limit issue (opened yesterday).

The agent autonomously decided which tools to call, in what order, and synthesized the results into a coherent answer. No hardcoded orchestration logic.


Exposing Resources and Prompts (Not Just Tools)

MCP has three primitive types and most tutorials only cover tools. Here is a quick look at all three.

Resources: Static or dynamic data your agent can read

@mcp.resource("docs://onboarding-guide")
def get_onboarding_guide() -> str:
    """
    The customer onboarding guide. Use this when a customer
    asks how to get started.
    """
    with open("docs/onboarding.md", "r") as f:
        return f.read()

Prompts: Reusable prompt templates your agent can invoke

@mcp.prompt()
def summarize_ticket(ticket_id: str, audience: str = "customer") -> str:
    """
    Generate a summary of a support ticket tailored to the given audience.
    audience can be 'customer', 'engineer', or 'manager'.
    """
    return f"""
    Please retrieve ticket {ticket_id} and summarize it for a {audience}.
    For customers: focus on current status and next steps.
    For engineers: include technical details and reproduction steps.
    For managers: focus on business impact and time to resolution.
    """

These three primitives cover the full surface area of what an agent needs: tools to act, resources to read, and prompts to reason consistently.


Production Pattern: OAuth 2.1 for Remote Servers

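If you are deploying an MCP server that external clients connect to over Streamable HTTP, you need authentication. The MCP authorization spec builds on OAuth 2.1, and both the Python SDK and FastMCP include support for it. The example below covers the server side of that flow: validating JWT bearer tokens issued by an external auth provider.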

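from fastmcp import FastMCP
# Import path used by recent FastMCP 2.x releases; earlier 2.x releases exposed
# a similar BearerAuthProvider class from fastmcp.server.auth. Check the docs
# for your installed version.
from fastmcp.server.auth.providers.jwt import JWTVerifier

auth = JWTVerifier(
    jwks_uri="https://your-auth-provider.com/.well-known/jwks.json",
    audience="your-mcp-server",
    issuer="https://your-auth-provider.com",
)

mcp = FastMCP("customer-data-server", auth=auth)

@mcp.tool()
def get_customer(customer_id: str) -> dict:
    # Token already validated before this runs
    ...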

Your server will now reject requests without a valid bearer token and return proper 401 responses. It checks each token's signature, issuer, and audience against the keys published at your auth provider's JWKS endpoint. All standard. All automatic.


Common Mistakes to Avoid in Production

1. Designing tools that are too broad

Bad: manage_customer(action: str, customer_id: str, data: dict) where action is "create", "update", or "delete". The agent has to guess what data should look like for each action. It will hallucinate.

Good: create_customer(name, email, plan), update_customer_plan(customer_id, new_plan), deactivate_customer(customer_id, reason). One tool, one schema, one clear purpose.
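As a rough illustration of the "good" version, here are the three narrow tools as plain typed functions. The `@mcp.tool()` decorators are left off so the sketch runs standalone, and the plan names and the in-memory `_customers` store are assumptions made up for the example.

```python
from typing import Literal

Plan = Literal["free", "pro", "enterprise"]  # hypothetical plan names

# In-memory stand-in for a real customer store.
_customers: dict[str, dict] = {}

def create_customer(name: str, email: str, plan: Plan = "free") -> dict:
    """Create a customer and return the new record."""
    customer_id = f"C-{len(_customers) + 1:04d}"
    _customers[customer_id] = {
        "id": customer_id, "name": name, "email": email,
        "plan": plan, "status": "active",
    }
    return _customers[customer_id]

def update_customer_plan(customer_id: str, new_plan: Plan) -> dict:
    """Change the plan of an existing customer."""
    if customer_id not in _customers:
        return {"error": f"Customer {customer_id} not found"}
    _customers[customer_id]["plan"] = new_plan
    return _customers[customer_id]

def deactivate_customer(customer_id: str, reason: str) -> dict:
    """Mark a customer inactive and record why."""
    if customer_id not in _customers:
        return {"error": f"Customer {customer_id} not found"}
    _customers[customer_id].update(status="inactive", deactivation_reason=reason)
    return _customers[customer_id]

created = create_customer("Sarah Chen", "sarah@example.com")
updated = update_customer_plan(created["id"], "pro")
closed = deactivate_customer(created["id"], "requested by customer")
print(closed)
```

Each signature tells the agent exactly which arguments exist and which values are legal, so there is nothing left to guess.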

2. Not handling timeouts

External API calls in tools must have explicit timeouts. An agent stuck waiting on a hung HTTP request will time out the entire conversation.

async with httpx.AsyncClient(timeout=httpx.Timeout(10.0, connect=3.0)) as client:
    ...

3. Returning raw exceptions to the agent

Wrap tool logic in try-except and return structured error dicts. The agent can reason about {"error": "customer not found", "code": 404}. It cannot reason about a Python traceback.
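One way to apply points 2 and 3 together is a small decorator that enforces an overall deadline and converts exceptions into structured error dicts. This is a sketch under assumptions: `safe_tool` is a hypothetical helper, the numeric codes are illustrative, and if you stack it under `@mcp.tool()` you should confirm your FastMCP version still picks up the wrapped function's signature.

```python
import asyncio
import functools

def safe_tool(timeout_seconds: float = 10.0):
    """Wrap an async tool so failures become structured error dicts."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            try:
                # Overall deadline for the whole tool call.
                return await asyncio.wait_for(func(*args, **kwargs), timeout_seconds)
            except asyncio.TimeoutError:
                return {"error": f"{func.__name__} timed out", "code": 504}
            except LookupError as exc:
                return {"error": str(exc), "code": 404}
            except Exception as exc:
                # Expose the error type and message, never the traceback.
                return {"error": f"{type(exc).__name__}: {exc}", "code": 500}
        return wrapper
    return decorator

@safe_tool(timeout_seconds=0.05)
async def slow_lookup(customer_id: str) -> dict:
    await asyncio.sleep(1)  # simulates a hung upstream call
    return {"id": customer_id}

@safe_tool()
async def missing_lookup(customer_id: str) -> dict:
    raise LookupError(f"customer {customer_id} not found")

timed_out = asyncio.run(slow_lookup("C-1042"))
not_found = asyncio.run(missing_lookup("C-9999"))
print(timed_out)
print(not_found)
```

The per-request `httpx` timeout still matters; the decorator just adds a final safety net around everything the tool does.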

4. Using stdio transport in multi-user deployments

stdio works for one client at a time. The moment you have concurrent agent sessions hitting the same server, switch to Streamable HTTP.


Testing Your MCP Server Without a Full Agent

The MCP Inspector is a browser-based tool that the SDK's CLI can launch for you. Test every tool, resource, and prompt interactively before connecting a real agent.

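# The Inspector is a Node.js app that the CLI fetches and runs through npx,
# so Node.js must be installed. The "mcp dev" command comes with the mcp[cli] extra.
# (The standalone fastmcp package offers an equivalent: fastmcp dev server.py)
mcp dev server.py
# Prints a local URL for the Inspector UI; open it in your browser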

From the inspector you can call any tool with arbitrary inputs, see the raw JSON-RPC request and response, inspect all registered resources and prompts, and verify your schemas look exactly as the agent will see them. This eliminates an entire class of bugs before you ever involve a live LLM.
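The Inspector covers interactive checks. For automated checks, the logic behind a tool is ordinary Python, so you can also unit-test it with no MCP machinery at all. A minimal sketch, assuming you factor the query out of `get_customer` into a helper that takes the database path as a parameter (`fetch_customer` and `make_test_db` are hypothetical names):

```python
import os
import sqlite3
import tempfile
from contextlib import closing

COLUMNS = ("id", "name", "email", "plan", "status")

def fetch_customer(db_path: str, customer_id: str) -> dict:
    """Same query as the get_customer tool, with the database path injected."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT id, name, email, plan, status FROM customers WHERE id = ?",
            (customer_id,),
        ).fetchone()
    if row is None:
        return {"error": f"Customer {customer_id} not found"}
    return dict(zip(COLUMNS, row))

def make_test_db() -> str:
    """Create a throwaway SQLite file holding one known customer."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    with closing(sqlite3.connect(path)) as conn:
        conn.execute(
            "CREATE TABLE customers "
            "(id TEXT PRIMARY KEY, name TEXT, email TEXT, plan TEXT, status TEXT)"
        )
        conn.execute(
            "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
            ("C-1042", "Sarah Chen", "sarah@example.com", "pro", "active"),
        )
        conn.commit()
    return path

db_path = make_test_db()
found = fetch_customer(db_path, "C-1042")
missing = fetch_customer(db_path, "C-0000")
os.remove(db_path)

print(found)
print(missing)
```

The tool itself then becomes a thin wrapper that calls the helper with the production database path, and the same checks can live in your regular test suite.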


What is Coming Next in the MCP Ecosystem

The protocol is still moving fast. Here is what is expected to land around mid-2026:

  • Tasks extension (SEP-1686): Standardized async task handling with retry semantics, result expiry, and task migration across server restarts. Critical for long-running agent workflows.
  • Enterprise auth: Cross-App Access and OIDC integration for regulated industries like finance and healthcare.
  • Multimodal tools: Tools that accept and return images, audio, and video through the same JSON-RPC interface.

If you are building in a regulated industry, monitor SEPs tagged enterprise in the official spec repository.


Recap: The MCP Mental Model

Your AI Agent
|
| speaks JSON-RPC 2.0
|
MCP Client (your Python code)
|
| connects over stdio or Streamable HTTP
|
MCP Server (FastMCP + your business logic)
|
+-- Tools (actions the agent can take)
+-- Resources (data the agent can read)
+-- Prompts (reusable reasoning templates)

Write the server once. Connect any agent to it. Connect any agent framework to it. Move between Claude, GPT, Gemini, and your own models without rewriting a single tool.


Found this useful? Drop a comment with what you are building, or share the specific integration problem MCP helped you solve.
