Building Production-Grade AI Agents with MCP & A2A: A Guide from the Trenches
From Chaos to Contract: How I Tamed the Agentic Wild West
TL;DR
In this article, I share my journey of moving from fragile, custom-built AI agent architectures to a robust, standardized approach using the Model Context Protocol (MCP). I'll walk you through:
- Why "Agent-to-Agent" (A2A) communication is the missing link in production systems.
- How I designed a "Daily Minutes Assistant" using a standardized contract.
- The exact code infrastructure I built (and how you can too).
- Why I believe standardizing context is more important than standardizing prompts.
If you're tired of debugging why your agent hallucinated a function call, this read is for you.
Introduction
I still remember the late nights I spent debugging my first complex multi-agent system. I had a "Research Agent" that was supposed to talk to a "Writer Agent." It worked beautifully in my Jupyter notebook. But the moment I deployed it? Chaos. The Research Agent would output JSON; the Writer Agent wanted Markdown. The "Memory" module was a global dictionary that kept getting overwritten. It was a house of cards.
From my experience, this is where most AI engineering stalls today. We build impressive demos, but production reliability eludes us because we lack a fundamental protocol for communication.
Then I found generic protocols like MCP (Model Context Protocol). I realized that the problem wasn't my prompt engineering—it was my architecture. I didn't need smarter models; I needed better contracts. In my opinion, adopting a strict protocol is the difference between a toy and a tool.
What's This Article About?
This isn't a high-level fluff piece about "The Future of AI." This is a muddy-boots, code-heavy walkthrough of how I built a production-grade Agent-to-Agent (A2A) system.
I’ll show you how I built a Daily Minutes Assistant—a system that connects to my calendar, pulls meeting transcripts, summarizes them, and emails me the action items. But the "magic" isn't the summary; it's the plumbing.
Tech Stack
I chose this stack because I wanted reliability over hype:
- Python 3.12+: For strong typing and async support.
- MCP SDK (mcp-python): The core backbone for standardized tool and resource definition.
- FastAPI / FastMCP: For rapidly standing up the server interface.
- Pydantic: For rock-solid data validation.
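To show why that last bullet matters, here's a minimal sketch of the kind of strict argument validation Pydantic gives you for free. I'm illustrating it with a plain stdlib dataclass so the example has no dependencies; the `SearchArgs` name is mine, not from any SDK.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchArgs:
    """Hypothetical argument model for a search tool (illustration only)."""
    query: str
    limit: int = 5

    def __post_init__(self):
        # Reject the malformed payloads that otherwise cause silent agent failures.
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if not isinstance(self.limit, int) or self.limit < 1:
            raise ValueError("limit must be a positive integer")

args = SearchArgs(query="MCP adoption")
print(args.limit)  # 5 (default applied)
```

With Pydantic you get this behavior (plus coercion and JSON schema export) from a single `BaseModel` subclass, which is exactly why it's in the stack.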
Why Read It?
If you've ever felt the pain of:
- Custom-writing API wrappers for every new tool.
- Agents getting stuck in loops because they don't know "when to stop."
- Trying to connect a local specialized agent to a cloud-based LLM.
...then this experimental PoC I built will speak directly to your soul. I wrote this because I wished someone had shown me this pattern six months ago.
Let's Design
Before I wrote a single line of code, I stepped back to design the interaction. I thought: "If these agents were employees, how would they pass documents?"
In my view, an agent needs three things:
- Tools: Things it can do (Search Web, Send Email).
- Resources: Things it can read (Calendar, Logs).
- Prompts: Standardized ways to ask for things.
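As a mental model, the three primitives above can be sketched as a named bundle of capabilities. This is my own toy illustration of the concept, not the SDK's internals:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentCard:
    """Toy model of an agent's contract: do, read, ask (illustration only)."""
    name: str
    tools: dict[str, Callable] = field(default_factory=dict)   # things it can do
    resources: dict[str, str] = field(default_factory=dict)    # things it can read
    prompts: dict[str, str] = field(default_factory=dict)      # standardized asks

card = AgentCard(name="DailyAssistant")
card.tools["search_web"] = lambda q: f"results for {q}"
card.resources["config://app_settings"] = "Theme: Dark"
card.prompts["summarize"] = "Summarize the key action items in: {transcript}"

print(sorted(card.tools))  # ['search_web']
```

MCP formalizes exactly this split, which is why the server code later in this article registers tools and resources separately.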
I sketched out this flow: the client reads the day's meetings from a calendar resource, hands each transcript to the ResearchServer for enrichment, passes the findings to the SummaryServer, and finally emails me the action items.
I designed it this way because I wanted the SummaryServer to be completely ignorant of the ResearchServer. Decoupling them meant I could swap out the research engine later without breaking the calendar integration.
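That swap-without-breaking property is easy to sketch with `typing.Protocol`. This is my illustration of the decoupling idea, not code from the PoC itself; the engine names are hypothetical:

```python
from typing import Protocol

class ResearchEngine(Protocol):
    """The only thing the summarizer knows: something that can search."""
    def search(self, query: str) -> str: ...

class MockEngine:
    def search(self, query: str) -> str:
        return f"mock results for {query}"

class LoudEngine:
    def search(self, query: str) -> str:
        return f"LOUD RESULTS FOR {query.upper()}"

def summarize(engine: ResearchEngine, topic: str) -> str:
    # The caller depends only on the *shape* of the engine, never a concrete class.
    return f"Summary of: {engine.search(topic)}"

print(summarize(MockEngine(), "MCP"))
print(summarize(LoudEngine(), "MCP"))
```

Swapping `MockEngine` for `LoudEngine` (or a real Tavily-backed one) changes nothing downstream, which is the same guarantee MCP's tool contract gives you at the protocol level.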
Let’s Get Cooking
Now, let's look at the actual implementation. I structured the project to separate the "Server" (which exposes capabilities) from the "Client" (which consumes them).
Step 1: The FastMCP Server
First, I built a server that exposes simple capabilities. I used FastMCP because it handles the heavy lifting of the protocol handshake.
```python
from mcp.server.fastmcp import FastMCP

# Initialize the FastMCP server.
# In my opinion, naming your server clearly is crucial for multi-agent debugging.
mcp = FastMCP("DailyAssistant")

@mcp.tool()
async def search_web(query: str, limit: int = 5) -> str:
    """
    Search the web for a given query.

    Args:
        query: The search query
        limit: Max results to return
    """
    # In a real deployed version, I connect this to Tavily or Serper.
    # For this PoC, I'm mocking the return to focus on the protocol.
    return f"Mock search results for '{query}':\n1. Result A\n2. Result B"

@mcp.resource("config://app_settings")
def get_app_settings() -> str:
    """Get application configuration settings"""
    return "Theme: Dark\nNotifications: Enabled"

if __name__ == "__main__":
    # Without this, the client's subprocess launch has nothing to talk to.
    mcp.run()  # defaults to the stdio transport
```
What This Does:
It defines a server that offers a search_web tool and a config://app_settings resource.
Why I Structured It This Way:
I used decorators (@mcp.tool()) because they keep the code near the definition. I observed that defining tools in a separate JSON file often leads to drift where the implementation changes but the schema doesn't.
What I Learned:
The type hints (query: str) aren't just for show. MCP uses them to generate the JSON schema that the LLM eventually sees. If you're lazy with types here, your agent will be confused later.
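To make that concrete, here's a rough sketch of what the SDK does with your type hints. This is deliberately simplified (the real schema generation in the MCP SDK is more thorough), but it shows how the hints become the JSON schema the LLM sees when deciding how to call your tool:

```python
import inspect
from typing import get_type_hints

# Simplified mapping from Python annotations to JSON schema types.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn):
    """Build a minimal JSON schema from a function's signature (illustration only)."""
    hints = get_type_hints(fn)
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": PY_TO_JSON.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}

def search_web(query: str, limit: int = 5) -> str:
    """Search the web for a given query."""
    ...

print(tool_schema(search_web))
```

Notice that `limit` is typed `integer` and only `query` is required, purely from the annotations and defaults. Leave the hints off and the model gets a schema of untyped strings, which is exactly when hallucinated calls start.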
Step 2: The Agent Client
Next, I wrote the client. This is where the standard comes alive. The client doesn't need to know how search_web is implemented, only that it is available.
```python
import asyncio
import sys

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_client():
    # I decided to use stdio for local communication.
    # It's faster and secure by default for sidecar patterns.
    server_params = StdioServerParameters(
        command=sys.executable,
        args=["src/server/agent_server.py"],
        env=None,
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Dynamic discovery
            tools = await session.list_tools()
            print(f"Connected! Found tools: {[t.name for t in tools.tools]}")

            # Execution
            result = await session.call_tool("search_web", arguments={"query": "MCP adoption"})
            print(f"Tool Output: {result.content[0].text}")

if __name__ == "__main__":
    asyncio.run(run_client())
```
What This Does:
It launches the server as a subprocess and connects via standard input/output. It then dynamically asks "What can you do?" (list_tools) before asking it to do something.
My Experience Here:
I initially tried to use HTTP for everything. But I found that for local agents—like a coding assistant running on my laptop—stdio is vastly superior. It has zero network overhead and simplifies the auth story (if you can run the process, you have access).
Let's Setup
If you want to run this PoC yourself, I've kept it dead simple.
Prerequisites
- Python 3.10+ (3.12+ recommended, matching the stack above)
- A virtual environment (always use venvs!)
Clone the repository:
(I'll provide the link to my public repo below)

Install dependencies:

pip install mcp httpx

Verify installation:

Run python -c "import mcp; print(mcp.__version__)" to make sure you're on the latest SDK.
Let's Run
Executing the system is straightforward.
Execute the client script:
python src/client/agent_client.py
You should see output indicating the handshake succeeded, followed by the mock search result.
What to watch for:
If you see a generic ConnectionRefused or a pipe error, I found it's usually because the server script crashed on startup (e.g., missing imports) before the handshake could complete. Always verify your server runs standalone first!
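Here's a tiny preflight helper along those lines. It's a hypothetical utility I'm sketching for this article (not part of the SDK): launch a command and confirm it survives its first moments instead of dying on a bad import before the handshake.

```python
import subprocess
import sys
import time

def survives_startup(cmd: list[str], grace: float = 1.0) -> bool:
    """Return False if the process dies with an error within `grace` seconds."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE,
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    time.sleep(grace)
    if proc.poll() is not None and proc.returncode != 0:
        # Crashed before it could even handshake: surface the real traceback.
        print(proc.stderr.read().decode())
        return False
    # Still alive (or exited cleanly): good enough for a preflight check.
    proc.terminate()
    proc.wait()
    return True

# In practice you'd point this at src/server/agent_server.py; here we
# simulate a healthy long-running server and a broken one.
print(survives_startup([sys.executable, "-c", "import time; time.sleep(10)"]))  # True
print(survives_startup([sys.executable, "-c", "import does_not_exist_xyz"]))    # False
```

Running this before the client saved me from chasing phantom "connection" bugs that were really just `ImportError`s in the server.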
Closing Thoughts
Building this experimental "Daily Minutes Assistant" taught me that the future of AI isn't just about bigger context windows—it's about structured context.
In my view, we are moving from "Prompt Engineering" to "Context Engineering." The MCP approach allows us to treat tools and resources as first-class citizens. I think this is how we finally bridge the gap between cool Twitter demos and boring, reliable, production software.
I hope this guide saves you some of the headaches I faced. The code is available, so fork it, break it, and let me know what you build.
Tags: ai, python, mcp, agents
Disclaimer:
The views and opinions expressed here are solely my own and do not represent the views, positions, or opinions of my employer or any organization I am affiliated with. The content is based on my personal experience and experimentation and may be incomplete or incorrect. Any errors or misinterpretations are unintentional, and I apologize in advance if any statements are misunderstood or misrepresented.