I've reviewed dozens of custom MCP servers built by developers connecting AI assistants to their internal tools. The build tutorials are everywhere — the mistake patterns aren't.
Here are the five most common mistakes that make MCP servers unreliable, slow, or silently broken.
TL;DR
| # | Mistake | Impact | Fix |
|---|---|---|---|
| 1 | Printing to stdout | Server disconnects immediately | Route all diagnostics to stderr |
| 2 | Vague tool descriptions | AI calls wrong tools or hallucinates params | Write descriptions the AI reads at call time |
| 3 | Synchronous blocking I/O | One slow tool freezes all others | Use async def and connection pooling |
| 4 | No input validation | Garbage inputs crash the server | Use Pydantic models for every tool schema |
| 5 | No tool-level error handling | AI gets raw stack traces | Wrap tools, return structured errors |
1. The stdout Trap — Printing Diagnostics That Kill Your Server
This is the single most common reason MCP servers "just disconnect" with no useful error message.
When you run an MCP server over stdio transport (the default for Claude Desktop, Cursor, and local agents), stdout is the protocol channel. Every byte you write to stdout must be valid JSON-RPC. A stray print() statement corrupts the entire stream.
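To make the failure concrete: over stdio, the client reads newline-delimited JSON-RPC messages from your process's stdout. A sketch of the framing (illustration only, not real server code):

```python
import json
import sys

# What the client expects on stdout: one complete JSON-RPC message per line.
sys.stdout.write(json.dumps({"jsonrpc": "2.0", "id": 1, "result": {"ok": True}}) + "\n")

# A stray print() interleaves a non-JSON line into the same stream.
# The client's parser hits it, fails, and drops the connection.
print("Querying with filters: {'status': 'active'}")
```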
The mistake:
```python
@mcp.tool()
def query_data(filters: dict) -> list:
    print(f"Querying with filters: {filters}")  # BOOM
    results = db.query(filters)
    return results
```
That print() works fine when you test locally. The moment Claude Desktop connects, it receives your debug line instead of a JSON-RPC message and drops the connection with a generic "MCP server disconnected" error.
The fix:
```python
import logging
import sys

logging.basicConfig(
    level=logging.INFO,
    stream=sys.stderr,  # ALWAYS stderr for MCP servers
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("mcp-server")

@mcp.tool()
def query_data(filters: dict) -> list:
    logger.info(f"Querying with filters: {filters}")  # Safe
    results = db.query(filters)
    return results
```
Rule of thumb: if it would appear in a terminal, it goes to stderr. If it's data for the AI client, it goes in the return value.
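You can also catch stray prints in tests, before a real client does. A minimal sketch (the `assert_no_stdout` helper is my own naming, not part of any MCP SDK):

```python
import contextlib
import io

def assert_no_stdout(fn, *args, **kwargs):
    """Fail if a tool function writes anything to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        result = fn(*args, **kwargs)
    leaked = buf.getvalue()
    assert leaked == "", f"stdout leak would corrupt JSON-RPC: {leaked!r}"
    return result
```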
2. Vague Tool Descriptions — The AI Doesn't Know When to Call Your Tool
Your tool's docstring is the only context the AI has when deciding whether to call your function. It's not documentation for humans — it's a prompt for the model's tool router.
The mistake:
```python
@mcp.tool()
def get_data(query: str) -> dict:
    """Get data."""
    return database.search(query)
```
"Get data" tells the AI nothing. It now has to guess: what kind of data? When should I call this vs. another tool? What does query mean — a SQL string? A search term? An ID?
The fix:
```python
from typing import Annotated
from pydantic import Field

@mcp.tool()
def get_data(
    search_term: Annotated[
        str,
        Field(description="Free-text product name or SKU, e.g. 'headphones' or 'SKU-001'"),
    ],
) -> list[Product]:
    """Search the product catalog by name or SKU. Call when the user asks
    about inventory, stock levels, or specific products."""
    return database.search(search_term)
```
The docstring now specifies when to call, and the Field(description=...) specifies what to pass. Together they eliminate the AI's need to guess.
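Because FastMCP builds the tool's input schema from these annotations, you can sanity-check what the model will actually see using plain Pydantic (a quick check; the exact schema the client forwards may differ slightly by SDK version):

```python
from typing import Annotated
from pydantic import Field, TypeAdapter

SearchTerm = Annotated[
    str,
    Field(description="Free-text product name or SKU, e.g. 'headphones' or 'SKU-001'"),
]

# The description survives into the JSON schema the AI reads at call time.
print(TypeAdapter(SearchTerm).json_schema())
# {'description': "Free-text product name or SKU, ...", 'type': 'string'}
```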
Test your descriptions by asking: if the AI read ONLY this docstring (no function name, no code), would it know exactly when to call this tool?
3. Synchronous Blocking I/O — One Slow Tool Freezes Everything
An MCP server on stdio or SSE transport handles one request at a time when its tools are synchronous. If one tool makes a slow HTTP call or database query with a 30-second timeout, every other tool waits behind it.
The mistake:
```python
import psycopg2

@mcp.tool()
def run_report(date_range: str) -> list:
    conn = psycopg2.connect(DSN)  # Blocks for 3-5s
    cursor = conn.cursor()
    cursor.execute("SELECT ...")  # Blocks for 15-30s on large reports
    return cursor.fetchall()
```
While run_report executes, list_products, ping, and every other tool is blocked. The AI client times out.
The fix: Use async I/O with connection pooling:
```python
import asyncpg
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("async-server")

_pool = None

async def get_pool():
    global _pool
    if _pool is None:
        _pool = await asyncpg.create_pool(DSN, min_size=2, max_size=10)
    return _pool

@mcp.tool()
async def run_report(date_range: str) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch("SELECT ... WHERE ...", date_range)
    return [dict(r) for r in rows]
```
Now the server can handle multiple tool calls concurrently. If the AI calls run_report and list_products in parallel, both execute simultaneously instead of queueing.
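If a dependency only offers a blocking API, you don't have to rewrite it: `asyncio.to_thread` moves the call onto a worker thread so the event loop stays responsive. A sketch, where `run_blocking_report` stands in for your existing sync function:

```python
import asyncio

@mcp.tool()
async def legacy_report(date_range: str) -> list[dict]:
    # Offload the sync-only library call to a worker thread; other tool
    # calls keep running on the event loop in the meantime.
    return await asyncio.to_thread(run_blocking_report, date_range)
```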
4. No Input Validation — Garbage Inputs Crash Your Server
Without explicit schemas, the AI may pass None, empty strings, or completely wrong types. An unhandled TypeError in your tool handler propagates as a cryptic server error to the client.
The mistake:
```python
@mcp.tool()
def update_stock(sku: str, quantity: int) -> str:
    database.update(sku, quantity)
    return f"Updated {sku}"
```
If the AI passes quantity=None or sku=42 (int instead of str), you get a crash — or worse, silent corruption.
The fix: Use Pydantic models with constraints:
```python
from pydantic import BaseModel, Field

class StockUpdate(BaseModel):
    sku: str = Field(pattern=r"^SKU-\d{3,}$", description="SKU format: SKU-001")
    quantity: int = Field(ge=0, le=99999, description="Must be non-negative")

@mcp.tool()
def update_stock(update: StockUpdate) -> str:
    database.update(update.sku, update.quantity)
    return f"Updated {update.sku}: new stock = {update.quantity}"
```
Pydantic validates before your function runs. Invalid inputs get a structured error back to the AI instead of an unhandled exception. The AI can then retry with corrected parameters.
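You can verify the constraints locally before wiring anything up. A quick sketch using Pydantic directly:

```python
from pydantic import ValidationError

try:
    StockUpdate(sku="not-a-sku", quantity=-5)
except ValidationError as e:
    # Two failures: sku doesn't match the pattern, quantity is below ge=0.
    # e.errors() carries field names and messages; this structured detail
    # is what lets the AI correct its parameters and retry.
    print(e.error_count(), "validation errors")
```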
5. Bare Exceptions — Raw Stack Traces Leak to the AI Client
When a database connection fails, catching nothing means the AI client sees a 50-line Python traceback. Not only does this waste context tokens — it also leaks internal file paths, library versions, and connection strings.
The mistake:
```python
import psycopg2

@mcp.tool()
def get_product(sku: str) -> dict:
    conn = psycopg2.connect("postgresql://admin:secret@db.internal:5432/prod")
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM products WHERE sku = %s", (sku,))
    return cursor.fetchone()
```
If the database is down, every token in that traceback is a token stolen from the AI's reasoning budget, plus you just leaked your connection string.
The fix: Wrap, sanitize, return structured errors:
```python
@mcp.tool()
def get_product(sku: str) -> dict:
    try:
        with get_db() as conn:
            row = conn.execute(
                "SELECT id, sku, name, stock FROM products WHERE sku = %s",
                (sku,),
            ).fetchone()
            if row is None:
                return {"error": "not_found", "sku": sku}
            return dict(row)
    except psycopg2.OperationalError:
        return {"error": "database_unavailable", "retry_after": 30}
    except Exception as e:
        return {"error": "internal_error", "detail": str(e)[:100]}
```
The AI now gets a structured response it can reason about: "database_unavailable" tells it to wait and retry. "not_found" tells it to suggest alternatives. A raw traceback tells it nothing.
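Once you have more than a few tools, the try/except boilerplate repeats; one option is to factor it into a decorator. A sketch of that pattern (my own suggestion, not an SDK feature):

```python
import functools

def safe_tool(fn):
    """Convert unexpected exceptions into the same structured error shape."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as e:
            return {"error": "internal_error", "detail": str(e)[:100]}
    return wrapper

@mcp.tool()
@safe_tool
def get_product(sku: str) -> dict:
    ...  # known error cases still return specific codes like "not_found"
```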
Quick Checklist Before You Ship
Run through this before connecting any MCP server to production:
- [ ] All `print()` statements replaced with `logger.info()` to `stderr`
- [ ] Every tool docstring answers: when to call, what it returns
- [ ] Every `Field()` has a `description=` the AI will see
- [ ] Async I/O for any tool hitting a database, API, or filesystem
- [ ] Pydantic models with constraints on every tool that takes parameters
- [ ] Every tool wrapped in try/except, returning structured `{"error": ...}` dicts
- [ ] Test with `mcp dev` before connecting to Claude Desktop or Cursor
A well-built MCP server feels invisible to the AI — it just works. These five patterns are the difference between "the agent keeps using my tools" and "the agent gave up and asked the user instead."
Top comments (1)
The stdout issue is one of those things that becomes obvious the second someone explains it, but until then it's just chaos—your server works perfectly in isolation and then silently dies the moment a real client connects. The diagnostic tools are gaslighting you.
What I think this pattern points to is a deeper tension in how we build for AI consumption versus human consumption. For humans, a `print()` is harmless debugging. For the protocol, it's corruption. But the fix isn't just "use stderr" — it's that we're writing functions whose entire surface area is now machine-readable contracts, and we're still using development habits from a world where humans were the only audience for our logs, our error messages, and our function signatures.

The structured error handling point (#5) really lands here. A human developer can skim a traceback and find the useful line. The AI client just sees tokens burning and maybe your database hostname. It makes me wonder if we need a new category of testing that's specifically "how does this look to the AI's parser" rather than "does this work when I call it manually." Have you found any good patterns for simulating the AI's perspective during development, beyond just connecting it to Claude Desktop and hoping?
print()is harmless debugging. For the protocol, it's corruption. But the fix isn't just "use stderr"—it's that we're writing functions whose entire surface area is now machine-readable contracts, and we're still using development habits from a world where humans were the only audience for our logs, our error messages, and our function signatures.The structured error handling point (#5) really lands here. A human developer can skim a traceback and find the useful line. The AI client just sees tokens burning and maybe your database hostname. It makes me wonder if we need a new category of testing that's specifically "how does this look to the AI's parser" rather than "does this work when I call it manually." Have you found any good patterns for simulating the AI's perspective during development, beyond just connecting it to Claude Desktop and hoping?