Akshay Kumar BM

Posted on Mar 15

Stop Loading 30 MCP Tools Into One Agent — 3 Design Patterns That Actually Work

I hit a wall building an AI agent with MCP.

The agent had access to 30+ tools — database queries, file ops, Slack notifications, ticket management, you name it. And it kept making dumb decisions. Wrong tool calls. Hallucinated parameters. Context that made no sense.

Then I read the research: at 32K tokens of context, 11 models tested dropped below 50% of their short-context baseline. Every tool schema you load burns tokens. More tools = more context rot = worse decisions.

Here are the 3 design patterns I switched to — with working Python MCP server code for each one.

Prerequisites

Python 3.11+
mcp SDK: pip install mcp
Basic familiarity with MCP servers and Claude agents

🔧 Pattern 1: Sub-Agent Grouping

The idea: Don't wire 30 tools to one agent. Group tools by domain. Give each group its own focused MCP server. An orchestrator delegates; sub-agents specialize.

30 tools to 1 agent  ❌
     ↓
Orchestrator → DB Agent (5 tools)
             → Comms Agent (6 tools)
             → Storage Agent (4 tools)  ✅

Each sub-agent only sees 5–8 tools. Its attention stays sharp.

The Code

db_agent_server.py — Database sub-agent:

from mcp.server.fastmcp import FastMCP
import sqlite3

mcp = FastMCP("db-agent")

@mcp.tool()
def query_records(table: str, filters: dict = {}) -> list[dict]:
    """Query records from a table with optional filters."""
    conn = sqlite3.connect("app.db")
    cursor = conn.cursor()

    where_clause = ""
    values = []
    if filters:
        conditions = [f"{k} = ?" for k in filters]
        where_clause = "WHERE " + " AND ".join(conditions)
        values = list(filters.values())

    cursor.execute(f"SELECT * FROM {table} {where_clause}", values)
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

@mcp.tool()
def insert_record(table: str, data: dict) -> str:
    """Insert a new record into a table."""
    conn = sqlite3.connect("app.db")
    columns = ", ".join(data.keys())
    placeholders = ", ".join(["?"] * len(data))
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        list(data.values())
    )
    conn.commit()
    return f"Inserted record into {table}"

@mcp.tool()
def delete_record(table: str, record_id: int) -> str:
    """Delete a record by ID."""
    conn = sqlite3.connect("app.db")
    conn.execute(f"DELETE FROM {table} WHERE id = ?", [record_id])
    conn.commit()
    return f"Deleted record {record_id} from {table}"

if __name__ == "__main__":
    mcp.run()

comms_agent_server.py — Notifications sub-agent:

from mcp.server.fastmcp import FastMCP
import httpx

mcp = FastMCP("comms-agent")

SLACK_WEBHOOK = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

@mcp.tool()
def send_slack_message(channel: str, message: str) -> str:
    """Send a message to a Slack channel."""
    payload = {"channel": channel, "text": message}
    resp = httpx.post(SLACK_WEBHOOK, json=payload)
    return "Message sent" if resp.status_code == 200 else f"Failed: {resp.text}"

@mcp.tool()
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email notification."""
    # plug in your SMTP/SES logic here
    print(f"Email to {to}: [{subject}] {body}")
    return f"Email sent to {to}"

if __name__ == "__main__":
    mcp.run()

orchestrator_server.py — The orchestrator that delegates:

from mcp.server.fastmcp import FastMCP
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

mcp = FastMCP("orchestrator")

async def call_sub_agent(server_script: str, tool_name: str, args: dict):
    """Call a tool on a sub-agent MCP server."""
    params = StdioServerParameters(command="python", args=[server_script])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(tool_name, args)
            return result.content[0].text

@mcp.tool()
async def get_user_data(user_id: int) -> str:
    """Fetch user data — delegates to the DB sub-agent."""
    return await call_sub_agent(
        "db_agent_server.py",
        "query_records",
        {"table": "users", "filters": {"id": user_id}}
    )

@mcp.tool()
async def notify_user(user_email: str, message: str) -> str:
    """Notify a user — delegates to the comms sub-agent."""
    return await call_sub_agent(
        "comms_agent_server.py",
        "send_email",
        {"to": user_email, "subject": "Notification", "body": message}
    )

if __name__ == "__main__":
    mcp.run()

graph TD
    A[Claude / Orchestrator Agent] --> B[Orchestrator MCP Server]
    B -->|query_records, insert_record| C[DB Agent Server]
    B -->|send_slack_message, send_email| D[Comms Agent Server]
    B -->|read_file, write_file| E[Storage Agent Server]

📌 Pro Tip: Run each sub-agent as a separate process. The orchestrator spins them up on demand. Claude only ever sees the orchestrator's 4–5 tools in its context, not the 30 underneath.

🛠 Pattern 2: Skill-as-Tool (Lazy Loading)

The idea: Instead of loading all 30 tool schemas upfront, expose one discover_skill tool. The agent calls it first to find out what's available, then makes the real call. Anthropic calls this "progressive discovery."

The agent's context stays lean until it actually needs depth. Think of it like a lazy import — don't load what you don't need yet.

The Code

from mcp.server.fastmcp import FastMCP
from typing import Literal
import json

mcp = FastMCP("lazy-skill-server")

# --- Skill Registry ---
SKILL_REGISTRY = {
    "incident_management": {
        "description": "Create, update, escalate and resolve incidents",
        "tools": ["create_incident", "update_incident", "escalate_incident", "resolve_incident"],
        "example": "discover_skill('incident_management') then call create_incident(...)"
    },
    "user_management": {
        "description": "Create, suspend, or look up user accounts",
        "tools": ["create_user", "suspend_user", "get_user_profile"],
        "example": "discover_skill('user_management') then call get_user_profile(...)"
    },
    "reporting": {
        "description": "Generate and export operational reports",
        "tools": ["generate_report", "export_csv", "email_report"],
        "example": "discover_skill('reporting') then call generate_report(...)"
    }
}

@mcp.tool()
def list_skills() -> str:
    """
    List all available skill domains. Call this first to discover
    what capabilities are available before making deeper calls.
    Returns skill names and short descriptions only — no schema bloat.
    """
    summary = {
        name: info["description"]
        for name, info in SKILL_REGISTRY.items()
    }
    return json.dumps(summary, indent=2)

@mcp.tool()
def discover_skill(skill_name: str) -> str:
    """
    Get the full details and available tools for a specific skill.
    Call list_skills() first, then call this to expand the skill you need.

    Args:
        skill_name: One of the skill names returned by list_skills()
    """
    if skill_name not in SKILL_REGISTRY:
        available = list(SKILL_REGISTRY.keys())
        return f"Unknown skill '{skill_name}'. Available: {available}"

    skill = SKILL_REGISTRY[skill_name]
    return json.dumps({
        "skill": skill_name,
        "description": skill["description"],
        "tools": skill["tools"],
        "usage_example": skill["example"]
    }, indent=2)

# --- Real tools loaded only when the agent asks ---

@mcp.tool()
def create_incident(title: str, severity: Literal["low", "medium", "high", "critical"],
                    description: str) -> str:
    """Create a new incident. Discovered via discover_skill('incident_management')."""
    incident_id = f"INC-{hash(title) % 10000:04d}"
    print(f"[Incident Created] {incident_id}: {title} [{severity}]")
    return json.dumps({"id": incident_id, "title": title, "severity": severity, "status": "open"})

@mcp.tool()
def resolve_incident(incident_id: str, resolution_note: str) -> str:
    """Resolve an incident by ID. Discovered via discover_skill('incident_management')."""
    return json.dumps({"id": incident_id, "status": "resolved", "note": resolution_note})

@mcp.tool()
def get_user_profile(user_id: str) -> str:
    """Fetch user profile. Discovered via discover_skill('user_management')."""
    return json.dumps({"id": user_id, "name": "Jane Doe", "role": "engineer", "active": True})

if __name__ == "__main__":
    mcp.run()

The agent's discovery flow looks like this:

sequenceDiagram
    participant Agent as Claude Agent
    participant MCP as Skill MCP Server

    Agent->>MCP: list_skills()
    MCP-->>Agent: {incident_management, user_management, reporting}

    Note over Agent: Only 2 tools loaded in context so far

    Agent->>MCP: discover_skill("incident_management")
    MCP-->>Agent: {tools: [create_incident, update_incident, ...]}

    Agent->>MCP: create_incident(title=..., severity=...)
    MCP-->>Agent: {id: "INC-0042", status: "open"}

📌 Pro Tip: The list_skills() tool returns only names + descriptions — no JSON schemas. Schemas only enter the context when the agent explicitly calls discover_skill(). This keeps your initial context tiny.

🎯 Pattern 3: Parameterized Tool Consolidation

The idea: Instead of one tool per action, build one tool that handles all related actions via an action parameter. Fewer tools = less schema noise in context.

Before:

create_ticket()   # 1 tool
update_ticket()   # 1 tool
close_ticket()    # 1 tool
reopen_ticket()   # 1 tool
get_ticket()      # 1 tool
# = 5 tools, 5 schemas loaded

After:

manage_ticket(action="create"|"update"|"close"|"reopen"|"get")
# = 1 tool, 1 schema loaded

The Code

from mcp.server.fastmcp import FastMCP
from typing import Literal, Optional
import json
import uuid

mcp = FastMCP("consolidated-tools-server")

# In-memory store for demo purposes
TICKETS: dict = {}
ALERTS: dict = {}

@mcp.tool()
def manage_ticket(
    action: Literal["create", "update", "close", "reopen", "get"],
    ticket_id: Optional[str] = None,
    title: Optional[str] = None,
    description: Optional[str] = None,
    priority: Optional[Literal["low", "medium", "high"]] = None,
    status_note: Optional[str] = None
) -> str:
    """
    Manage support tickets. Use the 'action' parameter to specify the operation.

    Actions:
      - create: Requires title, description, priority
      - update: Requires ticket_id; optionally title, description, priority
      - close:  Requires ticket_id; optionally status_note
      - reopen: Requires ticket_id
      - get:    Requires ticket_id
    """
    if action == "create":
        if not all([title, description, priority]):
            return "Error: create requires title, description, and priority"
        tid = f"TKT-{str(uuid.uuid4())[:8].upper()}"
        TICKETS[tid] = {
            "id": tid, "title": title, "description": description,
            "priority": priority, "status": "open"
        }
        return json.dumps({"created": TICKETS[tid]})

    if not ticket_id:
        return f"Error: action '{action}' requires ticket_id"
    if ticket_id not in TICKETS:
        return f"Error: ticket {ticket_id} not found"

    ticket = TICKETS[ticket_id]

    if action == "get":
        return json.dumps(ticket)

    if action == "update":
        if title:       ticket["title"] = title
        if description: ticket["description"] = description
        if priority:    ticket["priority"] = priority
        return json.dumps({"updated": ticket})

    if action == "close":
        ticket["status"] = "closed"
        if status_note: ticket["resolution"] = status_note
        return json.dumps({"closed": ticket})

    if action == "reopen":
        ticket["status"] = "open"
        ticket.pop("resolution", None)
        return json.dumps({"reopened": ticket})

    return f"Unknown action: {action}"


@mcp.tool()
def manage_alert(
    action: Literal["acknowledge", "silence", "escalate", "get", "list"],
    alert_id: Optional[str] = None,
    silence_duration_minutes: Optional[int] = None,
    escalate_to: Optional[str] = None
) -> str:
    """
    Manage monitoring alerts.

    Actions:
      - acknowledge: Requires alert_id
      - silence:     Requires alert_id, silence_duration_minutes
      - escalate:    Requires alert_id, escalate_to (team or person)
      - get:         Requires alert_id
      - list:        No extra params needed
    """
    if action == "list":
        return json.dumps(list(ALERTS.values())) if ALERTS else "[]"

    if not alert_id:
        return f"Error: action '{action}' requires alert_id"

    # Seed a demo alert if needed
    if alert_id not in ALERTS:
        ALERTS[alert_id] = {"id": alert_id, "status": "firing", "severity": "high"}

    alert = ALERTS[alert_id]

    if action == "get":
        return json.dumps(alert)

    if action == "acknowledge":
        alert["status"] = "acknowledged"
        return json.dumps({"acknowledged": alert})

    if action == "silence":
        if not silence_duration_minutes:
            return "Error: silence requires silence_duration_minutes"
        alert["status"] = "silenced"
        alert["silenced_for_minutes"] = silence_duration_minutes
        return json.dumps({"silenced": alert})

    if action == "escalate":
        if not escalate_to:
            return "Error: escalate requires escalate_to"
        alert["escalated_to"] = escalate_to
        alert["status"] = "escalated"
        return json.dumps({"escalated": alert})

    return f"Unknown action: {action}"


if __name__ == "__main__":
    mcp.run()

Now instead of loading 8 separate ticket + alert tool schemas, the agent loads 2. Less schema surface area, cleaner decision-making.

📌 Pro Tip: Group by resource, not action. If it's all about the same entity (ticket, alert, user), it belongs in one parameterized tool. Only split tools when the domains are genuinely different.

Putting It All Together

Here's how the three patterns layer on top of each other:

graph TD
    A[Claude Agent] --> B[Orchestrator MCP Server]

    B --> C["🎯 Consolidated Tools<br/>manage_ticket()<br/>manage_alert()"]
    B --> D["🛠 Lazy Skill Server<br/>list_skills()<br/>discover_skill()"]
    B --> E["🔧 Sub-Agent: DB<br/>query / insert / delete"]
    B --> F["🔧 Sub-Agent: Comms<br/>email / slack"]

    style A fill:#4A90D9,color:#fff
    style B fill:#7B68EE,color:#fff
    style C fill:#2ECC71,color:#fff
    style D fill:#F39C12,color:#fff
    style E fill:#E74C3C,color:#fff
    style F fill:#1ABC9C,color:#fff

Sub-Agent Grouping handles domain separation — each agent is a specialist
Lazy Loading handles discovery — context stays minimal until needed
Consolidation handles action sprawl — fewer tools for the same functionality

✅ Key Takeaways

✅ More tools = more context tokens = worse agent decisions — it's not just intuition, it's empirically measured
✅ Sub-agent grouping gives each agent a focused toolset (5–8 tools max)
✅ Lazy loading keeps initial context lean; schemas only appear when the agent needs them
✅ Parameterized consolidation cuts tool count by 60–70% without losing capability
✅ MCP tool design is an engineering discipline — the same principles as good API design apply

The core mental model: you don't expose every database table directly to an API. Don't expose every capability directly to an agent.

Structure beats surface area. Every time.

👋 Let's Connect

If you're building AI agents with MCP and want to compare notes — or ran into patterns I didn't cover — I'd love to hear about it.

🔗 Connect with me on LinkedIn
📧 Drop me a message

Let's build agents that actually make good decisions. 🚀

DEV Community

Stop Loading 30 MCP Tools Into One Agent — 3 Design Patterns That Actually Work

Prerequisites

🔧 Pattern 1: Sub-Agent Grouping

The Code

🛠 Pattern 2: Skill-as-Tool (Lazy Loading)

The Code

🎯 Pattern 3: Parameterized Tool Consolidation

The Code

Putting It All Together

✅ Key Takeaways

👋 Let's Connect

Top comments (0)