The Daily Agent

Building Custom MCP Servers: A Developer's Guide to Production-Grade AI Agent Tools

The Model Context Protocol (MCP) has become the default standard for connecting AI agents to external tools and APIs. Governed by the Linux Foundation since early 2025 and adopted by OpenAI, Anthropic, Microsoft, and Vercel, MCP is the USB-C port of the AI ecosystem — one protocol that lets any LLM application talk to any tool server.

But there's a gap between reading the spec and building something that works reliably in production. I've spent the last few months building MCP servers for production agent workflows, and this guide captures the patterns that actually matter.

If you've read the "6 Agent Gateway Platforms" roundups, you know which MCP servers to consume. This is the guide for when you need to build one yourself.

What We're Building

By the end of this guide, you'll have built a production-ready MCP server that:

  • Exposes typed tools with JSON Schema validation
  • Uses Streamable HTTP transport (the recommended standard as of 2026)
  • Handles errors gracefully with structured responses
  • Includes proper authentication for sensitive operations
  • Is testable with the MCP Inspector

Let's start with the foundation.

Architecture: The Three MCP Building Blocks

Before writing code, understand what your server can expose. MCP defines three primitives:

| Feature | What It Does | Who Controls It |
|---|---|---|
| Tools | Functions the AI model calls (write, compute, act) | Model decides when to invoke |
| Resources | Read-only data (files, DB schemas, API docs) | Application retrieves and provides |
| Prompts | Pre-built templates for common workflows | User triggers explicitly |

For a tool server — the most common production pattern — you'll focus on tools. Resources and prompts are optional but useful for providing context and guiding the model's behavior.
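To make the split concrete, here is what one of each primitive can look like as a plain data shape. The names and contents below are hypothetical, and the exact registration API depends on your SDK:

```typescript
// Illustrative shapes for the three MCP primitives (names and contents are made up)
const tool = {
  name: "query_database", // model-controlled: the LLM decides when to call it
  description: "Run a read-only SQL query against the analytics database",
  inputSchema: {
    type: "object",
    properties: { sql: { type: "string" } },
    required: ["sql"],
  },
};

const resource = {
  uri: "schema://analytics/tables", // application-controlled: supplied as context
  name: "Analytics DB schema",
  mimeType: "text/plain",
};

const prompt = {
  name: "triage_incident", // user-controlled: triggered explicitly
  description: "Step-by-step incident triage workflow",
  arguments: [{ name: "severity", required: true }],
};

console.log(tool.name, resource.uri, prompt.name);
```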

Setting Up a TypeScript MCP Server

The official TypeScript SDK is the most widely adopted way to build MCP servers. It's what Claude Desktop, Cursor, and Windsurf use internally.

// server.ts
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { z } from "zod";

// Define a tool with Zod validation
const CodeReviewInput = z.object({
  repoPath: z.string().min(1, "Repository path is required"),
  prNumber: z.number().int().positive("PR number must be positive"),
  strictness: z.enum(["basic", "standard", "deep"]).default("standard"),
});

type CodeReviewInput = z.infer<typeof CodeReviewInput>;

// Stubbed for this guide — replace with your real review logic
async function performReview(
  repoPath: string,
  prNumber: number,
  strictness: CodeReviewInput["strictness"]
) {
  return { repoPath, prNumber, strictness, findings: [] as string[] };
}

// Server instance
const server = new Server(
  {
    name: "code-review-mcp",
    version: "1.0.0",
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

// Tool registration
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "review_pull_request",
      description:
        "Perform a code review on a pull request at the given path with configurable strictness",
      inputSchema: {
        type: "object",
        properties: {
          repoPath: {
            type: "string",
            description: "Absolute path to the local repository",
          },
          prNumber: {
            type: "number",
            description: "Pull request number to review",
          },
          strictness: {
            type: "string",
            enum: ["basic", "standard", "deep"],
            description: "How thorough the review should be",
          },
        },
        required: ["repoPath", "prNumber"],
      },
    },
  ],
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "review_pull_request") {
    try {
      // Validate inside try so bad arguments surface as a structured tool error
      const args = CodeReviewInput.parse(request.params.arguments);
      const review = await performReview(args.repoPath, args.prNumber, args.strictness);
      return {
        content: [
          {
            type: "text",
            text: JSON.stringify(review, null, 2),
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: "text",
            text: `Review failed: ${error instanceof Error ? error.message : "Unknown error"}`,
          },
        ],
        isError: true,
      };
    }
  }

  throw new Error(`Unknown tool: ${request.params.name}`);
});

// Start with stdio transport (for local development)
const transport = new StdioServerTransport();
await server.connect(transport);

This is the skeleton. Every MCP server follows this pattern: declare capabilities, define tool schemas, implement handlers, connect a transport.
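To try the stdio server from Claude Desktop, register it in claude_desktop_config.json. The server name and path below are placeholders for your own build output:

```json
{
  "mcpServers": {
    "code-review-mcp": {
      "command": "node",
      "args": ["/absolute/path/to/dist/server.js"]
    }
  }
}
```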

Writing Tools That Agents Actually Use Well

The biggest mistake I see in MCP server designs is writing tools the way you'd write REST endpoints for other developers. Agents don't read documentation the way humans do. Your tool names, descriptions, and schemas need to be optimized for an LLM to discover and use correctly.

Naming Conventions

Use descriptive, action-oriented names:

  • Good: search_codebase, create_jira_ticket, deploy_to_staging
  • Bad: exec, run, helper, util
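A cheap guardrail is to lint names at registration time. This is a hypothetical helper, not part of the SDK:

```typescript
// Hypothetical lint: enforce snake_case, action-oriented tool names
const VAGUE_NAMES = new Set(["exec", "run", "helper", "util", "do", "process"]);

function isGoodToolName(name: string): boolean {
  // Require at least two snake_case words, e.g. verb_noun ("search_codebase")
  const snakeCase = /^[a-z][a-z0-9]*(_[a-z0-9]+)+$/.test(name);
  return snakeCase && !VAGUE_NAMES.has(name);
}

console.log(isGoodToolName("search_codebase")); // true
console.log(isGoodToolName("exec"));            // false
```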

Descriptions That Work

Your tool description is the agent's documentation. Be explicit about when to use it, what it does, and edge cases.

{
  name: "deploy_service",
  description:
    "Deploy a service to the staging environment. Use when the user asks to deploy, push to staging, or test a deployment. Does NOT deploy to production — use deploy_to_production for that. Requires the service to have passed CI checks.",
}

Input Schema Design

Keep required parameters minimal. Agents get confused by complex schemas with many required fields. Use sensible defaults wherever possible.

{
  inputSchema: {
    type: "object",
    properties: {
      serviceName: {
        type: "string",
        description: "Name of the service to deploy (e.g., 'api-gateway', 'worker')",
      },
      version: {
        type: "string",
        description: "Semantic version to deploy. If omitted, uses the latest built version.",
      },
      region: {
        type: "string",
        enum: ["us-west-2", "us-east-1", "eu-west-1"],
        description: "Target region. Defaults to us-west-2.",
      },
    },
    required: ["serviceName"],
  },
}

One required field, optional parameters with clear defaults. The agent can succeed with minimal information and ask for more when needed.
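Inside the handler, those defaults can be applied in one place so the rest of the code never deals with undefined fields. A minimal sketch, with field names matching the schema above and defaults as stated in its descriptions:

```typescript
// Apply documented defaults before acting on deploy arguments (illustrative)
interface DeployArgs {
  serviceName: string;
  version?: string;
  region?: "us-west-2" | "us-east-1" | "eu-west-1";
}

function resolveDeployArgs(args: DeployArgs): Required<DeployArgs> {
  return {
    version: "latest",   // "uses the latest built version" per the schema
    region: "us-west-2", // documented default region
    ...args,
  } as Required<DeployArgs>;
}

console.log(resolveDeployArgs({ serviceName: "api-gateway" }));
```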

Streamable HTTP: The Production Transport

Stdio transport is fine for local development (Claude Desktop, VS Code), but production deployments need HTTP. As of 2026, Streamable HTTP has replaced the older HTTP+SSE transport as the recommended standard.

// http-server.ts
import express from "express";
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";

const app = express();
app.use(express.json());

const mcpServer = new Server(
  { name: "production-mcp", version: "2.0.0" },
  { capabilities: { tools: {} } }
);

// Register tools (same as before)
mcpServer.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    // ... tool definitions
  ],
}));

// HTTP endpoint (stateless mode: a fresh transport per request)
app.post("/mcp", async (req, res) => {
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // stateless: no session tracking
  });

  transport.onerror = (error) => {
    console.error("Transport error:", error);
  };

  // Clean up the transport when the client disconnects
  res.on("close", () => transport.close());

  await mcpServer.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.listen(3000, () => {
  console.log("MCP server listening on port 3000");
});

The key advantage of Streamable HTTP over SSE is that connections are short-lived and stateless. Each request-response pair is independent, making it trivial to deploy behind load balancers and auto-scaling groups.

Testing with MCP Inspector

The MCP Inspector is the Postman of the MCP world. Run it against your server during development to validate your tool schemas and responses before any agent touches them:

npx @modelcontextprotocol/inspector node dist/server.js

For HTTP servers:

npx @modelcontextprotocol/inspector http://localhost:3000/mcp

This gives you a web UI where you can browse available tools and their schemas, execute tools with custom parameters, inspect raw JSON-RPC messages, and validate error handling paths.

Always run your tools through the Inspector before deploying. I've caught more schema bugs in the Inspector than in actual agent conversations.

Production Security Patterns

MCP's security model is intentionally permissive at the protocol level — the host application implements the guardrails.

Tool-Level Approval Gates

For sensitive operations, add an approval layer:

const sensitiveTools = ["delete_resource", "modify_production_config"];

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (sensitiveTools.includes(request.params.name)) {
    return {
      content: [
        {
          type: "text",
          text: `This operation requires approval. Ask the user to confirm, then call approve_operation with the session ID.`,
        },
      ],
      isError: false,
    };
  }

  // Normal tool handling...
});
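One way to make that gate concrete is a small in-memory approval store: the first call records a pending operation and returns an ID, and only an explicitly approved second call executes it. This is an illustrative sketch (the store and ID scheme are assumptions, and a real deployment would persist pending operations outside the process):

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory approval store for sensitive tool calls
type PendingOp = { tool: string; args: unknown };
const pending = new Map<string, PendingOp>();

function requestApproval(tool: string, args: unknown): string {
  const id = randomUUID();
  pending.set(id, { tool, args });
  return id; // surface this ID to the user for confirmation
}

function approveAndTake(id: string): PendingOp | undefined {
  const op = pending.get(id);
  pending.delete(id); // single-use: an approved ID cannot be replayed
  return op;
}

const id = requestApproval("delete_resource", { resourceId: "db-42" });
console.log(approveAndTake(id)?.tool); // "delete_resource"
console.log(approveAndTake(id));       // undefined, already consumed
```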

Input Validation with Zod

Never trust the model's arguments. Even well-behaved agents can hallucinate parameter shapes:

const args = z
  .object({
    email: z.string().email(),
    template: z.string().min(1).max(100),
    variables: z.record(z.string()),
  })
  .strict()
  .parse(request.params.arguments);

Rate Limiting and Quotas

MCP servers don't have built-in rate limiting — add it yourself:

import { rateLimit } from "express-rate-limit";

const limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
});

app.use("/mcp", limiter);
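The express-rate-limit middleware covers the HTTP layer, but if you run over stdio or want per-session quotas inside the tool handler itself, a small fixed-window counter is enough. Illustrative sketch; the window, limit, and key scheme are assumptions:

```typescript
// Fixed-window counter: allow at most LIMIT calls per key per window (illustrative)
const WINDOW_MS = 60_000;
const LIMIT = 100;
const windows = new Map<string, { start: number; count: number }>();

function allowCall(key: string, now = Date.now()): boolean {
  const w = windows.get(key);
  if (!w || now - w.start >= WINDOW_MS) {
    // First call in a new window: reset the counter
    windows.set(key, { start: now, count: 1 });
    return true;
  }
  w.count += 1;
  return w.count <= LIMIT;
}

console.log(allowCall("session-a")); // true on the first call
```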

Deployment Options

| Approach | Best For | Transport |
|---|---|---|
| Local stdio | Development, personal tools | stdio |
| Docker + reverse proxy | Internal team tools | Streamable HTTP |
| Vercel (via @vercel/mcp-adapter) | Serverless, public endpoints | Streamable HTTP |
| Azure Container Apps | Enterprise, Microsoft ecosystem | Streamable HTTP |
| Kubernetes | Multi-region, high-scale | Streamable HTTP |

The Full Picture: Where MCP Servers Fit

MCP servers are the tool layer in a larger agent architecture. You build servers that encapsulate specific capabilities — a Jira server, a GitHub server, a database query server — and an orchestrator like Nebula routes the right server to the right agent based on the user's intent.

Debugging Common Issues

  • Schema mismatches: Validate with Zod, return descriptive errors
  • Missing descriptions: Write descriptions that specify when (and when not) to use the tool
  • State assumptions: Make tools stateless — accept all needed context in arguments
  • Timeout failures: Return an operation ID immediately, provide a status-check tool
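The timeout pattern in the last bullet can be sketched as a pair of tools backed by a job table: one tool kicks off the work and returns an operation ID immediately, and a second tool reports status. Illustrative and in-memory only; the function names are hypothetical:

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical long-running job pattern: return an ID now, poll for the result later
type Job = { status: "running" | "done" | "failed"; result?: string };
const jobs = new Map<string, Job>();

function startJob(work: () => Promise<string>): string {
  const id = randomUUID();
  jobs.set(id, { status: "running" });
  work()
    .then((result) => jobs.set(id, { status: "done", result }))
    .catch(() => jobs.set(id, { status: "failed" }));
  return id; // the "start" tool returns this immediately, before work finishes
}

function checkJob(id: string): Job {
  return jobs.get(id) ?? { status: "failed" };
}

const id = startJob(async () => "review complete");
console.log(checkJob(id).status); // "running": the promise has not resolved yet
```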

Key Takeaways

  • MCP is the standard for AI agent tooling in 2026
  • Build tools optimized for agents, not humans
  • Use Streamable HTTP for production
  • Always validate inputs with Zod
  • Test every tool with MCP Inspector
  • MCP servers are the tool layer; orchestrators like Nebula handle routing and state
