DEV Community

Swrly

Posted on • Originally published at swrly.com

Model Context Protocol (MCP) Explained: Why It Matters in 2026

#ai

In 2024, every AI framework shipped its own way to call tools. LangChain had one approach. AutoGen had another. The Claude API had a third. If you wanted to write a GitHub tool that worked across all three, you wrote it three times — and maintained three versions.

That problem is solved now. Model Context Protocol (MCP) is an open standard from Anthropic that defines a single way for AI models to discover and call external tools. By 2026 it has been adopted by Claude Code, Cursor, and GitHub Copilot, and is supported in OpenAI's tooling as well. It is the closest thing the AI tooling ecosystem has to a standard.

This post explains what MCP actually is, how it works under the hood, when you should use it, and when it is overkill. If you are building anything involving AI agents and external tools, this is worth understanding.

The Problem MCP Solves

Before MCP, every team building AI-powered tooling faced the same problem: there was no standard interface for exposing tools to language models.

Say you wanted an agent that could create a GitHub pull request. You would write a function that calls the GitHub API, wrap it in whatever tool-calling schema the model expected, and test it. Then someone asks for the same capability in a different framework. You write it again. Different schema, different transport, different error handling conventions — same underlying API call.

Multiply this across the entire ecosystem. GitHub tools, Slack tools, database tools, monitoring tools — each one reimplemented for each framework. The result was a fragmented landscape where the same work was being done hundreds of times, with no shared discovery mechanism, no standard error format, and no way to compose tools across tool providers.

MCP addresses all three of these. It defines:

  • How servers expose tools — a standard JSON schema format for describing tool inputs and outputs
  • How clients discover tools — a standard handshake where clients ask servers to list their available tools
  • How calls are made — a standard request/response format using JSON-RPC 2.0

Build a GitHub MCP server once. Any MCP client — Claude Code, Cursor, Swrly, your custom agent — can connect to it and use its tools without modification.
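On the wire, all of this is plain JSON-RPC 2.0. A minimal sketch of the envelope a client sends for the discovery step (the `jsonrpc`/`id`/`method` fields come from the JSON-RPC 2.0 spec; the helper name is mine):

```typescript
// Build a JSON-RPC 2.0 request envelope, the shape MCP uses for
// every call, including the tools/list discovery request.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

function makeRequest(
  id: number,
  method: string,
  params?: Record<string, unknown>
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method, ...(params ? { params } : {}) };
}

// Discovery: ask the server what tools it exposes.
const listRequest = makeRequest(1, "tools/list");
// -> {"jsonrpc":"2.0","id":1,"method":"tools/list"}
```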

What MCP Actually Is

MCP is a client-server protocol built on JSON-RPC 2.0. The structure is simple:

  • MCP servers expose tools, resources, and prompts
  • MCP clients (your agent or IDE) connect to one or more servers, discover what is available, and call tools during model execution
  • Transports define the communication channel between client and server

The best analogy is the Language Server Protocol (LSP), which standardized how code editors communicate with language analysis tools. Before LSP, every editor (VS Code, Vim, Emacs) had to implement its own TypeScript integration, its own Python type checker integration, and so on. LSP moved that complexity to the server side. MCP does the same thing for AI tool integrations.

The Two Transport Modes

MCP supports two transport modes, and understanding both is important because they are not interchangeable:

SSE (Server-Sent Events) was the original transport. The client opens an HTTP connection, the server streams events back. This transport is required for SDK v1.x clients like Claude Code SDK v1 — those clients use the query() API and cannot establish bidirectional HTTP streams. If you are using @anthropic-ai/claude-code at version 1.x, you need an SSE endpoint.

HTTP Streamable is the newer transport introduced for SDK v2+. It uses standard HTTP POST requests with streaming responses. It supports authorization headers, which SSE traditionally cannot (the browser EventSource API does not allow setting request headers). This transport is better for production server-to-server communication.

In practice, production MCP servers often expose both endpoints — /sse for backward compatibility with v1 clients and /mcp for v2+ clients. SSE endpoints typically cannot enforce Authorization header checks because legacy clients cannot send them, so you need to handle authentication at the tool level instead.
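The routing decision reduces to a one-line convention; a sketch using the `/sse` and `/mcp` paths from above (the helper function is hypothetical, not part of any SDK):

```typescript
// Map a client's SDK generation to the transport endpoint it can
// speak: v1.x clients only speak SSE, v2+ use HTTP Streamable.
function endpointForSdkMajor(major: number): "/sse" | "/mcp" {
  return major <= 1 ? "/sse" : "/mcp";
}
```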

How Tool Discovery Works

When a client connects to an MCP server, it makes a tools/list request. The server responds with an array of tool definitions:

{
  "tools": [
    {
      "name": "github_create_pr",
      "description": "Creates a pull request on GitHub",
      "inputSchema": {
        "type": "object",
        "properties": {
          "owner": { "type": "string", "description": "Repository owner" },
          "repo": { "type": "string", "description": "Repository name" },
          "title": { "type": "string", "description": "PR title" },
          "body": { "type": "string", "description": "PR description" },
          "head": { "type": "string", "description": "Branch to merge from" },
          "base": { "type": "string", "description": "Branch to merge into" }
        },
        "required": ["owner", "repo", "title", "head", "base"]
      }
    }
  ]
}

The model receives this schema and knows how to call the tool. When it decides to create a PR, it emits a tools/call request with a JSON object matching the inputSchema. The server executes the tool and returns the result. The model incorporates the result into its reasoning.

This is the core loop: discover, call, incorporate.
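The call step, sketched as wire-level messages for the `github_create_pr` tool above (argument values and the PR number in the result are illustrative):

```typescript
// A tools/call request: the model's chosen tool name plus an
// arguments object matching the tool's inputSchema.
const callRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "github_create_pr",
    arguments: {
      owner: "acme",
      repo: "api",
      title: "Fix rate limiting",
      head: "fix/rate-limit",
      base: "main",
    },
  },
};

// The server's result wraps the tool output in a content array,
// the same shape a server handler returns.
const callResult = {
  jsonrpc: "2.0",
  id: 2,
  result: {
    content: [{ type: "text", text: '{"number":1347,"state":"open"}' }],
  },
};
```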

A Minimal MCP Server

Here is what a minimal MCP server looks like using the official TypeScript SDK:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "github-tools",
  version: "1.0.0",
});

server.tool(
  "github_get_pr",
  "Fetch details of a pull request",
  {
    owner: z.string().describe("Repository owner"),
    repo: z.string().describe("Repository name"),
    pr_number: z.number().describe("Pull request number"),
  },
  async ({ owner, repo, pr_number }) => {
    const response = await fetch(
      `https://api.github.com/repos/${owner}/${repo}/pulls/${pr_number}`,
      {
        headers: {
          Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
          Accept: "application/vnd.github+json",
        },
      }
    );

    if (!response.ok) {
      throw new Error(`GitHub API error: ${response.status}`);
    }

    const pr = await response.json();
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify({
            title: pr.title,
            body: pr.body,
            state: pr.state,
            draft: pr.draft,
            mergeable: pr.mergeable,
          }),
        },
      ],
    };
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);

That is a complete, working MCP server. It exposes one tool. Any MCP client can connect to it via stdio and call github_get_pr. The Zod schema is automatically converted to the JSON Schema format that tools/list returns.

The HTTP transport version adds a few lines to set up an Express (or Hono) server and swap StdioServerTransport for SSEServerTransport or StreamableHTTPServerTransport, but the tool definition code is identical.

When MCP Is Worth It

MCP adds real value in specific situations.

You are building a platform or a reusable agent system. If multiple agents — or multiple teams — need access to the same tool, an MCP server is the right abstraction. You define the tool once, expose it over the network, and every agent connects to it. Updates to the tool propagate to all consumers without any redeployment of agent code.

You have a shared tool library. Teams that are building dozens of agent workflows benefit from an MCP server that centralizes all their integrations — GitHub, Slack, Jira, database queries. Individual agents connect to the shared server and use whatever subset of tools they need. This mirrors how a shared library works in traditional software but with the discovery mechanism built in.

Tools run in separate processes for isolation or security. MCP servers can run as separate processes, separate containers, or even separate machines. If your database tools need different credentials than your Slack tools, you can run them as separate MCP servers. The client connects to both and routes tool calls to the right server. This is significantly cleaner than embedding all your tool logic in a single agent process.

You want local-first agent setups. Running tools locally — against a local database, a local file system, a local development environment — is straightforward with stdio transport. The MCP server runs as a subprocess, the agent process spawns it, and they communicate over stdin/stdout. No network, no auth, no infrastructure.
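Under the hood, the stdio transport frames each JSON-RPC message as a single line of JSON (newline-delimited, with no embedded newlines). A minimal sketch of that framing, with helper names of my choosing:

```typescript
// Serialize one message for the stdio transport: one JSON object
// per line, newline-terminated.
function frame(message: object): string {
  return JSON.stringify(message) + "\n";
}

// Parse a stdout chunk that may contain several framed messages.
function deframe(chunk: string): object[] {
  return chunk
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}
```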

When MCP Is Overkill

MCP adds complexity. For some use cases, that complexity does not pay off.

One-off agents with one or two tools. If you are building a script that calls the Stripe API and sends a Slack message, you do not need MCP. Write two functions, call them from your agent, move on. The overhead of running an MCP server, handling tool discovery, and managing the connection is not justified for two endpoints.

Rapid prototyping where schemas change frequently. MCP tool schemas are defined at server startup. Every time you change a tool's input or output, you need to restart the server and reconnect any clients. When you are iterating quickly on what a tool should do, this friction adds up. A direct function call is easier to change. Add MCP once the interface has stabilized.

Situations where the model needs to compose tools dynamically. MCP is designed around a static list of tools discovered at connection time. If you need to generate tools programmatically at runtime — for example, generating a unique tool per database table — MCP can handle it, but it requires careful design to avoid listing hundreds of tools in every tools/list response, which wastes context tokens.

Common Gotchas in 2026

MCP is straightforward once you understand these rough edges.

Transport mismatch causes silent failures. The most common issue when connecting an MCP client to a server is transport incompatibility. If your client uses SDK v1 and expects SSE, and your server only exposes an HTTP Streamable endpoint, the connection will fail — often with an unhelpful error. Always confirm which transport version your client SDK requires and check that the server exposes the correct endpoint. When in doubt, expose both.

Tool discovery cost scales with tool count. Every MCP client requests the full tools/list on connection. If your server exposes 300 tools, that list goes into the model's context on every connection. At GPT-4 or Claude Sonnet token rates, a 300-tool schema can cost 5,000-10,000 tokens per session before the model does any work. Two mitigations: run tool filtering so each agent only sees the tools it is allowed to use, or organize tools across multiple smaller MCP servers and only connect agents to the servers they need.
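A back-of-envelope way to see this cost, assuming roughly four characters per token (a common rough heuristic, not an exact tokenizer):

```typescript
// Estimate how many context tokens a tools/list payload consumes,
// using the rough ~4 characters-per-token heuristic.
function estimateSchemaTokens(tools: object[]): number {
  return Math.ceil(JSON.stringify(tools).length / 4);
}

// 300 modest tool definitions easily exceed 5,000 tokens before
// the model does any work.
const fakeTools = Array.from({ length: 300 }, (_, i) => ({
  name: `tool_${i}`,
  description: "x".repeat(80),
  inputSchema: { type: "object", properties: {} },
}));
```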

MCP server instances must be per-connection for concurrent agents. A common architecture mistake is using a singleton MCP server instance shared across all connections. This works fine for one agent at a time but causes "already connected" and state corruption errors when multiple agents run concurrently. The correct pattern is a factory function — each incoming SSE connection instantiates its own McpServer object, isolated from all others. The underlying tool implementations can share infrastructure (database connections, HTTP clients), but the MCP server object itself must be per-connection.
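The pattern, sketched without the SDK (a stand-in class takes the place of McpServer; in real code the factory body would be `new McpServer({...})` plus tool registration):

```typescript
// Stand-in for McpServer: holds per-connection protocol state.
class ConnectionScopedServer {
  private connected = false;
  connect(): void {
    // A shared singleton would hit this on the second connection.
    if (this.connected) throw new Error("already connected");
    this.connected = true;
  }
}

// Shared infrastructure lives outside the factory, e.g. one DB
// pool or HTTP client reused by every connection's tool handlers.
const sharedPool = { queriesServed: 0 };

// ...but every incoming connection gets its own server instance.
function createServerForConnection(): ConnectionScopedServer {
  return new ConnectionScopedServer();
}
```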

Authorization on SSE endpoints is limited. SSE relies on browser EventSource, which cannot set custom headers. This means you cannot use Authorization: Bearer <token> on SSE connections — your clients simply cannot send it. Production SSE servers typically authenticate at the tool level (passing credentials as tool arguments or reading them from environment variables) rather than at the transport level.
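Tool-level authentication can be sketched like this, assuming the credential arrives as a tool argument or an environment variable (the argument and variable names are illustrative):

```typescript
// Check a caller-supplied token inside the tool handler, since the
// SSE transport cannot carry an Authorization header.
function authorizeToolCall(
  args: { apiToken?: string },
  expectedToken: string
): void {
  const supplied = args.apiToken ?? process.env.TOOL_API_TOKEN;
  if (supplied !== expectedToken) {
    throw new Error("unauthorized tool call");
  }
}
```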

Tool name collisions across servers. When a client connects to multiple MCP servers, tool names from all servers appear in the same namespace. If two servers both define a tool called create_issue (one for Jira, one for Linear), you have a collision. Adopt a consistent naming convention upfront: jira_create_issue, linear_create_issue. This is especially important when building platforms where third-party MCP servers may be added over time.
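A client aggregating several servers can detect collisions at connect time; a small sketch (the duplicate `create_issue` mirrors the Jira/Linear example above):

```typescript
// Detect duplicate tool names across the tool lists of several
// connected MCP servers.
function findCollisions(toolLists: string[][]): string[] {
  const seen = new Set<string>();
  const collisions = new Set<string>();
  for (const list of toolLists) {
    for (const name of list) {
      if (seen.has(name)) collisions.add(name);
      seen.add(name);
    }
  }
  return [...collisions];
}
```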

The MCP Ecosystem Today

The MCP ecosystem has grown faster than most expected. By early 2026:

  • Over 10,000 public MCP servers are listed in community registries
  • Major IDE platforms — Cursor, VS Code with Copilot, JetBrains AI — all support MCP natively
  • Claude Code integrates MCP servers via the .claude/settings.json configuration file
  • OpenAI's tool calling format has aligned closely enough with MCP that adapters are trivial
  • AWS, Cloudflare, and Vercel all offer managed MCP server hosting

The adoption pattern followed the same arc as LSP: slow uptake in the first year while the tooling ecosystem matured, then rapid adoption once the major IDEs and frameworks committed to the standard. The key difference from previous AI tool standards is that MCP is genuinely simple. The spec fits in a single document. A working server is under 50 lines of TypeScript. That simplicity is not an accident — it is why it won.

How Swrly Uses MCP

Swrly is built on MCP end-to-end. Every integration — GitHub, Slack, Linear, Jira, PagerDuty, Stripe, and 45 others — is exposed as a set of MCP tools through a single server at port 3002. When an agent workflow runs, the runner service connects to the MCP server and passes the available tools to the Claude Code SDK. The model sees the tools, decides which ones to call, and the MCP server executes them.

The platform currently exposes 346 tools across 51 connectors. Presenting all 346 tools to every agent would be wasteful and would dilute the model's attention. Instead, each agent node in a workflow has a selectedTools configuration — an explicit list of the MCP tool names that agent is allowed to use. The runner filters the tools/list response to only include those tools before passing it to the model.

This per-agent tool restriction serves two purposes. First, it keeps context lean — an agent that only needs github_list_pr_files and github_get_content does not need to see the 50 Stripe tools. Second, it enforces least-privilege access — a code review agent cannot call stripe_create_refund even if the model tries.
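The filtering step itself is simple; a sketch (the tool shape mirrors the tools/list example earlier, and the allowlist contents are illustrative):

```typescript
type ToolDef = { name: string; description: string };

// Keep only the tools an agent's selectedTools allowlist grants
// before the list reaches the model's context.
function filterTools(all: ToolDef[], selectedTools: string[]): ToolDef[] {
  const allowed = new Set(selectedTools);
  return all.filter((tool) => allowed.has(tool.name));
}
```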

The MCP server runs as a separate Docker container from the main web app and the runner service. Tools that need to make outbound HTTP requests include SSRF protection — all URLs are checked against a blocklist before the request is made. Credentials (API keys, OAuth tokens) are stored encrypted in the database, decrypted at the MCP server level, and never passed through the agent's context.
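A minimal sketch of the blocklist idea (hostname patterns are the obvious private and metadata ranges; a real implementation must also resolve DNS and re-check every redirect target):

```typescript
// Extract a hostname, or null if the URL does not parse.
function hostOf(raw: string): string | null {
  try {
    return new URL(raw).hostname;
  } catch {
    return null;
  }
}

// Reject URLs pointing at obviously internal hosts before fetching.
function isBlockedUrl(raw: string): boolean {
  const host = hostOf(raw);
  if (host === null) return true; // unparseable: refuse
  return (
    host === "localhost" ||
    host === "169.254.169.254" || // cloud metadata endpoint
    /^127\./.test(host) ||
    /^10\./.test(host) ||
    /^192\.168\./.test(host)
  );
}
```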

Conclusion

MCP won because it solved a real problem simply. The AI tool integration ecosystem was fragmented, duplicative, and incompatible across frameworks. MCP gave it a common language — not a complicated enterprise spec, but a thin JSON-RPC protocol with a clear handshake and a standard schema format.

If you are building a one-off agent script, you can ignore MCP entirely. Write functions, call them directly, ship it. But if you are building agent infrastructure — a platform, a shared tool library, a multi-team automation system — MCP is the right foundation. The ecosystem is there, the tooling is mature, and the alternative (custom integration code per framework) is worse.

Understanding the transport modes, the tool discovery cost at scale, and the per-connection instance requirement will save you real debugging time. The gotchas are not subtle once you know them.

For a working reference implementation, the Swrly MCP server demonstrates the SSE + HTTP Streamable dual-transport pattern, per-connection instance isolation, and per-agent tool filtering at production scale.

Top comments (1)

Archit Mittal

Good explainer on MCP. The part that's most exciting to me as someone building automation workflows is how MCP standardizes tool interfaces across different AI providers. Before MCP, every integration was bespoke - you'd write one tool calling format for Claude, another for GPT, another for open-source models. Now with MCP servers, you write the integration once and any compatible client can use it. The ecosystem hitting 97M installs recently shows this isn't just hype anymore. The next frontier is MCP server discovery and composition - being able to chain multiple MCP servers together into complex workflows without manual wiring.