DEV Community

Charlie Li
Charlie Li

Posted on

How Claude Code Turns Any MCP Server Into a First-Class Tool (5-State Machine, 7 Transports, 27-Hour Timeouts)

Model Context Protocol (MCP) is the hot new standard for connecting AI agents to external tools. But most implementations treat it as a simple "connect and call" wrapper.

Claude Code does something far more interesting. After reverse-engineering its source code (v2.1.88, ~512K lines of TypeScript), I found an MCP integration layer that handles five connection states, seven transport mechanisms, session expiration detection, and a timeout default that made me double-check the math.

Here's how it actually works.


1. Five Connection States, Not Two

Most MCP clients track connections as binary: connected or not. Claude Code implements a five-state machine:

ConnectedMCPServer   -> Operational, tools available
FailedMCPServer      -> Connection failed  
NeedsAuthMCPServer   -> OAuth/auth required
PendingMCPServer     -> Reconnecting
DisabledMCPServer    -> User manually disabled
Enter fullscreen mode Exit fullscreen mode

Each state determines:

  • Whether tools appear in the LLM prompt
  • UI rendering behavior
  • Retry strategy

The NeedsAuth state is particularly clever. When a server returns 401, Claude Code doesn't just fail -- it transitions to needs-auth and provides a special McpAuthTool that lets the user trigger re-authentication without restarting the session.

Why this matters: In production, MCP connections fail in different ways. A server that needs OAuth re-auth shouldn't retry the same way as one with a network error. The five-state design lets each failure mode get targeted recovery.


2. Seven Transport Mechanisms

MCP is transport-agnostic by spec. Claude Code implements all seven:

Local:

  • stdio -- Standard I/O pipes (most common, zero config)
  • sdk -- In-process with zero serialization overhead (Agent SDK mode)

Remote:

  • sse -- Server-Sent Events + HTTP POST
  • http -- Streamable HTTP (MCP spec recommended)
  • ws -- WebSocket (full-duplex)

IDE-Specific:

  • sse-ide -- IDE extension variant (Windows compat)
  • ws-ide -- IDE extension variant

The sdk transport is interesting -- when running inside the Claude Agent SDK, MCP tools call directly into the process with no serialization. This is why Claude Code feels faster in SDK mode for MCP-heavy workflows.


3. The Template Override Pattern (Instead of Inheritance)

Here's how Claude Code bridges MCP tools into its internal tool system:

export const MCPTool = buildTool({
  isMcp: true,
  name: 'mcp',  // Overridden at runtime
  maxResultSizeChars: 100_000,  // 100KB hard limit

  get inputSchema() {
    return z.object({}).passthrough()  // Accept ANY input
  },

  async checkPermissions() {
    return { behavior: 'passthrough' }  // Defer to invocation time
  },
})
Enter fullscreen mode Exit fullscreen mode

Instead of creating a class hierarchy for each MCP tool, Claude Code uses object spread + method override to stamp out lightweight tool wrappers. Each connected server's tools become:

mcp__{server}__{tool}
Enter fullscreen mode Exit fullscreen mode

For example: mcp__github__create_issue, mcp__postgres__query.

The double-underscore namespacing prevents collisions between tools from different servers that might share names.


4. The 27.8-Hour Timeout (Not a Bug)

// Default MCP tool timeout
const MCP_TOOL_TIMEOUT = 100_000_000  // ~27.8 hours
Enter fullscreen mode Exit fullscreen mode

I initially thought this was a mistake. It's not.

The reasoning: MCP tools might trigger long-running operations -- compiling a large project, running a test suite, processing a dataset. Claude Code's philosophy is: don't timeout on behalf of the user. If someone's MCP tool takes 20 minutes, that's their tool's business.

You can override this with the MCP_TOOL_TIMEOUT environment variable. But the default says: we trust MCP servers to manage their own lifetimes.

Compare this to the 15-second timeout they use for their own API calls. The asymmetry is intentional -- they control their API behavior, but they can't predict MCP server behavior.


5. Session Expiration: Why Two Signals Are Required

MCP defines a session expiration signal, but Claude Code adds a safety check that's cleverly paranoid:

function isMcpSessionExpiredError(error: Error): boolean {
  const httpStatus = error.code
  if (httpStatus !== 404) return false

  return (
    error.message.includes('"code":-32001') ||
    error.message.includes('"code": -32001')
  )
}
Enter fullscreen mode Exit fullscreen mode

It requires both HTTP 404 and JSON-RPC error code -32001 before treating a session as expired.

Why not just 404? Too common -- wrong URL, server down, routing error. Why not just -32001? No standard meaning outside this context.

Only together do they mean "this specific session no longer exists on this server." It's a pattern I now use in my own code: require two independent signals before taking irreversible action.


6. The 15-Minute Auth Cache

When an MCP server rejects authentication, Claude Code caches the failure:

const MCP_AUTH_CACHE_TTL_MS = 15 * 60 * 1000 // 15 minutes
Enter fullscreen mode Exit fullscreen mode

For the next 15 minutes, it skips connection attempts entirely and marks the server as needs-auth. This prevents:

  • Hammering a server with doomed connection attempts
  • Slowing down the agent loop with repeated OAuth failures
  • Confusing the user with repeated error messages

After 15 minutes, it tries again -- in case the user re-authenticated in another tab or the server rotated tokens.


7. LRU Tool Caching (20 Server Capacity)

When a server connects, Claude Code fetches its tool list:

// Fetch tools with Unicode sanitization (security!)
const result = await client.request(
  { method: 'tools/list' },
  ListToolsResultSchema,
)
const tools = recursivelySanitizeUnicode(result.tools)
Enter fullscreen mode Exit fullscreen mode

The recursivelySanitizeUnicode call is a security measure -- MCP servers could inject invisible Unicode characters into tool descriptions that confuse the LLM. Claude Code strips these before they reach the prompt.

Tool lists are cached in an LRU with capacity for 20 servers. Beyond that, the least-recently-used server's tools get evicted and re-fetched on next access.


8. MCP Instructions Break the Prompt Cache

Claude Code uses an aggressive caching strategy for its system prompt -- everything is cached across turns to save tokens. Except MCP instructions:

Static Zone (cached):
  Identity, rules, tool guides...

Dynamic Zone (cache-breaking):
  Memory content
  Environment info  
  mcp_instructions  <-- Uses DANGEROUS_uncachedSystemPromptSection
Enter fullscreen mode Exit fullscreen mode

The DANGEROUS_ prefix isn't decoration -- it's a naming convention that forces developers to acknowledge they're breaking the cache. MCP instructions get this treatment because servers can connect/disconnect mid-conversation. If they were cached, the LLM might reference tools that no longer exist.

Token cost: Every MCP instruction change invalidates the cache for that section. This is why having too many MCP servers with long instructions can noticeably slow things down.


The Pattern Worth Stealing

The architectural insight that ties all of this together: Claude Code treats MCP as an untrusted, unreliable external dependency -- and builds accordingly.

  • Five states for five types of failure
  • Two signals required for session expiration
  • Unicode sanitization on tool descriptions
  • Auth failure caching to prevent retry storms
  • 100KB result size limits
  • Passthrough permissions (checked at call time, not registration)

This is what production-grade MCP integration looks like. Not "connect and call," but "connect, verify, cache, sanitize, isolate, and gracefully degrade."


These patterns come from Chapter 10 of my book, where I cover the full MCP + plugin architecture across 40+ pages with code examples.

Get the complete deep dive:

12 chapters covering the full architecture -- from the core agent loop to the permission system to MCP integration.


These patterns come from my deep-dive into Claude Code's actual source code (v2.1.88). I wrote 12 chapters covering the complete architecture — from the core loop to multi-agent coordination.

📖 Read Chapter 1 free — "What Is an AI Agent? From ChatBot to Claude Code"

If you like it, the full book is available with 50%% off for early readers:

📘 Claude Code from the Inside Out (English) — use code LAUNCH50 for $4.99

📕 深入浅出 Claude Code (中文) — use code LAUNCH50CN for $4.99

Previously in this series:

Top comments (0)