Part 3 of 6 — MCP Article Series
← Part 2: What MCP Is: How AI Agents Connect to Real Systems
At its core, MCP is a structured conversation between an AI and external systems. The AI asks what is available. The system responds in a format both sides understand. The AI requests what it needs. The system returns the result.
That is the mental model for the rest of this article.
Part 2 explained what MCP is: the components (Host, Client, Server), the three primitives (Tools, Resources, Prompts), and the control planes that govern them. This article shows how those pieces actually interact — first as a system map, then as message flow, and finally as wire-level protocol messages.
The End-to-End Request Flow
Once the pieces are in place, this is what happens when a user asks a question that requires an external system.
The diagram numbers each message individually. The six steps below group those messages into higher-level phases:
- User -> Host. A customer asks: "What is the status of order #4521?"
- Host -> LLM. The Host passes the question to the language model, along with context.
- LLM -> Host. The model decides it needs the check_order_status capability. It does not call the tool itself — it tells the Host what to call and with what arguments.
- Host -> MCP Client -> MCP Server. The Host routes the request through the appropriate MCP Client, which sends a JSON-RPC request to the MCP Server that wraps the order database.
- MCP Server -> Real System -> MCP Server. The Server translates the request into a native database query or API call, retrieves the result, and formats it back into the MCP response structure.
- MCP Server -> MCP Client -> Host -> LLM -> User. The response travels back up the chain. The LLM uses the result to compose a natural-language answer: "Order #4521 shipped yesterday and is expected to arrive Thursday."
The key insight is separation of concerns. The LLM never touches the database. The Server never reasons about language. The Client never decides which tool to use. Each layer does one thing, and the protocol keeps them coordinated.
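The six steps can be compressed into a runnable sketch. Everything below is a stand-in (the function names, the hard-coded model decision, the in-memory "database" are illustrative, not MCP SDK code); the point is that each layer only ever talks to its neighbor:

```python
def fake_llm(question):
    """Steps 2-3: the model only *names* the capability and its arguments.
    It never executes anything itself."""
    return {"tool": "check_order_status", "arguments": {"order_id": "4521"}}

def mcp_server(rpc_request):
    """Step 5: the Server translates the JSON-RPC call into a native lookup."""
    order_id = rpc_request["params"]["arguments"]["order_id"]
    orders = {"4521": "shipped yesterday, arriving Thursday"}  # stand-in database
    return {"jsonrpc": "2.0", "id": rpc_request["id"],
            "result": {"content": [{"type": "text",
                                    "text": f"Order {order_id}: {orders[order_id]}"}]}}

def host(question):
    """Steps 1, 4, 6: route the model's decision through the Client as JSON-RPC."""
    decision = fake_llm(question)
    rpc = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": decision["tool"], "arguments": decision["arguments"]}}
    response = mcp_server(rpc)  # in reality: Client -> transport -> Server
    return response["result"]["content"][0]["text"]

print(host("What is the status of order #4521?"))
# Order 4521: shipped yesterday, arriving Thursday
```

Note that `host` never touches the orders dictionary and `mcp_server` never sees the user's question; that is the separation of concerns made concrete.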
JSON-RPC Basics - Just Enough to Read the Wire
Every MCP message is JSON-RPC 2.0. You only need three message types to read the rest of this article:
- Request - has an id and expects a response.
- Response - uses the same id and carries the result or an error.
- Notification - has no id and expects no response.
That is enough to follow every example below.
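The three message types differ only in the presence of an id. A minimal sketch in Python (the helper names here are mine, not part of any SDK):

```python
def request(id_, method, params=None):
    """A Request carries an id and expects a Response with the same id."""
    msg = {"jsonrpc": "2.0", "id": id_, "method": method}
    if params is not None:
        msg["params"] = params
    return msg

def notification(method, params=None):
    """A Notification omits the id and expects no reply."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg

def is_notification(msg):
    """The presence or absence of 'id' is the entire distinction."""
    return "id" not in msg

print(is_notification(request(1, "tools/list")))                   # False
print(is_notification(notification("notifications/initialized")))  # True
```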
MCP Protocol Lifecycle
No tool gets called until the Client and Server have agreed on what each side can do. This negotiation happens once, at the start of every connection.
The lifecycle is easiest to think about in three phases: initialize, discover/invoke, and notify.
Phase 1: Initialize
The Client sends the first message. It declares its protocol version, identifies itself, and announces what it can handle.
Initialize request:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "roots": { "listChanged": true },
      "sampling": {}
    },
    "clientInfo": {
      "name": "ecommerce-ai-assistant",
      "version": "1.2.0"
    }
  }
}
Two things to notice. First, this is standard JSON-RPC 2.0 — MCP does not invent its own message format. Second, the capabilities object is not decorative. It is a contract. The Client is saying: "I can handle root list changes, and I support sampling requests from the server." Sampling is a client-side MCP feature that lets a server ask the host's model for a completion; this article stays focused on the server-side flow.
Phase 2: Negotiate
The Server responds with its own capabilities.
Initialize response:
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": { "subscribe": true }
    },
    "serverInfo": {
      "name": "order-database-server",
      "version": "2.0.1"
    }
  }
}
The Server declares that it offers tools (and that the tool list can change at runtime), and that it supports resource subscriptions. If a capability is not declared here, it does not exist for this connection.
This is capability negotiation. Intentions are declared upfront, not assumed at runtime.
Phase 3: Ready
The Client sends an initialized notification to confirm the handshake is complete.
{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}
Notice: there is no id field. This is a notification, not a request — it does not expect a response. The connection is now ready. Tools can be discovered. Requests can flow.
Most integration failures happen because one side assumes capabilities the other does not have. MCP prevents this by making both sides declare their intentions before any work begins.
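The three phases can be reduced to a small client-side sketch. The helper functions are illustrative (no SDK), and the server response is abridged from the example above:

```python
def build_initialize(client_name, client_version):
    """Phase 1: the Client declares its protocol version and capabilities."""
    return {"jsonrpc": "2.0", "id": 1, "method": "initialize",
            "params": {"protocolVersion": "2025-06-18",
                       "capabilities": {"roots": {"listChanged": True}, "sampling": {}},
                       "clientInfo": {"name": client_name, "version": client_version}}}

def server_supports(initialize_result, capability):
    """Phase 2: a capability exists for this connection only if declared here."""
    return capability in initialize_result.get("capabilities", {})

# The result field of the Server's initialize response, as received above.
result = {"protocolVersion": "2025-06-18",
          "capabilities": {"tools": {"listChanged": True},
                           "resources": {"subscribe": True}},
          "serverInfo": {"name": "order-database-server", "version": "2.0.1"}}

print(server_supports(result, "tools"))    # True
print(server_supports(result, "prompts"))  # False: never declared, so it does not exist

# Phase 3: confirm readiness with a notification (note: no id).
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
```

Gating every later call on `server_supports` is exactly the discipline that prevents one side from assuming capabilities the other never declared.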
Tool Discovery and Execution
With the connection established, the Client can now ask the Server what tools it offers. This is a two-step pattern: discover, then call.
Step 1 — Discover: tools/list
The full tools/list request is minimal — the Client is simply asking what the Server currently exposes:
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list"
}
In production, tools/list also supports pagination via cursors for servers exposing many tools.
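Cursor pagination can be drained with a simple loop: pass the `nextCursor` from one result as the `cursor` of the next request until none is returned. A sketch, where `send` is a stand-in for whatever issues a JSON-RPC request and returns its result:

```python
def list_all_tools(send):
    """Follow nextCursor until the Server stops returning one."""
    tools, cursor, rpc_id = [], None, 2
    while True:
        params = {"cursor": cursor} if cursor else {}
        result = send({"jsonrpc": "2.0", "id": rpc_id,
                       "method": "tools/list", "params": params})
        tools.extend(result["tools"])
        cursor = result.get("nextCursor")
        if cursor is None:
            return tools
        rpc_id += 1

# Stand-in transport returning two pages.
pages = {None: {"tools": [{"name": "check_order_status"}], "nextCursor": "p2"},
         "p2": {"tools": [{"name": "cancel_order"}]}}
send = lambda req: pages[req["params"].get("cursor")]
print([t["name"] for t in list_all_tools(send)])
# ['check_order_status', 'cancel_order']
```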
The Server responds with tool definitions that include a name, title, description, and input schema. Here is a simplified example from an ecommerce order server:
{
  "name": "check_order_status",
  "title": "Check Order Status",
  "description": "Retrieve the current status, shipping details, and estimated delivery date for a customer order. Use this when a customer asks about their order, tracking information, or delivery timeline.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "order_id": {
        "type": "string",
        "description": "The order identifier"
      }
    },
    "required": ["order_id"]
  }
}
Read that description carefully. It is not a label — it is the LLM’s decision interface. The model uses this text to decide when to call the tool, what to pass to it, and why it applies to the current situation.
This is one of the most common sources of problems in MCP deployments. The tool works perfectly. The code is correct. But the model never calls it — because the description was not written for an LLM to reason about.
The latest MCP spec also supports outputSchema for tools, which is useful when you want typed, validated tool outputs in production.
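As a sketch of what that buys you: a tool definition carrying an `outputSchema` (the field contents here are illustrative) lets the client verify a result before handing it to the model, even with a tiny hand-rolled check:

```python
# Hypothetical extension of the check_order_status definition with an outputSchema.
tool = {
    "name": "check_order_status",
    "inputSchema": {"type": "object",
                    "properties": {"order_id": {"type": "string"}},
                    "required": ["order_id"]},
    "outputSchema": {"type": "object",
                     "properties": {"status": {"type": "string"},
                                    "eta": {"type": "string"}},
                     "required": ["status"]},
}

def missing_required(schema, payload):
    """Minimal stand-in for a real JSON Schema validator: report absent required fields."""
    return [f for f in schema.get("required", []) if f not in payload]

print(missing_required(tool["outputSchema"], {"status": "shipped", "eta": "2026-03-26"}))  # []
print(missing_required(tool["outputSchema"], {"eta": "2026-03-26"}))  # ['status']
```

In production you would use a full JSON Schema validator rather than this stub, but the contract is the same: the schema is declared once, and every result can be checked against it.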
Step 2 — Execute: tools/call
The LLM has decided it needs the tool. The Client sends a tools/call request with the tool name and arguments. The Server executes the underlying database query and responds with structured content the LLM can reason about directly.
tools/call request:
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "check_order_status",
    "arguments": {
      "order_id": "4521"
    }
  }
}
tools/call response:
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Order 4521: shipped 2026-03-22, estimated delivery 2026-03-26"
      }
    ],
    "isError": false
  }
}
Newer MCP tool results can also include structuredContent alongside text, which helps when a client needs typed, machine-friendly data in addition to human-readable output.
The response travels back up the chain to the Host, which passes it to the LLM. The model composes the natural-language reply the user actually sees. The Server’s job ends the moment it returns structured data — reasoning about what to say is not its concern.
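On the Server side, handling tools/call is a small dispatch problem: look up the named tool, run its handler, and wrap whatever happens in a result. A sketch (the registry and function names are illustrative, not from any MCP SDK); note that a tool-level failure is reported inside the result via `isError`, not as a JSON-RPC protocol error:

```python
# Registry mapping tool names to handler functions.
HANDLERS = {
    "check_order_status": lambda args: f"Order {args['order_id']}: shipped 2026-03-22",
}

def handle_tools_call(req):
    """Dispatch a tools/call request to its handler and wrap the outcome."""
    name = req["params"]["name"]
    args = req["params"].get("arguments", {})
    try:
        text = HANDLERS[name](args)
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    except Exception as exc:
        # Execution failures stay inside the result, flagged with isError,
        # so the model can see and reason about them.
        result = {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    return {"jsonrpc": "2.0", "id": req["id"], "result": result}

resp = handle_tools_call({"jsonrpc": "2.0", "id": 3, "method": "tools/call",
                          "params": {"name": "check_order_status",
                                     "arguments": {"order_id": "4521"}}})
print(resp["result"]["content"][0]["text"])
# Order 4521: shipped 2026-03-22
```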
Notifications and Dynamic Systems
Static tool lists are fine for a demo. In production, things change. A new tool gets deployed. A resource gets updated. A server's capabilities shift based on time, configuration, or the connected user's permissions.
MCP handles this through notifications — one-way messages from the Server that tell the Client something has changed without requiring a request.
When a Server deploys a new tool while a connection is active, it sends:
{
  "jsonrpc": "2.0",
  "method": "notifications/tools/list_changed"
}
The Client reacts by calling tools/list again to get the updated list. It does not have to disconnect, re-initialize, or poll on a timer. The Server pushed the change.
This is what makes multi-server systems practical. An AI assistant connected to five MCP Servers does not need five polling loops. Each Server pushes changes as they happen, and the Client responds only when something actually changes.
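The Client-side reaction is small enough to sketch: hold a cached tool list, and refresh it only when the matching notification arrives. The class and names below are illustrative, not SDK code:

```python
class ToolCache:
    """Caches a Server's tool list and refreshes it on list_changed, never on a timer."""

    def __init__(self, fetch_tools):
        self.fetch_tools = fetch_tools  # stand-in for a real tools/list round trip
        self.tools = fetch_tools()

    def on_message(self, msg):
        """Notifications carry no id; refresh only for the matching method."""
        if "id" not in msg and msg.get("method") == "notifications/tools/list_changed":
            self.tools = self.fetch_tools()

catalog = [["check_order_status"]]
cache = ToolCache(lambda: list(catalog[0]))
catalog[0].append("cancel_order")  # the Server deploys a new tool mid-connection...
cache.on_message({"jsonrpc": "2.0", "method": "notifications/tools/list_changed"})
print(cache.tools)
# ['check_order_status', 'cancel_order']
```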
MCP as a Control Plane
Everything in this article — the lifecycle, the tool discovery, and the notifications — might look like a standard client-server API. The mechanics are similar. The difference is what they add up to.
A traditional API integration is static by nature. The developer reads the documentation, writes the code, and the integration is hardcoded from that point forward. When the downstream system changes, a developer changes the integration. There is no mechanism for the system to surface its own capability changes at runtime.
MCP shifts this in a specific direction: capability is declared at runtime, not hardcoded at build time. The AI discovers what tools exist, reasons about when to use them, and responds to changes in the server's capability list without a deployment cycle in between. That is the architectural difference that makes MCP a coordination layer rather than just another API tier.
In practice, this means the decisions that matter most are not in the protocol — they are in how you design the servers it connects. Which tools you expose, how narrowly you scope them, and what you put in the descriptions: these choices determine whether an AI-connected system behaves predictably or not.
Key Takeaways
- Tool descriptions are the LLM's decision interface. Write them as documentation, not labels. An unclear description means the model never calls your tool, regardless of how well it is implemented.
- Capability negotiation happens before any tool is called. Intentions are declared upfront, not assumed at runtime. This prevents an entire class of integration failures.
- Notifications eliminate polling. Servers push changes. Clients react. This is what makes multi-server systems practical at scale.
MCP reduces the cost of connecting systems. It does not reduce the responsibility of designing them correctly.
Next: Part 4 steps back from the mechanics and answers the design question — when does MCP belong in your stack, and when does it not?