If you've ever tried to build a truly interactive application, especially one that talks to a modern AI, you know the struggle. Simple request-response cycles feel clunky for long-running tasks. WebSockets are powerful but can be overkill and a headache to manage. So, how do you build something that feels as fluid and responsive as a native app, but over plain old HTTP?
Today, we're going on a deep, deep dive into a protocol designed to solve this exact problem: the MCP Streamable HTTP protocol. We're not just scratching the surface. We're going to tear down the client and server implementations from the official TypeScript SDK, look at the code, and understand the "why" behind every design choice.
By the end of this (admittedly very long) post, you'll have a rock-solid understanding of how to build robust, stateful, and even fault-tolerant applications on top of HTTP. Let's get started!
Part 1: The Big Picture - Why This Protocol Even Exists
The Philosophy: More Than Just Requests and Responses
At its heart, the Streamable HTTP protocol is built on a clever, dual-channel model that uses standard HTTP methods in unconventional ways. Think of it like a restaurant's communication system:
- The Command Channel (HTTP POST 🗣️): This is your direct line to the kitchen. You use it to place an order (send a request or notification). The magic here is that the waiter (the server) can talk back to you on that same line while your order is being prepared. They might give you progress updates ("The chef is searing the steak now!") or even partial results ("Here are your appetizers while you wait."). This is all handled within the response to your single `POST` request, which can itself be a stream of events.
- The Announcement Channel (HTTP GET 📢): This is the restaurant's PA system. You tune in once (by making a long-lived `GET` request) and then you can hear any general announcements from the restaurant ("The special for tonight is..."). These are unsolicited, server-initiated events that aren't tied to any specific order you placed.
This design gives us the best of both worlds: the familiar, direct nature of `POST` for commands, and the persistent, low-latency nature of a `GET`-based Server-Sent Events (SSE) stream for asynchronous updates. The entire system is brought to life by the implementations in `client/streamableHttp.ts` and `server/streamableHttp.ts`.
  
  
The Brains of the Operation: The Abstract Protocol Class 🧠
Before we even get to the HTTP part, we need to understand the core logic layer: the abstract `Protocol` class found in `shared/protocol.ts`. Think of the HTTP transport layers as the plumbing (the pipes and wires), but this `Protocol` class is the brain that decides what flows through them. It handles the nitty-gritty of JSON-RPC 2.0 framing, request lifecycles, and reliability.
How Requests and Responses are Matched
When your application code calls client.request(...), how does it know which response belongs to it, especially when multiple requests are happening at once?
It all starts with a unique ID. The `Protocol` class maintains a counter, `_requestMessageId`, and assigns a new ID to every outgoing request. It then creates a Promise and cleverly stores its `resolve` and `reject` functions in a Map called `_responseHandlers`, using the message ID as the key.
Here's that critical piece of code. It's the moment the client makes a promise it intends to keep.
// From: @modelcontextprotocol/typescript-sdk/src/shared/protocol.ts
// The client is setting a trap. It's saying, "When a response with `messageId`
// arrives, execute this function to either resolve or reject my promise."
this._responseHandlers.set(messageId, (response) => {
  // First, check if the request was already cancelled by our side.
  if (options?.signal?.aborted) {
    return;
  }
  // If the response is an error, reject the promise.
  if (response instanceof Error) {
    return reject(response);
  }
  // If it's a success, parse the result against the expected schema and resolve!
  try {
    const result = resultSchema.parse(response.result);
    resolve(result);
  } catch (error) {
    // If the server's response doesn't match our expected shape, that's an error too.
    reject(error);
  }
});
When a message arrives from the server, the transport's `onmessage` handler passes it to the `Protocol` class, which acts as a triage nurse. If the message has a `result` or `error` field, it knows it's a response and calls `_onresponse`. This function is the other half of the trap: it grabs the ID from the response, finds the corresponding handler in `_responseHandlers`, and springs it, fulfilling the promise.
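This trap-and-spring pattern can be sketched in a few lines. To be clear, this is a simplified illustration, not the SDK's actual code — the names `PendingRequests` and `JsonRpcResponse` are invented for the example:

```typescript
// A simplified illustration of the request/response matching pattern.
type JsonRpcResponse = { id: number; result?: unknown; error?: { message: string } };

class PendingRequests {
  private nextId = 0;
  private handlers = new Map<number, (response: JsonRpcResponse) => void>();

  // Assign a fresh ID, stash the promise's resolve/reject keyed by that ID,
  // then hand the ID to the caller so it can actually transmit the message.
  send(transmit: (id: number) => void): Promise<unknown> {
    const id = this.nextId++;
    const promise = new Promise<unknown>((resolve, reject) => {
      this.handlers.set(id, (response) => {
        this.handlers.delete(id);
        if (response.error) reject(new Error(response.error.message));
        else resolve(response.result);
      });
    });
    transmit(id);
    return promise;
  }

  // Called by the transport when a response arrives: find the handler by ID
  // and spring the trap, fulfilling the original promise.
  onResponse(response: JsonRpcResponse): void {
    this.handlers.get(response.id)?.(response);
  }
}
```

Because each pending request owns its own map entry, any number of requests can be in flight at once without their responses getting crossed.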
Handling In-Flight Cancellations Gracefully
What if a user gets impatient and wants to cancel a long-running operation? The protocol has a clean way to handle this using the standard AbortSignal.
- The client application triggers an `AbortSignal`.
- The `request()` method catches this, rejects its promise locally, and, crucially, sends a `notifications/cancelled` message to the server.
- The server's `Protocol` instance has a pre-registered handler specifically for this notification. This handler looks up the task's `AbortController` (which it stored when the request first arrived) and calls `.abort()`, signaling the server-side code to stop its work.
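The server side of that flow can be sketched like this. The `notifications/cancelled` method name comes from the protocol itself; the `ServerSideTasks` class and its wiring are invented for illustration:

```typescript
// An illustrative sketch of server-side cancellation bookkeeping.
type CancelledNotification = {
  method: "notifications/cancelled";
  params: { requestId: number };
};

class ServerSideTasks {
  // One AbortController per in-flight request, stored when the request arrives.
  private controllers = new Map<number, AbortController>();

  // Register a new task; the handler code watches the returned signal.
  start(requestId: number): AbortSignal {
    const controller = new AbortController();
    this.controllers.set(requestId, controller);
    return controller.signal;
  }

  // The pre-registered handler for notifications/cancelled: look up the
  // matching controller and abort it, telling the task to stop its work.
  onCancelled(notification: CancelledNotification): void {
    this.controllers.get(notification.params.requestId)?.abort();
    this.controllers.delete(notification.params.requestId);
  }
}
```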
Keeping Long-Running Tasks Alive with Timeouts
To prevent requests from hanging forever, the `Protocol` class has a smart timeout system. When a request is made, it starts a timer. The real magic, however, is in the `resetTimeoutOnProgress` option. For a long AI task, you don't want it to time out just because it's taking a while. If this option is `true`, every time the server sends a progress notification, the client resets the timeout timer. This ensures that as long as the server is showing signs of life, the client will wait patiently.
// From: @modelcontextprotocol/typescript-sdk/src/shared/protocol.ts
// This method is called when a progress notification arrives.
private _resetTimeout(messageId: number): boolean {
    const info = this._timeoutInfo.get(messageId);
    if (!info) return false;
    // It even checks against a `maxTotalTimeout` so it can't be extended forever.
    const totalElapsed = Date.now() - info.startTime;
    if (info.maxTotalTimeout && totalElapsed >= info.maxTotalTimeout) {
      this._timeoutInfo.delete(messageId);
      throw new McpError(/* ... */);
    }
    // Clear the old timer and start a new one!
    clearTimeout(info.timeoutId);
    info.timeoutId = setTimeout(info.onTimeout, info.timeout);
    return true;
}
Part 2: The Client's Perspective - Making the Connection
Now let's dive into the concrete client implementation in `client/streamableHttp.ts`.
The Client Handshake: Connection, Init, and Auth
A client's journey to a full connection is a precise dance:
- The `initialize` POST: The first thing a client does is `POST` an `initialize` message. This is the formal handshake where the client tells the server who it is and what it can do.
- The `202 Accepted` Trigger: The server responds with an HTTP `202 Accepted`. This is the signal! The client's `send()` method sees this and immediately knows it's time to open the second channel.
- The Asynchronous GET: The client immediately calls `_startOrAuthSse()`, which fires off a long-lived `GET` request with an `Accept: text/event-stream` header. This is the client opening its ear for the server's PA system. If the server doesn't support this (and returns a `405 Method Not Allowed`), the client gracefully carries on without it.
- The Auth Flow: If at any point the server responds with `401 Unauthorized`, the client's `authProvider` kicks in. It might try to refresh a token, or if it has no credentials, it will trigger `redirectToAuthorization`, sending the user off to log in. Once they return, the application calls `finishAuth()` to complete the OAuth2 flow and get the tokens needed to retry the connection.
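The status-code decisions in this dance can be summarized as a pair of tiny functions. This is a sketch of the logic described above, not SDK API — both function names are made up:

```typescript
// Illustrative names, not SDK API: a sketch of the handshake's status logic.
type NextStep = "open-sse-stream" | "skip-sse" | "reauthorize" | "handle-response" | "fail";

// Interpreting the response to the initialize POST.
function afterInitializePost(status: number): NextStep {
  if (status === 202) return "open-sse-stream"; // the 202 Accepted trigger
  if (status === 401) return "reauthorize"; // hand off to the authProvider
  if (status === 200) return "handle-response"; // body carries the result directly
  return "fail";
}

// Interpreting the response to the long-lived SSE GET.
function afterSseGet(status: number): NextStep {
  if (status === 200) return "open-sse-stream"; // announcement channel is live
  if (status === 401) return "reauthorize";
  if (status === 405) return "skip-sse"; // server has no GET channel; carry on
  return "fail";
}
```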
  
  
The Client's Gateway: A Forensic Look at the send() Method
Every single message the client sends goes through the `send()` method via `POST`. The true genius of the client is how it interprets the response to this `POST`.
-   If status is `202 Accepted`: This is the "message received, thanks" signal. If the message was `initialize`, this is the cue to start the SSE `GET` stream, as we saw above.
-   If status is `200 OK` and Content-Type is `application/json`: This is a simple, synchronous-style response. The client parses the JSON and is done with this transaction.
-   If status is `200 OK` and Content-Type is `text/event-stream`: This is where it gets really cool. The `POST` request's response itself is a stream. The client pipes this stream into `_handleSseStream` to process the progress updates and final result for that specific request.
This logic is the heart of the client's flexibility.
// From: @modelcontextprotocol/typescript-sdk/src/client/streamableHttp.ts
// This block in send() decides what to do based on the server's response.
if (response.status === 202) {
    // If the server accepted our initialization...
    if (isInitializedNotification(message)) {
      // ...it's time to open the general announcement (GET) channel!
      this._startOrAuthSse({ resumptionToken: undefined }).catch(err => this.onerror?.(err));
    }
    return;
}
const contentType = response.headers.get("content-type");
if (hasRequests) {
    if (contentType?.includes("text/event-stream")) {
        // The POST response is a stream! Handle it accordingly.
        this._handleSseStream(response.body, { onresumptiontoken }, false);
    } else if (contentType?.includes("application/json")) {
        // The POST response is a simple JSON object. Parse it.
        const data = await response.json();
        // ... process data ...
    }
}
  
  
From Bytes to Messages: Parsing Streams with _handleSseStream
This method is the designated parser for all SSE streams, whether from the main `GET` or a streaming `POST`. It sets up a beautiful, modern stream processing pipeline:
Raw Bytes (`ReadableStream<Uint8Array>`) → Decoded Text (`TextDecoderStream`) → Parsed Events (`EventSourceParserStream`)
It then reads from the end of this pipeline, taking the `event.data` field (which is the JSON payload), parsing it, and passing it back to the Protocol layer's main `onmessage` callback for routing. Simple, efficient, and non-blocking.
// From: @modelcontextprotocol/typescript-sdk/src/client/streamableHttp.ts
// This is a masterclass in modern stream processing in JavaScript.
const reader = stream
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(new EventSourceParserStream())
  .getReader();
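Reading from the far end of that pipeline is then just an async loop. The sketch below assumes each parsed event's `data` field carries one JSON-RPC message, as described above; `pumpEvents` and its interfaces are invented names, not the SDK's:

```typescript
// A sketch of consuming the parsed-event end of the pipeline.
interface ParsedEvent { data: string }
interface EventReader {
  read(): Promise<{ done: boolean; value?: ParsedEvent }>;
}

async function pumpEvents(
  reader: EventReader,
  onmessage: (message: unknown) => void,
): Promise<void> {
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break; // stream closed (or errored upstream)
    // Each SSE event's data field carries one JSON-RPC message.
    onmessage(JSON.parse(value!.data));
  }
}
```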
Surviving the Chaos: Session Management and Resumability 🛡️
This is where the protocol truly shines, providing statefulness and recovery from network failures.
Session Management: The client grabs the `mcp-session-id` header from the very first response and stores it. From then on, every subsequent request includes this header, telling the server, "Hey, it's me again."
Connection Resumability: This is the critical flow for fault tolerance.
- Capture the Token: When handling a stream, if an SSE event has an `id` field, that's our resumption token! The client captures it as `lastEventId` and calls the `onresumptiontoken` callback so the application can save it somewhere safe (like `localStorage`).
- Detect the Disconnect: If the network drops, the stream will error out. The `catch` block in `_handleSseStream` is triggered.
- Schedule a Reconnect: Instead of giving up, the `catch` block calls `_scheduleReconnection`, which uses an exponential backoff delay to plan its next attempt.
- Attempt Resumption: After the delay, it calls `_startOrAuthSse` again, but this time it passes in the `lastEventId` it saved.
- Send the Magic Header: `_startOrAuthSse` then creates a new `GET` request, but with a special `last-event-id` header containing the token. This tells the server exactly where the client left off, allowing it to replay any missed messages.
It's a complete, closed-loop system for recovering from connection failures.
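The backoff step of that loop might look like this. The doubling-with-a-cap formula is a typical exponential backoff; the specific numbers and function names here are illustrative, not the SDK's defaults:

```typescript
// Illustrative backoff numbers, not the SDK's defaults.
function reconnectionDelay(attempt: number, initialMs = 1000, maxMs = 30000): number {
  // Double the delay on each attempt, but never wait longer than maxMs.
  return Math.min(initialMs * 2 ** attempt, maxMs);
}

// Plan the next attempt; `reconnect` would re-run _startOrAuthSse-style logic
// with the saved lastEventId so the server can replay missed messages.
function scheduleReconnection(
  attempt: number,
  reconnect: (lastEventId?: string) => void,
  lastEventId?: string,
): ReturnType<typeof setTimeout> {
  return setTimeout(() => reconnect(lastEventId), reconnectionDelay(attempt));
}
```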
Part 3: The Server's Side of the Story
Now let's look at the other side of the conversation: the server implementation in `server/streamableHttp.ts`.
  
  
The Server's Front Door: handleRequest and the Transport Lifecycle
The `simpleStreamableHttp.ts` example server shows a beautiful pattern for managing stateful connections. It maintains a global `transports` map.
-   When a request comes in, it checks for an `mcp-session-id` header.
-   If the ID exists in the map, it reuses the existing `StreamableHTTPServerTransport` instance for that session. State is maintained!
-   If there's no ID but the message is `initialize`, it knows a new client is connecting. It creates a new transport instance. The key is the `onsessioninitialized` callback: once the new transport generates its session ID, this callback fires and saves the new transport into the global map.
This logic is the core of how the server manages multiple, distinct client sessions concurrently.
// From: @modelcontextprotocol/typescript-sdk/src/examples/server/simpleStreamableHttp.ts
// This logic from the example server is the key to stateful session management.
if (sessionId && transports[sessionId]) {
  // Found an existing session, let's reuse its transport.
  transport = transports[sessionId];
} else if (!sessionId && isInitializeRequest(req.body)) {
  // A new client is initializing!
  transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: () => randomUUID(),
    // This callback is the magic glue. It links the new session ID to its transport instance.
    onsessioninitialized: (newlyCreatedSessionId) => {
      console.log(`Session initialized with ID: ${newlyCreatedSessionId}`);
      transports[newlyCreatedSessionId] = transport;
    }
  });
  // ...
}
  
  
Intelligent Routing: The Server's send() Method
The server's `send()` method is the mirror image of the client's and is responsible for routing outbound messages to the correct channel. The deciding factor is the `relatedRequestId`.
-   Case 1: General Announcement. If `send()` is called for a notification without a `relatedRequestId`, the server knows it's a general, server-initiated event. It looks up the `ServerResponse` object for the long-lived `GET` stream and writes the message there.
-   Case 2: Specific Response. If `send()` is called for a message that is a response (it has an `id`) or has a `relatedRequestId`, the server knows it belongs to a specific `POST` transaction. It uses its internal mappings (`_requestToStreamMapping`) to find the exact `ServerResponse` object associated with that original `POST` and writes the message to that dedicated stream.
This ensures that progress updates for Tool A don't accidentally get sent to the response stream for Tool B.
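The routing rule boils down to a single lookup. In the sketch below, stream handles are stood in by strings, and `pickChannel` is an invented name, but the decision mirrors the two cases above:

```typescript
// Streams are stood in by string handles; `pickChannel` is an invented name.
type OutboundMessage = { id?: number; relatedRequestId?: number; method?: string };

function pickChannel(
  message: OutboundMessage,
  standaloneSseStream: string,
  requestToStream: Map<number, string>,
): string | undefined {
  // Case 2: a response (it has an id) or a request-scoped notification
  // (it has a relatedRequestId) goes back on its originating POST's stream.
  const key = message.relatedRequestId ?? message.id;
  if (key !== undefined) return requestToStream.get(key);
  // Case 1: everything else is a general announcement for the GET stream.
  return standaloneSseStream;
}
```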
Choose Your Weapon: Streaming SSE vs. Synchronous JSON
The server can respond in two ways, controlled by the `enableJsonResponse` option.
-   Streaming SSE (default): When a `POST` arrives, the server immediately sends back `200 OK` with a `text/event-stream` content type. The connection is now an open stream, and the server can send events over it as they become available.
-   Synchronous JSON: If `enableJsonResponse` is `true`, the server holds its horses. It doesn't send any response right away. It buffers all the results for the request batch in memory. Only when the entire batch is complete does it send a single `200 OK` with an `application/json` content type and the full JSON payload. This is perfect for simple tools where streaming is unnecessary.
  
  
Picking Up Where You Left Off: The EventStore
The server's half of the resumability feature is powered by the `EventStore` interface.
-   Storing Events: In the `send()` method, if an event store is configured, the server first calls `eventStore.storeEvent()`. This saves the message and returns a unique `eventId`. This ID is then sent as the `id:` field in the SSE message to the client.
-   Replaying Events: When a client reconnects with a `last-event-id` header, the server's `handleGetRequest` method catches it. It then calls `eventStore.replayEventsAfter()`, which fetches all the messages the client missed and sends them down the new connection, seamlessly restoring the client's state.
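A minimal in-memory version of that role might look like the following. The method names `storeEvent` and `replayEventsAfter` follow the description above, but the storage layout and exact signatures here are assumptions for illustration — a production store would persist events durably:

```typescript
// A minimal in-memory sketch of the EventStore role; signatures are assumed.
class InMemoryEventStore {
  private events: { eventId: string; message: unknown }[] = [];
  private counter = 0;

  // Persist a message and hand back the ID that becomes the SSE `id:` field.
  storeEvent(message: unknown): string {
    const eventId = String(++this.counter);
    this.events.push({ eventId, message });
    return eventId;
  }

  // On reconnect, replay everything after the client's last-event-id.
  replayEventsAfter(
    lastEventId: string,
    send: (eventId: string, message: unknown) => void,
  ): void {
    const index = this.events.findIndex((e) => e.eventId === lastEventId);
    // If the token is unknown (index -1), this sketch replays from the start.
    for (const e of this.events.slice(index + 1)) send(e.eventId, e.message);
  }
}
```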
Part 4: Putting It All to Work: Practical Scenarios
So, what can you actually build with this?
- Long-Running AI Tools: Imagine you're building a "Research Agent" tool. The user gives it a topic. The `POST` request is sent. The server can now stream back updates on the dedicated response stream: `{"status": "Searching web..."}`, `{"status": "Found 10 sources, summarizing..."}`, `{"status": "Generating report..."}`, followed by the final text. It's a long task made interactive.
- Interactive User Input (Elicitation): Your AI needs the user's permission to access a file. It can send an `elicitInput` request over the general announcement (`GET`) channel. Your client app sees this, pops up a native "Allow Access?" dialog, and sends the yes/no answer back to the server. This is a fluid, two-way conversation.
- Real-Time Dashboards: Imagine a server monitoring system resources. The server can have multiple client dashboards connected via the `GET` stream. Whenever CPU usage changes, the server just `send()`s a `cpu_usage_changed` notification, and all connected dashboards update in real-time.
SSE vs. Streamable HTTP: An Evolution in Design
You've almost certainly encountered Server-Sent Events (SSE). It's a fantastic, simple technology for pushing data from a server to a client. But the Streamable HTTP protocol looks like SSE and smells like SSE... yet it's not quite the same. So, are they the same thing? Is one better? Why is this new protocol necessary?
This section clears up that confusion. We'll explore how Streamable HTTP evolves the concepts of SSE to create a more powerful, robust, and truly bidirectional communication channel over standard HTTP.
The TL;DR: Two Phones vs. One Smartphone
Before we dive into the technical details, let's start with a simple analogy that captures the core difference.
Classic SSE (+ separate POSTs) is like using two separate, old-school phones:
-   You have a landline phone (`GET`) that can only receive calls. The server holds this line open to talk to you whenever it wants.
-   You have a payphone (`POST`) that can only make calls. Every time you want to say something to the server, you have to go to the payphone, make a call, say your piece, and hang up.

This system is often asymmetric and requires extra work to correlate the incoming calls on the landline with the outgoing calls from the payphone.

Streamable HTTP is like a modern smartphone call:
-   You make a single call (`POST`).
-   On this one call, you can both talk to the server (by sending your request) and the server can talk back to you continuously (by streaming a response). It can even send you "text messages" (progress updates) during the call without interrupting the main conversation.
-   You also have the option of opening a separate, "listen-only" channel (`GET`), like putting the server on speakerphone for background announcements, but it's not required for a two-way conversation.
This analogy captures the essence: classic SSE setups require two separate, asymmetric channels to achieve two-way communication, while the Streamable HTTP protocol can unify this into a single, more powerful HTTP transaction.
A Feature-by-Feature Protocol Showdown
Here, we'll break down the core concepts of real-time communication and compare how each protocol handles them.
1. The Connection & Communication Model
This is the most fundamental difference and the source of most of the architectural changes.
| Attribute | "Classic" SSE-based Approach | Streamable HTTP | 
|---|---|---|
| Primary Channel(s) | Two separate channels: 1. A persistent `GET` for server-to-client messages. 2. Separate, transient `POST`s for client-to-server messages. | Unified hybrid channel: A single `POST` can handle both the client's request and a streaming server-to-client response. A separate `GET` channel is optional for unsolicited server events. |
| The Handshake | Often ad-hoc & asymmetric. For example, the client connects via `GET`, then must wait for a custom event from the server to learn where to send its `POST`s. | Implicit & flexible. The client sends an `initialize` `POST`. The server's response (`202 Accepted` or `200 OK`) dictates the next step. No custom handshake event is needed. |
| Flexibility | Rigid. The two-channel model is the only way it operates. | Highly flexible. A server can choose to respond to a `POST` with a single JSON object (classic RPC) or a full event stream, depending on the nature of the request. |
The key innovation here is that Streamable HTTP allows the response to a POST request to be, itself, a stream. This turns a traditionally one-shot request into a long-lived conversation scoped to a single transaction.
2. Session & State Management
How do the client and server keep track of who they're talking to?
| Attribute | "Classic" SSE-based Approach | Streamable HTTP | 
|---|---|---|
| Session Initiation | Often handled via query parameters. The session ID might be created by the server and sent back in a URL within a custom event. | Session ID is created by the server and sent back in a dedicated HTTP header (`mcp-session-id`). |
| Session Tracking | The client must parse the session ID and manually add it to subsequent `POST`s. The server needs an application-level map to link the `POST` back to the original `GET` stream. | The client simply reads the `mcp-session-id` header and adds it to all subsequent requests. The transport layer can handle the session mapping more cleanly. |
The key takeaway here is that Streamable HTTP uses standard HTTP mechanisms (headers) for state management, which is cleaner and less burdensome on the application developer compared to ad-hoc solutions using query parameters and custom events.
3. Resumability & Reliability
What happens when your mobile network drops mid-request? This is where Streamable HTTP truly shines.
| Attribute | "Classic" SSE-based Approach | Streamable HTTP | 
|---|---|---|
| Connection Resumption | Not natively supported. The SSE standard itself defines a `Last-Event-ID` header, but a full protocol for replaying missed events across both `GET` and `POST` channels is not defined. If the `GET` stream is dropped, the client must typically start over. | First-class feature. This is one of the primary reasons for the protocol's existence. |
| Mechanism | N/A | Token-based: 1. The server sends an `id:` field with each SSE event; this is the resumption token. 2. The client persists the last seen token. 3. On reconnect, the client sends a `last-event-id` HTTP header. 4. The server uses a persistent `EventStore` to replay any missed messages. |
| Server-Side Requirement | N/A | Requires a pluggable `EventStore` component on the server to persist message history for replay, making the system fault-tolerant. |
This makes applications built on Streamable HTTP incredibly resilient to the transient network issues common on mobile and unreliable networks.
4. Key Features & Use Cases
What kind of applications are each of these protocols best suited for?
| Attribute | "Classic" SSE-based Approach | Streamable HTTP | 
|---|---|---|
| Progress Updates | Clunky. The server can send notifications on the `GET` stream, but they aren't directly tied to the `POST` request that initiated the task. Correlating them requires extra logic. | Seamless. A long-running tool call is made via `POST`. The server can stream progress updates back in the response body of that same `POST`, keeping everything neatly scoped to a single transaction. |
| Interactive Elicitation | Possible, but awkward. The server would send a request on the `GET` stream. The client would respond with a new `POST`. The server then has to correlate that `POST` with its original request. | Natural. This is a core use case. The server can send a request on the optional standalone `GET` stream at any time, enabling true, back-and-forth conversational AI. |
| Ideal Use Case | Simple, one-way server-to-client notification systems (e.g., "A new article was posted!", stock tickers). | Complex, stateful, interactive applications (e.g., AI agents, long-running data processing tools, real-time collaborative dashboards). |
The journey from a classic SSE-based architecture to the Streamable HTTP protocol is a perfect case study in software evolution. The classic approach is a clever solution that works, but it has architectural seams—the need for application-level session mapping, the lack of built-in resumability, and the clunky correlation of requests and responses.
The Streamable HTTP protocol is the direct result of learning from those seams. It re-imagines the flow to be more aligned with the nature of HTTP, creating a unified, more powerful, and vastly more resilient system.
I know that was a lot to take in, but hopefully, this deep dive has demystified the magic behind the MCP Streamable HTTP protocol. By understanding the code and the design choices, you're now equipped to leverage its full power.
Let me know your thoughts or questions in the comments below. Happy coding!
 

 
    