If you've ever tried to build a truly interactive application, especially one that talks to a modern AI, you know the struggle. Simple request-response cycles feel clunky for long-running tasks. WebSockets are powerful but can be overkill and a headache to manage. So, how do you build something that feels as fluid and responsive as a native app, but over plain old HTTP?
Today, we're going on a deep, deep dive into a protocol designed to solve this exact problem: the MCP Streamable HTTP protocol. We're not just scratching the surface. We're going to tear down the client and server implementations from the official TypeScript SDK, look at the code, and understand the "why" behind every design choice.
By the end of this (admittedly very long) post, you'll have a rock-solid understanding of how to build robust, stateful, and even fault-tolerant applications on top of HTTP. Let's get started!
Part 1: The Big Picture - Why This Protocol Even Exists
The Philosophy: More Than Just Requests and Responses
At its heart, the Streamable HTTP protocol is built on a clever, dual-channel model that uses standard HTTP methods in unconventional ways. Think of it like a restaurant's communication system:
- The Command Channel (HTTP `POST` 🗣️): This is your direct line to the kitchen. You use it to place an order (send a request or notification). The magic here is that the waiter (the server) can talk back to you on that same line while your order is being prepared. They might give you progress updates ("The chef is searing the steak now!") or even partial results ("Here are your appetizers while you wait."). This is all handled within the response to your single `POST` request, which can itself be a stream of events.
- The Announcement Channel (HTTP `GET` 📢): This is the restaurant's PA system. You tune in once (by making a long-lived `GET` request) and then you can hear any general announcements from the restaurant ("The special for tonight is..."). These are unsolicited, server-initiated events that aren't tied to any specific order you placed.
This design gives us the best of both worlds: the familiar, direct nature of `POST` for commands, and the persistent, low-latency nature of a `GET`-based Server-Sent Events (SSE) stream for asynchronous updates. The entire system is brought to life by the implementations in `client/streamableHttp.ts` and `server/streamableHttp.ts`.
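Before we dig into the SDK, it helps to see what the two channels look like as raw HTTP. Here's a minimal sketch using plain `fetch`; the endpoint URL is a hypothetical placeholder, and the headers follow the protocol behavior described above.

```typescript
// A minimal sketch of the two channels with plain fetch.
// The endpoint URL below is a hypothetical placeholder.
const MCP_ENDPOINT = "https://example.com/mcp";

// Command channel: POST a JSON-RPC message. The response may be a single
// JSON body *or* a text/event-stream, at the server's discretion, so the
// client advertises that it can accept either.
const postResponse = await fetch(MCP_ENDPOINT, {
  method: "POST",
  headers: {
    "content-type": "application/json",
    accept: "application/json, text/event-stream",
  },
  body: JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/list" }),
});

// Announcement channel: a long-lived GET the server keeps open and
// writes unsolicited SSE events to.
const getResponse = await fetch(MCP_ENDPOINT, {
  method: "GET",
  headers: { accept: "text/event-stream" },
});
```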
The Brains of the Operation: The Abstract `Protocol` Class 🧠

Before we even get to the HTTP part, we need to understand the core logic layer: the abstract `Protocol` class found in `shared/protocol.ts`. Think of the HTTP transport layers as the plumbing (the pipes and wires), but this `Protocol` class is the brain that decides what flows through them. It handles the nitty-gritty of JSON-RPC 2.0 framing, request lifecycles, and reliability.
How Requests and Responses are Matched
When your application code calls `client.request(...)`, how does it know which response belongs to it, especially when multiple requests are happening at once?

It all starts with a unique ID. The `Protocol` class maintains a counter, `_requestMessageId`, and assigns a new ID to every outgoing request. It then creates a `Promise` and cleverly stores its `resolve` and `reject` functions in a `Map` called `_responseHandlers`, using the message ID as the key.
Here's that critical piece of code. It's the moment the client makes a promise it intends to keep.
```typescript
// From: @modelcontextprotocol/typescript-sdk/src/shared/protocol.ts
// The client is setting a trap. It's saying, "When a response with `messageId`
// arrives, execute this function to either resolve or reject my promise."
this._responseHandlers.set(messageId, (response) => {
  // First, check if the request was already cancelled by our side.
  if (options?.signal?.aborted) {
    return;
  }

  // If the response is an error, reject the promise.
  if (response instanceof Error) {
    return reject(response);
  }

  // If it's a success, parse the result against the expected schema and resolve!
  try {
    const result = resultSchema.parse(response.result);
    resolve(result);
  } catch (error) {
    // If the server's response doesn't match our expected shape, that's an error too.
    reject(error);
  }
});
```
When a message arrives from the server, the transport's `onmessage` handler passes it to the `Protocol` class, which acts as a triage nurse. If the message has a `result` or `error` field, it knows it's a response and calls `_onresponse`. This function is the other half of the trap: it grabs the ID from the response, finds the corresponding handler in `_responseHandlers`, and springs it, fulfilling the promise.
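In simplified form, that triage logic looks something like the sketch below. This is not the SDK's exact code; the `triage` function and message type are illustrative stand-ins.

```typescript
// A simplified sketch of the Protocol class's triage step (illustrative only).
type JsonRpcMessage = {
  id?: number | string;
  method?: string;
  result?: unknown;
  error?: { code: number; message: string };
};

function triage(
  message: JsonRpcMessage,
  responseHandlers: Map<number | string, (msg: JsonRpcMessage | Error) => void>
): void {
  if (message.result !== undefined || message.error !== undefined) {
    // It's a response: find the waiting handler by ID and spring the trap.
    const handler = responseHandlers.get(message.id!);
    responseHandlers.delete(message.id!);
    handler?.(message.error ? new Error(message.error.message) : message);
  } else if (message.id !== undefined) {
    // It has an ID but no result/error, so it's an incoming request.
    console.log(`incoming request: ${message.method}`);
  } else {
    // No ID at all: a fire-and-forget notification.
    console.log(`incoming notification: ${message.method}`);
  }
}
```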
Handling In-Flight Cancellations Gracefully
What if a user gets impatient and wants to cancel a long-running operation? The protocol has a clean way to handle this using the standard `AbortSignal` (a client-side sketch follows the list):

- The client application triggers an `AbortSignal`.
- The `request()` method catches this, rejects its promise locally, and, crucially, sends a `notifications/cancelled` message to the server.
- The server's `Protocol` instance has a pre-registered handler specifically for this notification. This handler looks up the task's `AbortController` (which it stored when the request first arrived) and calls `.abort()`, signaling the server-side code to stop its work.
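From the application's point of view, all of this is triggered by ordinary `AbortController` usage. A minimal sketch, assuming an already-connected `Client` instance; the tool name `slow-research` is a placeholder:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { CallToolResultSchema } from "@modelcontextprotocol/sdk/types.js";

// A sketch of cancelling an in-flight request. The tool name is hypothetical.
async function runCancellable(client: Client) {
  const controller = new AbortController();

  // Cancel after 5 seconds (or call abort() from a "Cancel" button handler).
  const timer = setTimeout(() => controller.abort("user got impatient"), 5_000);

  try {
    // Passing the signal lets request() reject locally *and* send
    // notifications/cancelled to the server.
    return await client.request(
      { method: "tools/call", params: { name: "slow-research", arguments: {} } },
      CallToolResultSchema,
      { signal: controller.signal }
    );
  } finally {
    clearTimeout(timer);
  }
}
```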
Keeping Long-Running Tasks Alive with Timeouts
To prevent requests from hanging forever, the Protocol
class has a smart timeout system. When a request is made, it starts a timer. The real magic, however, is in the resetTimeoutOnProgress
option. For a long AI task, you don't want it to time out just because it's taking a while. If this option is true
, every time the server sends a progress notification, the client resets the timeout timer. This ensures that as long as the server is showing signs of life, the client will wait patiently.
```typescript
// From: @modelcontextprotocol/typescript-sdk/src/shared/protocol.ts
// This method is called when a progress notification arrives.
private _resetTimeout(messageId: number): boolean {
  const info = this._timeoutInfo.get(messageId);
  if (!info) return false;

  // It even checks against a `maxTotalTimeout` so it can't be extended forever.
  const totalElapsed = Date.now() - info.startTime;
  if (info.maxTotalTimeout && totalElapsed >= info.maxTotalTimeout) {
    this._timeoutInfo.delete(messageId);
    throw new McpError(/* ... */);
  }

  // Clear the old timer and start a new one!
  clearTimeout(info.timeoutId);
  info.timeoutId = setTimeout(info.onTimeout, info.timeout);
  return true;
}
```
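From the caller's side, opting in looks roughly like this (a sketch; the timeout values are arbitrary, and `client` is the connected instance from the cancellation sketch above):

```typescript
// A sketch of the timeout-related request options (values are arbitrary).
const result = await client.request(
  { method: "tools/call", params: { name: "research-agent", arguments: { topic: "MCP" } } },
  CallToolResultSchema,
  {
    timeout: 30_000,              // fail if the server is silent for 30s...
    resetTimeoutOnProgress: true, // ...but each progress notification resets the clock,
    maxTotalTimeout: 600_000,     // up to a hard ceiling of 10 minutes total.
    onprogress: (p) => console.log(`progress: ${p.progress}/${p.total ?? "?"}`),
  }
);
```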
Part 2: The Client's Perspective - Making the Connection
Now let's dive into the concrete client implementation in `client/streamableHttp.ts`.
The Client Handshake: Connection, Init, and Auth
A client's journey to a full connection is a precise dance:

- The `initialize` `POST`: The first thing a client does is `POST` an `initialize` message. This is the formal handshake where the client tells the server who it is and what it can do.
- The `202 Accepted` Trigger: When the client follows up with its `notifications/initialized` notification, the server responds with an HTTP `202 Accepted`. This is the signal! The client's `send()` method sees this and immediately knows it's time to open the second channel.
- The Asynchronous `GET`: The client immediately calls `_startOrAuthSse()`, which fires off a long-lived `GET` request with an `Accept: text/event-stream` header. This is the client opening its ear for the server's PA system. If the server doesn't support this (and returns a `405 Method Not Allowed`), the client gracefully carries on without it.
- The Auth Flow: If at any point the server responds with `401 Unauthorized`, the client's `authProvider` kicks in. It might try to refresh a token, or if it has no credentials, it will trigger `redirectToAuthorization`, sending the user off to log in. Once they return, the application calls `finishAuth()` to complete the OAuth2 flow and get the tokens needed to retry the connection.
The Client's Gateway: A Forensic Look at the `send()` Method
Every single message the client sends goes through the `send()` method via `POST`. The true genius of the client is how it interprets the response to this `POST`.

- If status is `202 Accepted`: This is the "message received, thanks" signal. If the message was the `initialized` notification, this is the cue to start the SSE `GET` stream, as we saw above.
- If status is `200 OK` and Content-Type is `application/json`: This is a simple, synchronous-style response. The client parses the JSON and is done with this transaction.
- If status is `200 OK` and Content-Type is `text/event-stream`: This is where it gets really cool. The `POST` request's response itself is a stream. The client pipes this stream into `_handleSseStream` to process the progress updates and final result for that specific request.

This logic is the heart of the client's flexibility.
```typescript
// From: @modelcontextprotocol/typescript-sdk/src/client/streamableHttp.ts
// This block in send() decides what to do based on the server's response.
if (response.status === 202) {
  // If the server accepted our initialization...
  if (isInitializedNotification(message)) {
    // ...it's time to open the general announcement (GET) channel!
    this._startOrAuthSse({ resumptionToken: undefined }).catch(err => this.onerror?.(err));
  }
  return;
}

const contentType = response.headers.get("content-type");
if (hasRequests) {
  if (contentType?.includes("text/event-stream")) {
    // The POST response is a stream! Handle it accordingly.
    this._handleSseStream(response.body, { onresumptiontoken }, false);
  } else if (contentType?.includes("application/json")) {
    // The POST response is a simple JSON object. Parse it.
    const data = await response.json();
    // ... process data ...
  }
}
```
From Bytes to Messages: Parsing Streams with `_handleSseStream`

This method is the designated parser for all SSE streams, whether from the main `GET` or a streaming `POST`. It sets up a beautiful, modern stream processing pipeline:

Raw Bytes (`ReadableStream<Uint8Array>`) → Decoded Text (`TextDecoderStream`) → Parsed Events (`EventSourceParserStream`)

It then reads from the end of this pipeline, taking the `event.data` (which is the JSON payload), parsing it, and passing it back to the `Protocol` layer's main `onmessage` callback for routing. Simple, efficient, and non-blocking.
```typescript
// From: @modelcontextprotocol/typescript-sdk/src/client/streamableHttp.ts
// This is a masterclass in modern stream processing in JavaScript.
const reader = stream
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(new EventSourceParserStream())
  .getReader();
```
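Downstream of that pipeline, the read loop is conceptually just a matter of pulling parsed events and forwarding their payloads. A simplified sketch (not the SDK's exact code):

```typescript
import { EventSourceParserStream } from "eventsource-parser/stream";

// A simplified sketch of consuming parsed SSE events (not the SDK's exact code).
async function readSseMessages(
  stream: ReadableStream<Uint8Array>,
  onmessage: (msg: unknown) => void,
  onresumptiontoken?: (token: string) => void
): Promise<void> {
  const reader = stream
    .pipeThrough(new TextDecoderStream())
    .pipeThrough(new EventSourceParserStream())
    .getReader();

  while (true) {
    const { done, value: event } = await reader.read();
    if (done) break;

    // An event with an `id` field doubles as a resumption token.
    if (event.id) onresumptiontoken?.(event.id);

    // The data payload is a JSON-RPC message; hand it up to the Protocol layer.
    if (event.data) onmessage(JSON.parse(event.data));
  }
}
```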
Surviving the Chaos: Session Management and Resumability 🛡️
This is where the protocol truly shines, providing statefulness and recovery from network failures.
Session Management: The client grabs the `mcp-session-id` header from the very first response and stores it. From then on, every subsequent request includes this header, telling the server, "Hey, it's me again."

Connection Resumability: This is the critical flow for fault tolerance.

- Capture the Token: When handling a stream, if an SSE event has an `id` field, that's our resumption token! The client captures it as `lastEventId` and calls the `onresumptiontoken` callback so the application can save it somewhere safe (like `localStorage`).
- Detect the Disconnect: If the network drops, the stream will error out. The `catch` block in `_handleSseStream` is triggered.
- Schedule a Reconnect: Instead of giving up, the `catch` block calls `_scheduleReconnection`, which uses an exponential backoff delay to plan its next attempt.
- Attempt Resumption: After the delay, it calls `_startOrAuthSse` again, but this time it passes in the `lastEventId` it saved.
- Send the Magic Header: `_startOrAuthSse` then creates a new `GET` request, but with a special `last-event-id` header containing the token. This tells the server exactly where the client left off, allowing it to replay any missed messages.
It's a complete, closed-loop system for recovering from connection failures.
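On the application side, participating in that loop is mostly a matter of persisting the token and handing it back. A browser-flavored sketch, assuming the request options accept `resumptionToken`/`onresumptiontoken` as described above; the `localStorage` key is a hypothetical choice:

```typescript
// A sketch of persisting and reusing the resumption token in a browser.
// Assumes the request options accept resumptionToken/onresumptiontoken;
// the localStorage key is a hypothetical choice.
const STORAGE_KEY = "mcp-last-event-id";

const result = await client.request(
  { method: "tools/call", params: { name: "research-agent", arguments: { topic: "MCP" } } },
  CallToolResultSchema,
  {
    // Resume from wherever we left off if a previous stream died mid-flight.
    resumptionToken: localStorage.getItem(STORAGE_KEY) ?? undefined,
    // Save each new token so a future reconnect can pick up from here.
    onresumptiontoken: (token) => localStorage.setItem(STORAGE_KEY, token),
  }
);
```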
Part 3: The Server's Side of the Story
Now let's cross over to the other side of the conversation and look at the server implementation in `server/streamableHttp.ts`.
The Server's Front Door: `handleRequest` and the Transport Lifecycle

The `simpleStreamableHttp.ts` example server shows a beautiful pattern for managing stateful connections. It maintains a global `transports` map.

- When a request comes in, it checks for an `mcp-session-id` header.
- If the ID exists in the map, it reuses the existing `StreamableHTTPServerTransport` instance for that session. State is maintained!
- If there's no ID but the message is `initialize`, it knows a new client is connecting. It creates a new transport instance. The key is the `onsessioninitialized` callback: once the new transport generates its session ID, this callback fires and saves the new transport into the global map.
This logic is the core of how the server manages multiple, distinct client sessions concurrently.
```typescript
// From: @modelcontextprotocol/typescript-sdk/src/examples/server/simpleStreamableHttp.ts
// This logic from the example server is the key to stateful session management.
if (sessionId && transports[sessionId]) {
  // Found an existing session, let's reuse its transport.
  transport = transports[sessionId];
} else if (!sessionId && isInitializeRequest(req.body)) {
  // A new client is initializing!
  transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: () => randomUUID(),
    // This callback is the magic glue. It links the new session ID to its transport instance.
    onsessioninitialized: (newlyCreatedSessionId) => {
      console.log(`Session initialized with ID: ${newlyCreatedSessionId}`);
      transports[newlyCreatedSessionId] = transport;
    }
  });
  // ...
}
```
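Once the right transport is in hand, the example connects it to the MCP server and lets it drive the HTTP exchange, roughly like so (a condensed sketch of the example's Express handler, with `server`, `req`, and `res` in scope):

```typescript
// Attach the Protocol "brain" to this transport, then let the transport
// take over the request/response pair.
await server.connect(transport);
await transport.handleRequest(req, res, req.body);
```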
Intelligent Routing: The Server's `send()` Method

The server's `send()` method is the mirror image of the client's and is responsible for routing outbound messages to the correct channel. The deciding factor is the `relatedRequestId`.

- Case 1: General Announcement. If `send()` is called for a notification without a `relatedRequestId`, the server knows it's a general, server-initiated event. It looks up the `ServerResponse` object for the long-lived `GET` stream and writes the message there.
- Case 2: Specific Response. If `send()` is called for a message that is a response (it has an `id`) or has a `relatedRequestId`, the server knows it belongs to a specific `POST` transaction. It uses its internal mappings (`_requestToStreamMapping`) to find the exact `ServerResponse` object associated with that original `POST` and writes the message to that dedicated stream.
This ensures that progress updates for Tool A don't accidentally get sent to the response stream for Tool B.
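In sketch form, the routing decision is a single branch on the correlation ID (simplified; not the SDK's exact code):

```typescript
import type { ServerResponse } from "node:http";

// A simplified sketch of the server's outbound routing (illustrative only).
type OutboundMessage = { id?: number | string };

function routeOutbound(
  message: OutboundMessage,
  relatedRequestId: number | string | undefined,
  standaloneGetStream: ServerResponse | undefined,
  requestToStream: Map<number | string, ServerResponse>
): void {
  const frame = `data: ${JSON.stringify(message)}\n\n`;
  const targetId = message.id ?? relatedRequestId;

  if (targetId === undefined) {
    // No correlation: a general announcement for the standalone GET stream.
    standaloneGetStream?.write(frame);
  } else {
    // Correlated to a specific POST: write to that transaction's own stream.
    requestToStream.get(targetId)?.write(frame);
  }
}
```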
Choose Your Weapon: Streaming SSE vs. Synchronous JSON
The server can respond in two ways, controlled by the `enableJsonResponse` option.

- Streaming SSE (default): When a `POST` arrives, the server immediately sends back `200 OK` with a `text/event-stream` content type. The connection is now an open stream, and the server can send events over it as they become available.
- Synchronous JSON: If `enableJsonResponse` is `true`, the server holds its horses. It doesn't send any response right away. It buffers all the results for the request batch in memory. Only when the entire batch is complete does it send a single `200 OK` with an `application/json` content type and the full JSON payload. This is perfect for simple tools where streaming is unnecessary.
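Switching modes is a one-line choice when constructing the transport:

```typescript
import { randomUUID } from "node:crypto";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

// Same transport, synchronous-JSON flavor: results are buffered and sent
// as one application/json body instead of an SSE stream.
const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: () => randomUUID(),
  enableJsonResponse: true,
});
```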
Picking Up Where You Left Off: The `EventStore`

The server's half of the resumability feature is powered by the `EventStore` interface.

- Storing Events: In the `send()` method, if an event store is configured, the server first calls `eventStore.storeEvent()`. This saves the message and returns a unique `eventId`. This ID is then sent as the `id:` field in the SSE message to the client.
- Replaying Events: When a client reconnects with a `last-event-id` header, the server's `handleGetRequest` method catches it. It then calls `eventStore.replayEventsAfter()`, which fetches all the messages the client missed and sends them down the new connection, seamlessly restoring the client's state.
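The store itself is pluggable. Here's a minimal in-memory sketch, assuming the interface shape described above (store by stream, replay by last event ID); it's demo-only, since nothing survives a process restart:

```typescript
import { randomUUID } from "node:crypto";

// A minimal in-memory EventStore sketch (interface shape assumed as above).
type JSONRPCMessage = Record<string, unknown>;

class InMemoryEventStore {
  private events: Array<{ eventId: string; streamId: string; message: JSONRPCMessage }> = [];

  async storeEvent(streamId: string, message: JSONRPCMessage): Promise<string> {
    const eventId = `${streamId}_${randomUUID()}`; // tie the event to its stream
    this.events.push({ eventId, streamId, message });
    return eventId;
  }

  async replayEventsAfter(
    lastEventId: string,
    { send }: { send: (eventId: string, message: JSONRPCMessage) => Promise<void> }
  ): Promise<string> {
    const index = this.events.findIndex(e => e.eventId === lastEventId);
    const streamId = this.events[index]?.streamId ?? "";
    // Replay everything on the same stream that arrived after the last seen event.
    for (const e of this.events.slice(index + 1)) {
      if (e.streamId === streamId) await send(e.eventId, e.message);
    }
    return streamId;
  }
}
```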
Part 4: Putting It All to Work: Practical Scenarios
So, what can you actually build with this?
Long-Running AI Tools: Imagine you're building a "Research Agent" tool. The user gives it a topic. The `POST` request is sent. The server can now stream back updates on the dedicated response stream: `{"status": "Searching web..."}`, `{"status": "Found 10 sources, summarizing..."}`, `{"status": "Generating report..."}`, followed by the final text. It's a long task made interactive.

Interactive User Input (Elicitation): Your AI needs the user's permission to access a file. It can send an `elicitInput` request over the general announcement (`GET`) channel. Your client app sees this, pops up a native "Allow Access?" dialog, and sends the yes/no answer back to the server. This is a fluid, two-way conversation.

Real-Time Dashboards: Imagine a server monitoring system resources. The server can have multiple client dashboards connected via the `GET` stream. Whenever CPU usage changes, the server just `send()`s a `cpu_usage_changed` notification, and all connected dashboards update in real-time.
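To make the first scenario concrete, here's a hedged sketch of a server-side tool that streams progress. It assumes the handler's `extra` argument exposes `sendNotification` and the request's progress token; the tool name and stage strings are illustrative:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// A hedged sketch of a long-running tool that streams progress updates.
const server = new McpServer({ name: "research-server", version: "1.0.0" });

server.tool("research-agent", { topic: z.string() }, async ({ topic }, extra) => {
  const progressToken = extra._meta?.progressToken;
  const stages = ["Searching web...", "Found 10 sources, summarizing...", "Generating report..."];

  for (let i = 0; i < stages.length; i++) {
    // ... do a chunk of real work here ...
    if (progressToken !== undefined) {
      // Each update flows down the SSE stream of the originating POST.
      await extra.sendNotification({
        method: "notifications/progress",
        params: { progressToken, progress: i + 1, total: stages.length, message: stages[i] },
      });
    }
  }

  return { content: [{ type: "text" as const, text: `Report on ${topic}: ...` }] };
});
```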
SSE vs. Streamable HTTP: An Evolution in Design
You've almost certainly encountered Server-Sent Events (SSE). It's a fantastic, simple technology for pushing data from a server to a client. But the Streamable HTTP protocol looks like SSE and smells like SSE... yet it's not quite the same. So, are they the same thing? Is one better? Why is this new protocol necessary?
This section clears up that confusion. We'll explore how Streamable HTTP evolves the concepts of SSE to create a more powerful, robust, and truly bidirectional communication channel over standard HTTP.
The TL;DR: Two Phones vs. One Smartphone
Before we dive into the technical details, let's start with a simple analogy that captures the core difference.
- Classic SSE (+ separate POSTs) is like using two separate, old-school phones:
  - You have a landline phone (`GET`) that can only receive calls. The server holds this line open to talk to you whenever it wants.
  - You have a payphone (`POST`) that can only make calls. Every time you want to say something to the server, you have to go to the payphone, make a call, say your piece, and hang up.
  - This system is often asymmetric and requires extra work to correlate the incoming calls on the landline with the outgoing calls from the payphone.
- Streamable HTTP is like a modern smartphone call:
  - You make a single call (`POST`).
  - On this one call, you can both talk to the server (by sending your request) and the server can talk back to you continuously (by streaming a response). It can even send you "text messages" (progress updates) during the call without interrupting the main conversation.
  - You also have the option of opening a separate, "listen-only" channel (`GET`), like putting the server on speakerphone for background announcements, but it's not required for a two-way conversation.
This analogy captures the essence: classic SSE setups require two separate, asymmetric channels to achieve two-way communication, while the Streamable HTTP protocol can unify this into a single, more powerful HTTP transaction.
A Feature-by-Feature Protocol Showdown
Here, we'll break down the core concepts of real-time communication and compare how each protocol handles them.
1. The Connection & Communication Model
This is the most fundamental difference and the source of most of the architectural changes.
| Attribute | "Classic" SSE-based Approach | Streamable HTTP |
|---|---|---|
| Primary Channel(s) | Two separate channels: 1. A persistent `GET` for server-to-client messages. 2. Separate, transient `POST`s for client-to-server messages. | Unified hybrid channel: A single `POST` can handle both the client's request and a streaming server-to-client response. A separate `GET` channel is optional for unsolicited server events. |
| The Handshake | Often ad-hoc & asymmetric. For example, the client connects via `GET`, then must wait for a custom event from the server to learn where to send its `POST`s. | Implicit & flexible. The client sends an `initialize` `POST`. The server's response (`202 Accepted` or `200 OK`) dictates the next step. No custom handshake event is needed. |
| Flexibility | Rigid. The two-channel model is the only way it operates. | Highly flexible. A server can choose to respond to a `POST` with a single JSON object (classic RPC) or a full event stream, depending on the nature of the request. |
The key innovation here is that Streamable HTTP allows the response to a `POST` request to be, itself, a stream. This turns a traditionally one-shot request into a long-lived conversation scoped to a single transaction.
2. Session & State Management
How do the client and server keep track of who they're talking to?
| Attribute | "Classic" SSE-based Approach | Streamable HTTP |
|---|---|---|
| Session Initiation | Often handled via query parameters. The session ID might be created by the server and sent back in a URL within a custom event. | Session ID is created by the server and sent back in a dedicated HTTP header (`mcp-session-id`). |
| Session Tracking | The client must parse the session ID and manually add it to subsequent `POST`s. The server needs an application-level map to link the `POST` back to the original `GET` stream. | The client simply reads the `mcp-session-id` header and adds it to all subsequent requests. The transport layer can handle the session mapping more cleanly. |
The key takeaway here is that Streamable HTTP uses standard HTTP mechanisms (headers) for state management, which is cleaner and less burdensome on the application developer compared to ad-hoc solutions using query parameters and custom events.
3. Resumability & Reliability
What happens when your mobile network drops mid-request? This is where Streamable HTTP truly shines.
| Attribute | "Classic" SSE-based Approach | Streamable HTTP |
|---|---|---|
| Connection Resumption | Not natively supported. The SSE standard itself has a `Last-Event-ID` header, but a full protocol for replaying missed events across both `GET` and `POST` channels is not defined. If the `GET` stream is dropped, the client must typically start over. | First-class feature. This is one of the primary reasons for the protocol's existence. |
| Mechanism | N/A | Token-based. 1. Server sends an `id:` field with each SSE event. This is the resumption token. 2. Client persists the last seen token. 3. On reconnect, the client sends a `last-event-id` HTTP header. 4. Server uses a persistent `EventStore` to replay any missed messages. |
| Server-Side Requirement | N/A | Requires a pluggable `EventStore` component on the server to persist message history for replay, making the system fault-tolerant. |
This makes applications built on Streamable HTTP incredibly resilient to the transient network issues common on mobile and unreliable networks.
4. Key Features & Use Cases
What kind of applications are each of these protocols best suited for?
| Attribute | "Classic" SSE-based Approach | Streamable HTTP |
|---|---|---|
| Progress Updates | Clunky. The server can send notifications on the `GET` stream, but they aren't directly tied to the `POST` request that initiated the task. Correlating them requires extra logic. | Seamless. A long-running tool call is made via `POST`. The server can stream progress updates back in the response body of that same `POST`, keeping everything neatly scoped to a single transaction. |
| Interactive Elicitation | Possible, but awkward. The server would send a request on the `GET` stream. The client would respond with a new `POST`. The server then has to correlate that `POST` with its original request. | Natural. This is a core use case. The server can send a request on the optional standalone `GET` stream at any time, enabling true, back-and-forth conversational AI. |
| Ideal Use Case | Simple, one-way server-to-client notification systems (e.g., "A new article was posted!", stock tickers). | Complex, stateful, interactive applications (e.g., AI agents, long-running data processing tools, real-time collaborative dashboards). |
The journey from a classic SSE-based architecture to the Streamable HTTP protocol is a perfect case study in software evolution. The classic approach is a clever solution that works, but it has architectural seams—the need for application-level session mapping, the lack of built-in resumability, and the clunky correlation of requests and responses.
The Streamable HTTP protocol is the direct result of learning from those seams. It re-imagines the flow to be more aligned with the nature of HTTP, creating a unified, more powerful, and vastly more resilient system.
I know that was a lot to take in, but hopefully, this deep dive has demystified the magic behind the MCP Streamable HTTP protocol. By understanding the code and the design choices, you're now equipped to leverage its full power.
Let me know your thoughts or questions in the comments below. Happy coding!