DEV Community

Zrcic


Deep Dive: SSE vs HTTP Streamable — What's the Difference?

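You might be wondering: "Both SSE and HTTP Streamable work over HTTP, so why can't SSE do everything HTTP Streamable does?" Great question. The answer lies in how they manage connections and state. (The MCP specification names the newer transport "Streamable HTTP"; it is the same transport this article calls HTTP Streamable.)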

This is a companion article to "Understanding MCP Server Transports: STDIO, SSE, and HTTP Streamable". If you're new to MCP transports, start there first.

How SSE Works Under the Hood

SSE uses a persistent, long-lived connection:

1. Client opens connection    GET /sse HTTP/1.1
                              Accept: text/event-stream

2. Server keeps it open       HTTP/1.1 200 OK
                              Content-Type: text/event-stream

                              data: {"event": "connected"}

                              data: {"event": "update"}

                              ... connection stays open ...

3. Client sends requests      POST /messages?sessionId=abc123
   on separate channel        {"method": "tools/call", ...}

The key characteristics:

  • Connection-based: Server maintains an open connection for each client
  • Stateful by nature: Server must track each connection in memory
  • Two channels: SSE stream for server→client, POST requests for client→server
  • Session affinity required: The client's POST requests must reach the same server instance holding their SSE connection
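The two-channel flow above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the MCP SDK: `connections`, `registerConnection`, and `handlePost` are hypothetical names, the `write` callback stands in for writing to the open HTTP response, and the status codes are illustrative. It shows why a POST has to reach the instance that holds the stream.

```javascript
// Hypothetical in-memory registry: sessionId -> function that writes to that client's open SSE stream.
const connections = new Map();

// Serialize one message as an SSE frame: a "data:" line followed by a blank line.
function formatSseEvent(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Called when a client opens GET /sse; `write` stands in for writing to the open response.
function registerConnection(sessionId, write) {
  connections.set(sessionId, write);
  write(formatSseEvent({ event: "connected" }));
}

// Called when a client POSTs to /messages?sessionId=...; the reply travels back on the SSE stream.
function handlePost(sessionId, message) {
  const write = connections.get(sessionId);
  if (!write) {
    // This instance does not hold the stream, so it has no way to deliver the reply.
    return { status: 404, error: "unknown session on this instance" };
  }
  write(formatSseEvent({ result: `handled ${message.method}` }));
  return { status: 202 };
}

// Usage: one client connects, then posts a request ("tools/call" is MCP's method for invoking a tool).
const sent = [];
registerConnection("abc123", (frame) => sent.push(frame));
handlePost("abc123", { method: "tools/call" });
```

Because `connections` lives in one process's memory, a second server instance would take the 404 branch for the same session ID, which is the session-affinity requirement in practice.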

How HTTP Streamable Works Under the Hood

HTTP Streamable uses standard request/response with optional streaming:

1. Client sends request       POST /mcp HTTP/1.1
                              Content-Type: application/json
                              Accept: application/json, text/event-stream
                              Mcp-Session-Id: abc123

                              {"method": "tools/call", ...}

2. Server responds            HTTP/1.1 200 OK
   (can stream if needed)     Content-Type: application/json
                              (or text/event-stream when streaming)

                              {"result": ...}

3. Connection closes          (or streams multiple responses)

The key characteristics:

  • Request/response based: Each interaction is a complete HTTP request
  • Stateless capable: Server doesn't need to hold connections open
  • Single channel: Everything goes through the same endpoint
  • No affinity required: Any server instance can handle any request
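A stateless handler for this model might look roughly like the sketch below. It is an illustration only: `tools` and `handleMcpPost` are hypothetical names, the `add` tool is invented for the example, and the result shape is a simplified version of what an MCP server returns. The function keeps nothing between calls, which is why any instance can serve any request.

```javascript
// Hypothetical tool table; a real server would register its own tools.
const tools = new Map([["add", ({ a, b }) => a + b]]);

// Handle one JSON-RPC request body and describe the HTTP response to send.
// No state survives between calls, so any server instance could run this.
function handleMcpPost(body) {
  const request = JSON.parse(body);
  const reply = (payload) => ({
    status: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ jsonrpc: "2.0", id: request.id, ...payload }),
  });

  if (request.method !== "tools/call") {
    // -32601 is the JSON-RPC 2.0 code for "Method not found".
    return reply({ error: { code: -32601, message: "Method not found" } });
  }
  const tool = tools.get(request.params?.name);
  if (!tool) {
    // -32602 is the JSON-RPC 2.0 code for "Invalid params".
    return reply({ error: { code: -32602, message: "Unknown tool" } });
  }
  const value = tool(request.params.arguments ?? {});
  return reply({ result: { content: [{ type: "text", text: String(value) }] } });
}

// Usage: a single request in, a single response out.
const response = handleMcpPost(
  JSON.stringify({
    jsonrpc: "2.0",
    id: 1,
    method: "tools/call",
    params: { name: "add", arguments: { a: 2, b: 3 } },
  })
);
```

If a server does need per-session data, keeping it in a shared store keyed by `Mcp-Session-Id` rather than in process memory preserves this property.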

Why SSE Can't Match HTTP Streamable's Capabilities

1. The Persistent Connection Problem

SSE requires the server to maintain an open connection for every client. This creates real problems at scale:

SSE with 1,000 clients:
┌─────────────────────────────────────┐
│           SSE Server                │
│  ┌─────┐ ┌─────┐ ┌─────┐ ┌────────┐ │
│  │Conn1│ │Conn2│ │ ... │ │Conn1000│ │
│  └─────┘ └─────┘ └─────┘ └────────┘ │
│  Memory: ~1KB × 1000 = 1MB+         │
│  File descriptors: 1000             │
│  Cannot simply restart server       │
└─────────────────────────────────────┘

HTTP Streamable with 1,000 clients:
┌─────────────────────────────────────┐
│      HTTP Streamable Server         │
│                                     │
│  No persistent connections          │
│  Memory: only during requests       │
│  Can restart/deploy anytime         │
└─────────────────────────────────────┘

With SSE, if your server restarts (for a deployment, a crash, or a scaling event), all 1,000 clients lose their connections and must reconnect. With HTTP Streamable, clients simply send their next request to whichever instance is available. As long as any session state lives outside the restarted process, they don't even notice.
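When every SSE client reconnects at the same moment, the restarted server takes the full load at once. A common mitigation, not part of the original article, is exponential backoff with jitter on the client. The sketch below is one way to compute the delay; `reconnectDelayMs` and its default values are assumptions chosen for illustration.

```javascript
// Delay before reconnect attempt `attempt` (0-based): exponential growth, capped, with full jitter.
// `random` is injectable so the function can be exercised deterministically.
function reconnectDelayMs(attempt, { baseMs = 500, maxMs = 30_000, random = Math.random } = {}) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Usage: the delay ceiling for the first five attempts doubles each time (500, 1000, 2000, ...).
const delays = [0, 1, 2, 3, 4].map((attempt) => reconnectDelayMs(attempt));
```

Spreading reconnects across a random window keeps 1,000 clients from arriving in the same instant.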

2. The Load Balancing Problem

With SSE, you need "sticky sessions" — the load balancer must route each client to the same server:

SSE Load Balancing (Complex):
                         ┌─ Server A (holds Client 1's connection)
Client 1 ─► Load ────────┤
            Balancer     │  Must remember: Client 1 → Server A
            (sticky)     │                 Client 2 → Server B
Client 2 ─►─────────────┼─ Server B (holds Client 2's connection)

If Server A dies, Client 1 loses connection and must reconnect!

HTTP Streamable Load Balancing (Simple):
                         ┌─ Server A
Client 1 ─► Load ────────┤
            Balancer     │  Any request → Any server
            (stateless)  │
Client 2 ─►─────────────┼─ Server B

If Server A dies, next request just goes to Server B!
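The difference between the two routing strategies can be sketched as follows. Real load balancers often implement stickiness with cookies or lookup tables instead; hashing the session ID is just one simple approach, and `pickSticky` and `makeRoundRobin` are hypothetical names for this example.

```javascript
// Sticky routing: hash the session ID so the same client always lands on the same server.
function pickSticky(sessionId, servers) {
  let hash = 0;
  for (const ch of sessionId) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep the hash an unsigned 32-bit integer
  }
  return servers[hash % servers.length];
}

// Stateless routing: any server will do, so simple rotation is enough.
function makeRoundRobin(servers) {
  let next = 0;
  return () => servers[next++ % servers.length];
}

// Usage: the sticky choice depends on the server list, the round-robin choice does not care.
const servers = ["Server A", "Server B"];
const stickyTarget = pickSticky("abc123", servers);
const nextServer = makeRoundRobin(servers);
const firstThree = [nextServer(), nextServer(), nextServer()];
```

With the sticky version, removing a server changes `servers.length` and can remap existing sessions, which is the failure mode the SSE diagram describes. The round-robin version has no mapping to break.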

3. The Authentication Problem

SSE authentication is awkward because the EventSource API (used in browsers) doesn't support custom headers:

// SSE: Can't set Authorization header easily
const eventSource = new EventSource('/sse'); // No header support!

// Workarounds are hacky:
// - Pass token in URL: /sse?token=xyz (visible in logs, insecure)
// - Use cookies (but what about non-browser clients?)

HTTP Streamable uses standard HTTP requests where auth is straightforward:

// HTTP Streamable: Standard auth headers
fetch('/mcp', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer xyz',  // Clean and standard
    'Content-Type': 'application/json',
    'Accept': 'application/json, text/event-stream'
  },
  body: JSON.stringify({ /* ... */ })
});
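The header limitation belongs to the browser's `EventSource` API, not to the SSE wire format. One further workaround is to read the event stream with `fetch`, which does accept custom headers. The sketch below assumes a runtime with global `fetch` and `TextDecoder` (modern browsers, Node 18+); `splitEvents` and `streamWithAuth` are hypothetical helpers, and the parser is deliberately simplified (it handles only `\n` line endings and `data:` fields).

```javascript
// Split a text buffer into complete SSE events (separated by a blank line) plus the unfinished remainder.
function splitEvents(buffer) {
  const parts = buffer.split("\n\n");
  const rest = parts.pop(); // the last piece is incomplete until its blank line arrives
  const events = parts
    .map((block) =>
      block
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one optional space
        .join("\n")
    )
    .filter((data) => data.length > 0); // skip comment-only or empty blocks
  return { events, rest };
}

// Open an event stream with an Authorization header and pass each event's data to `onEvent`.
async function streamWithAuth(url, token, onEvent) {
  const response = await fetch(url, {
    headers: { Accept: "text/event-stream", Authorization: `Bearer ${token}` },
  });
  if (!response.ok || !response.body) {
    throw new Error(`stream request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { events, rest } = splitEvents(buffer);
    buffer = rest;
    events.forEach((data) => onEvent(data));
  }
}
```

This trades the automatic reconnection that `EventSource` provides for control over headers, so a production client would need its own retry logic.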

4. The Infrastructure Problem

Many infrastructure tools assume short-lived HTTP connections:

| Infrastructure | SSE Support | HTTP Streamable Support |
| --- | --- | --- |
| AWS Lambda | ❌ Times out | ✅ Perfect fit |
| Cloud Run | ⚠️ Needs config | ✅ Works by default |
| Cloudflare | ⚠️ Connection limits | ✅ Standard HTTP |
| API Gateways | ⚠️ Often problematic | ✅ Native support |
| Corporate proxies | ⚠️ May kill long connections | ✅ No issues |
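For the cases above where an intermediary closes idle long-lived connections, a common mitigation on the SSE side is a periodic heartbeat. In the SSE format a line beginning with `:` is a comment that clients ignore, so it can be sent purely to keep bytes flowing. The sketch below is an assumption-laden illustration: `startHeartbeat`, the frame text, and the 15-second default are all hypothetical choices, and `write` stands in for writing to the open response.

```javascript
// An SSE line starting with ":" is a comment; clients ignore it, but it keeps the connection active.
const HEARTBEAT_FRAME = ": keep-alive\n\n";

// Send a heartbeat through `write` every `intervalMs`; returns a function that stops the timer.
function startHeartbeat(write, intervalMs = 15_000) {
  const timer = setInterval(() => write(HEARTBEAT_FRAME), intervalMs);
  return () => clearInterval(timer);
}

// Usage: start when the stream opens, stop when the client disconnects.
const stopHeartbeat = startHeartbeat((frame) => frame.length);
stopHeartbeat();
```

A heartbeat only helps with idle timeouts. It does nothing for platforms that cap total request duration.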

When SSE Still Makes Sense

Despite these limitations, SSE isn't obsolete. Use it when:

  • You need true push notifications: SSE excels when the server needs to push updates unprompted (though HTTP Streamable can also stream)
  • Browser-native simplicity: The EventSource API is dead simple for basic use cases
  • Existing SSE infrastructure: If you've already built around SSE, migration may not be worth it
  • Single-server deployments: The scaling issues don't matter if you only have one server

The Bottom Line

SSE and HTTP Streamable both use HTTP, but they represent fundamentally different architectural patterns:

| Aspect | SSE | HTTP Streamable |
| --- | --- | --- |
| Connection model | Persistent | Request/Response |
| Server state | Must maintain connections | Can be stateless |
| Scaling model | Vertical (bigger servers) | Horizontal (more servers) |
| Load balancing | Sticky sessions required | Stateless routing |
| Auth support | Awkward | Native HTTP auth |
| Infrastructure fit | Specialized | Universal |

SSE = Keep a phone line open
HTTP Streamable = Send text messages

Both communicate, but one requires maintaining an open line while the other sends discrete messages that can be routed anywhere.

What Should You Choose?

For most new MCP projects, HTTP Streamable is the better choice. It's more flexible, scales better, and works with modern cloud infrastructure out of the box.

Use SSE only if you have a specific reason, such as existing infrastructure or a genuine need for server-initiated push without polling.


