MCP Transports Explained: stdio vs Streamable HTTP (and When to Use Each)
You've heard about MCP. You know it's the emerging standard for connecting your data and services to AI agents. You've maybe even skimmed the spec. But then you hit a fork in the road: stdio or Streamable HTTP?
If you're building an MCP server — whether it wraps a database, an internal API, a CRM, or a pile of PDFs — the transport you choose determines how your AI client talks to your server. Pick wrong and you're fighting infrastructure instead of shipping features.
Here's the practical breakdown.
Two Transports, Two Worlds
MCP defines exactly two official transports. Not three, not five. Two. This is intentional — it keeps the ecosystem interoperable without fracturing into a dozen competing wire protocols.
stdio — Your MCP server runs as a local subprocess. The AI client launches it, pipes JSON-RPC messages through stdin/stdout, and kills it when done. Think of it like a CLI tool that happens to speak a structured protocol.
Streamable HTTP — Your MCP server runs as a standalone web service. The client talks to it over HTTP POST and GET requests to a single endpoint (e.g., https://your-server.com/mcp). The server can optionally use Server-Sent Events to stream responses back. Think of it like a REST API that also supports real-time streaming.
That's the entire menu.
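Whichever transport you pick, the payload is identical: JSON-RPC 2.0 messages. As a sketch of what travels over the wire — tools/call is a real MCP method, but the tool name and arguments here are invented for illustration:

```python
import json

# A JSON-RPC 2.0 request as it travels over either transport.
# stdio writes this as one newline-delimited line; Streamable HTTP
# puts the same bytes in an HTTP POST body.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "search_customers", "arguments": {"query": "acme"}},
}

wire = json.dumps(request)
print(wire)
```

The transport only decides how these bytes move; the message shape never changes.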
stdio: The Local Workhorse
stdio is the simplest thing that could possibly work. The client spawns your server as a child process, writes JSON-RPC to its stdin, and reads responses from its stdout. No ports, no URLs, no TLS certificates, no CORS headers.
Reach for stdio when:
- Your server and client live on the same machine (Claude Desktop, Cursor, VS Code, a local dev environment)
- You're building a personal tool — a wrapper around your local filesystem, a local database, or a CLI utility
- You want zero infrastructure overhead — no hosting, no auth, no network config
- You're prototyping and want the fastest path to "it works"
- Your server is a single-user tool that doesn't need to handle concurrent connections
The tradeoffs you accept:
- The server binary must be installed on every machine that uses it
- No remote access — if the client isn't on the same box, it can't connect
- Updates mean reinstalling on every machine
- One client, one server process — no shared state across users
- You can't write to stdout for logging (it corrupts the protocol stream), so you're limited to stderr or file-based logs
A typical stdio config looks like this (in Claude Desktop's claude_desktop_config.json):
```json
{
  "mcpServers": {
    "my-db-tool": {
      "command": "node",
      "args": ["./my-mcp-server/index.js"],
      "env": {
        "DB_PATH": "/home/me/data.db"
      }
    }
  }
}
```
The client launches the process, negotiates capabilities via the MCP handshake, and you're live. Dead simple.
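To make those mechanics concrete, here's a toy client in plain Python that spawns a child process and exchanges one newline-delimited JSON-RPC message over its stdin/stdout — the same pattern a real MCP client uses, minus the actual handshake. The one-line "server" below is a stub invented for this sketch:

```python
import json
import subprocess
import sys

# Hypothetical one-line "server": reads one JSON-RPC message from stdin,
# replies with a stubbed result on stdout.
child_code = (
    "import json,sys;"
    "req=json.loads(sys.stdin.readline());"
    "print(json.dumps({'jsonrpc':'2.0','id':req['id'],"
    "'result':{'capabilities':{}}}))"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
out, _ = proc.communicate(json.dumps(request) + "\n")
response = json.loads(out)
print(response)  # response id matches the request id
```

No sockets, no TLS — just pipes. That's the entire transport layer for stdio.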
Streamable HTTP: The Remote Standard
Streamable HTTP (introduced March 2025, replacing the older SSE-only transport) is how MCP works over the network. Your server exposes a single HTTP endpoint. Clients POST requests to it. The server responds with either a direct JSON response or opens an SSE stream when it needs to push multiple messages back.
Reach for Streamable HTTP when:
- Multiple users or clients need to connect to the same server
- Your MCP server wraps a remote service, cloud API, or SaaS product
- You want centralized deployment — update once, everyone gets the new version
- You need authentication and access control
- You're building a product or service that others will consume
- You need the server to run independently of any specific client lifecycle
The tradeoffs you accept:
- You're running a web server, with everything that entails: hosting, TLS, CORS, Origin validation
- Session management adds complexity (the spec supports an optional Mcp-Session-Id header)
- You need to think about scaling — though the protocol is moving toward stateless patterns to make this easier
- More moving parts means more things that can break in production
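As a sketch of what session handling adds, here's a minimal in-memory session table keyed by the Mcp-Session-Id header value — plain Python, no web framework, and a real server would also expire sessions and validate the IDs:

```python
import uuid

# Minimal session store: maps Mcp-Session-Id values to per-session state.
sessions: dict[str, dict] = {}

def handle_request(headers: dict[str, str]) -> tuple[dict, str]:
    """Return (session state, session id), creating a session if needed."""
    session_id = headers.get("Mcp-Session-Id")
    if session_id is None or session_id not in sessions:
        # New client: assign an id; the server returns it in the
        # Mcp-Session-Id response header for the client to echo back.
        session_id = uuid.uuid4().hex
        sessions[session_id] = {"initialized": False}
    return sessions[session_id], session_id

# First request carries no session header; later requests echo the id back.
state, sid = handle_request({})
state["initialized"] = True
state2, sid2 = handle_request({"Mcp-Session-Id": sid})
print(sid == sid2)  # same session both times
```

This is exactly the kind of state stdio never needs — with one client per process, the process itself is the session.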
A minimal Streamable HTTP server (Python with FastMCP):

```python
from fastmcp import FastMCP

mcp = FastMCP("my-service")

@mcp.tool()
def search_customers(query: str) -> str:
    """Search the customer database"""
    # your lookup logic here; return the results as a string
    return f"results for {query}"

# Run as an HTTP server
mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```
Clients connect to http://your-host:8000/mcp and communicate over standard HTTP.
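On the client side, each message is an HTTP POST to that single endpoint, and the spec requires clients to accept both plain JSON and SSE responses — which shows up as an Accept header. A sketch of building (not sending) such a request with stdlib urllib; the URL is a placeholder:

```python
import json
import urllib.request

# JSON-RPC payload for a tools/list call (a real MCP method).
payload = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}).encode()

req = urllib.request.Request(
    "http://your-host:8000/mcp",  # placeholder endpoint
    data=payload,
    method="POST",
    headers={
        "Content-Type": "application/json",
        # The client must handle either a plain JSON reply
        # or an SSE stream, so it advertises both.
        "Accept": "application/json, text/event-stream",
    },
)

# urllib.request.urlopen(req) would send it; the server answers with
# application/json or switches to text/event-stream to stream messages.
print(req.get_method(), req.get_header("Accept"))
```

The single-endpoint design means there's nothing else to discover: one URL, one content negotiation, done.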
The Decision Framework
Here's how to think about it in practice:
| Question | stdio | Streamable HTTP |
|---|---|---|
| Where does the server run? | Same machine as the client | Anywhere on the network |
| How many users? | One | Many |
| How do you update? | Reinstall everywhere | Deploy once |
| Auth needed? | No (local process) | Yes (you handle it) |
| Infrastructure? | None | Web server + hosting |
| Best for? | Dev tools, personal utilities, prototyping | Products, shared services, SaaS integrations |
The one-sentence rule: If the person using the AI client also controls the machine the server runs on, use stdio. If they don't, use Streamable HTTP.
What About SSE? Isn't That a Third Option?
You'll see references to "SSE transport" in older docs and tutorials. This was MCP's original remote transport (spec version 2024-11-05). It required two separate endpoints — a GET for the SSE stream and a POST for client messages. It worked, but the dual-endpoint design was awkward.
Streamable HTTP replaced it. It consolidates everything into a single endpoint and makes SSE optional rather than mandatory — the server can use SSE for streaming when it needs to, or just return plain HTTP responses for simple request/response patterns.
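SSE itself is a simple text format: events separated by blank lines, each data line prefixed with "data: ". A minimal parser over a captured stream, stdlib only — the event contents below are invented, and a production parser would also handle event ids, retry fields, and incremental chunks:

```python
def parse_sse(stream: str) -> list[str]:
    """Extract the data payload of each event from an SSE text stream."""
    events = []
    for block in stream.split("\n\n"):
        data_lines = [
            line[len("data: "):]
            for line in block.splitlines()
            if line.startswith("data: ")
        ]
        if data_lines:
            # Multi-line data fields are joined with newlines per the SSE spec.
            events.append("\n".join(data_lines))
    return events

# Two JSON-RPC messages as they might arrive on one SSE stream.
raw = (
    'data: {"jsonrpc":"2.0","id":1,"result":{}}\n\n'
    'data: {"jsonrpc":"2.0","method":"notifications/progress"}\n\n'
)
print(parse_sse(raw))  # one payload string per event
```

Under Streamable HTTP, each of those data payloads is just another JSON-RPC message — the framing changes, the protocol doesn't.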
If you're starting fresh, use Streamable HTTP. If you have an existing SSE server, the SDKs support backward compatibility, but plan your migration.
Why Only Two?
The MCP maintainers are explicit about this: two official transports keeps the ecosystem unified. Every MCP client and server can interoperate without negotiating which of seventeen wire protocols to use.
The spec does support custom transports for specialized needs — WebSockets, gRPC, carrier pigeon — but these are opt-in extensions, not part of the baseline. The two-transport constraint is a feature, not a limitation. It means when you build an MCP server, it works with Claude, Cursor, Windsurf, Roo Code, and whatever ships next month, without you thinking about transport compatibility.
Getting Started
If you're trying to connect your data or service to AI for the first time:
- Start with stdio. Get your tools working locally. Validate the integration. Make sure your tool descriptions are clear and your responses are useful.
- Move to Streamable HTTP when you need to. When you want to share the server with teammates, deploy it as a service, or let customers connect their AI clients to your platform — that's when HTTP earns its complexity.
- Don't overthink the transport. The interesting work is in your tools, resources, and prompts — the actual capabilities you're exposing. The transport is plumbing. Good plumbing matters, but it's not why people turn on the faucet.
Building MCP servers and want to deploy them instantly? OpZero is an MCP-native deployment bridge that ships to Cloudflare, Netlify, and Vercel from your AI agent. Built for the agentic workflow.
Top comments (2)
Excellent breakdown — the stdio vs HTTP distinction is one of the most underexplained decisions in MCP right now. The tradeoff you describe (stdio for local simplicity, HTTP for production scale) holds for most cases.
One dimension worth adding: transport choice also affects how you handle streaming progress back to the agent. With Streamable HTTP you get native streaming, which opens up things like real-time progress tokens during long-running operations — something agents can use to decide whether to wait or abort.
We've been building mcp-fusion (github.com/vinkius-labs/mcp-fusion) which has a progress() generator pattern built in specifically for this. The framework supports both transports but the streaming story becomes much richer with HTTP. Good complement to what you're explaining here.