Rafael Silva

Building a Streamable HTTP MCP Server: From stdio to Vercel Serverless

The Model Context Protocol (MCP) is rapidly becoming the standard way AI agents discover and use tools. But most MCP servers today use the stdio transport — they run locally and communicate through standard input/output. That's fine for desktop use, but what about cloud deployment?

In this post, I'll walk through how I migrated an MCP server from stdio to Streamable HTTP, deployed it on Vercel's free tier, and got it listed on Smithery.ai — all in a single afternoon.

The Problem: stdio Doesn't Scale

When you build an MCP server with stdio transport, it works great locally:

npx -y @anthropic/mcp-server-my-tool

But platforms like Smithery.ai, which host and proxy MCP servers for thousands of users, need an HTTP endpoint they can call. The MCP specification defines two remote transports:

  1. HTTP+SSE (Server-Sent Events), the older approach, deprecated as of the 2025-03-26 revision
  2. Streamable HTTP, the new standard introduced in the March 2025 spec

The Architecture Decision

I had three options:

| Approach | Pros | Cons |
| --- | --- | --- |
| Express + StreamableHTTPServerTransport | Full spec compliance, session management | Needs a persistent server ($$) |
| Vercel Serverless + JSON-RPC | Free hosting, auto-scaling | No SSE streaming, stateless |
| Docker on Railway/Fly.io | Full control | Requires credit card |

I chose Vercel Serverless because it's free, auto-scales, and most MCP tool calls are simple request-response patterns that don't need streaming.

The Implementation

The key insight: MCP's Streamable HTTP transport sends JSON-RPC messages via POST. For simple tool calls, you don't need SSE — you can return the JSON-RPC response directly.

Here's the core handler:

// api/mcp.ts
import type { VercelRequest, VercelResponse } from '@vercel/node';

export default async function handler(req: VercelRequest, res: VercelResponse) {
  // Health check for browsers and uptime monitors
  if (req.method === 'GET') {
    return res.json({ status: 'ok', transport: 'streamable-http' });
  }

  if (req.method !== 'POST') {
    return res.status(405).json({ error: 'Method not allowed' });
  }

  const { method, params, id } = req.body;

  switch (method) {
    case 'initialize':
      return res.json({
        jsonrpc: '2.0', id,
        result: {
          protocolVersion: '2025-03-26',
          capabilities: { tools: { listChanged: false } },
          serverInfo: { name: 'my-server', version: '1.0.0' }
        }
      });

    case 'tools/list':
      return res.json({
        jsonrpc: '2.0', id,
        result: { tools: MY_TOOLS }
      });

    case 'tools/call': {
      const result = await executeToolCall(params);
      return res.json({
        jsonrpc: '2.0', id,
        result: { content: [{ type: 'text', text: JSON.stringify(result) }] }
      });
    }

    // Notifications carry no id and expect no body: 202 Accepted
    case 'notifications/initialized':
      return res.status(202).end();

    // Without a default, unknown methods would fall through and hang the
    // request; answer with a standard JSON-RPC "method not found" error
    default:
      return res.json({
        jsonrpc: '2.0', id,
        error: { code: -32601, message: `Method not found: ${method}` }
      });
  }
}
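The handler references MY_TOOLS and executeToolCall without defining them. Here's a minimal sketch of what they might look like; the echo tool, its schema, and the file name are hypothetical placeholders, not part of the original server:

```typescript
// tools.ts -- hypothetical tool registry (names and schemas are examples)

// Tool definitions in the shape MCP clients expect back from tools/list
export const MY_TOOLS = [
  {
    name: 'echo',
    description: 'Echoes the query back to the caller',
    inputSchema: {
      type: 'object',
      properties: { query: { type: 'string' } },
      required: ['query'],
    },
  },
];

// Dispatch a tools/call request to the matching implementation
export async function executeToolCall(params: {
  name: string;
  arguments: Record<string, unknown>;
}): Promise<unknown> {
  switch (params.name) {
    case 'echo':
      return { echo: params.arguments.query };
    default:
      throw new Error(`Unknown tool: ${params.name}`);
  }
}
```

Whatever executeToolCall returns gets stringified into the text content block the handler sends back, so plain JSON-serializable objects are enough.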

Session Management (or Lack Thereof)

The full MCP spec defines session management via the Mcp-Session-Id header. On Vercel Serverless, each invocation is stateless. My solution:

  • Generate a session ID on initialize
  • Return it in the response headers
  • Accept it on subsequent requests (but don't actually track state)

This works because my tools are stateless — each call is independent. If your tools need session state, you'll need a database or KV store.
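Sketched in code, the shortcut above might look like this. The Mcp-Session-Id header name comes from the MCP spec; the helper names are mine, and accepting a session ID without validating it is this post's stateless shortcut, not spec-compliant session tracking:

```typescript
import { randomUUID } from 'node:crypto';

// On initialize: mint a session ID and hand it back in the response header.
// We never persist it, because every tool call here is independent.
export function startSession(): { sessionId: string; headers: Record<string, string> } {
  const sessionId = randomUUID();
  return { sessionId, headers: { 'Mcp-Session-Id': sessionId } };
}

// On later requests: read the header if the client sent one. Absence is
// fine for stateless tools -- we accept the ID but track nothing.
export function readSession(
  headers: Record<string, string | string[] | undefined>
): string | undefined {
  const raw = headers['mcp-session-id']; // Node lowercases incoming header names
  return Array.isArray(raw) ? raw[0] : raw;
}
```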

Deploying to Vercel

Two pieces: a vercel.json that rewrites /mcp to the function and raises the timeout, and a single deploy command.

{
  "rewrites": [{ "source": "/mcp", "destination": "/api/mcp" }],
  "functions": { "api/mcp.ts": { "maxDuration": 30 } }
}
vercel --yes --prod

That's it. Free HTTPS endpoint, auto-scaling, zero maintenance.

Testing

# Health check
curl https://your-server.vercel.app/api/mcp

# Initialize
curl -X POST https://your-server.vercel.app/api/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"initialize","id":1,"params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'

# Call a tool
curl -X POST https://your-server.vercel.app/api/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/call","id":3,"params":{"name":"my_tool","arguments":{"query":"test"}}}'

Publishing to Smithery

Once your HTTP endpoint is live, publishing to Smithery is straightforward:

  1. Go to smithery.ai/servers/new
  2. Enter your server ID (e.g., username/server-name)
  3. Paste your HTTP endpoint URL
  4. Click Continue → Skip (if no connection params needed)

Smithery will probe your endpoint, detect your tools, and generate a quality score. All five of my server's tools were detected immediately.

Lessons Learned

  1. You don't need SSE for most use cases. Simple JSON-RPC request-response works fine for tool calls.
  2. Vercel's free tier is surprisingly capable for MCP servers. The 30-second function timeout is more than enough.
  3. Keep your tools stateless. It makes deployment dramatically simpler.
  4. The MCP ecosystem is moving fast. The Streamable HTTP transport was only standardized in March 2025, and platforms are already adopting it.

What's Next

The MCP ecosystem needs better discovery. Right now, finding useful MCP servers means searching GitHub, checking npm, or browsing directories like Smithery and Glama. As the number of servers grows, we'll need something more structured — think npm for AI tools.

If you're building MCP servers and want to see how this works in practice, check out the SkillFlow MCP Server — it implements everything described here.


Have questions about MCP transports or serverless deployment? Drop a comment below.
