Rafael Silva

Building a Streamable HTTP MCP Server: From stdio to Vercel Serverless

The Model Context Protocol (MCP) is rapidly becoming the standard way AI agents discover and use tools. But most MCP servers today use the stdio transport — they run locally and communicate through standard input/output. That's fine for desktop use, but what about cloud deployment?

In this post, I'll walk through how I migrated an MCP server from stdio to Streamable HTTP, deployed it on Vercel's free tier, and got it listed on Smithery.ai — all in a single afternoon.

The Problem: stdio Doesn't Scale

When you build an MCP server with stdio transport, it works great locally:

npx -y @anthropic/mcp-server-my-tool

But platforms like Smithery.ai, which host and proxy MCP servers for thousands of users, need an HTTP endpoint they can call. The MCP specification defines two remote transports:

  1. HTTP+SSE (Server-Sent Events), the older approach, deprecated as of the 2025-03-26 revision
  2. Streamable HTTP, the new standard introduced in the March 2025 spec

The Architecture Decision

I had three options:

| Approach | Pros | Cons |
| --- | --- | --- |
| Express + StreamableHTTPServerTransport | Full spec compliance, session management | Needs a persistent server ($$) |
| Vercel Serverless + JSON-RPC | Free hosting, auto-scaling | No SSE streaming, stateless |
| Docker on Railway/Fly.io | Full control | Requires credit card |

I chose Vercel Serverless because it's free, auto-scales, and most MCP tool calls are simple request-response patterns that don't need streaming.

The Implementation

The key insight: MCP's Streamable HTTP transport sends JSON-RPC messages via POST. For simple tool calls, you don't need SSE — you can return the JSON-RPC response directly.

Here's the core handler:

// api/mcp.ts
import type { VercelRequest, VercelResponse } from '@vercel/node';

export default async function handler(req: VercelRequest, res: VercelResponse) {
  // Health check for browsers and uptime monitors
  if (req.method === 'GET') {
    return res.json({ status: 'ok', transport: 'streamable-http' });
  }

  if (req.method !== 'POST') {
    return res.status(405).json({ error: 'Method not allowed' });
  }

  const { method, params, id } = req.body;

  switch (method) {
    case 'initialize':
      return res.json({
        jsonrpc: '2.0', id,
        result: {
          protocolVersion: '2025-03-26',
          capabilities: { tools: { listChanged: false } },
          serverInfo: { name: 'my-server', version: '1.0.0' }
        }
      });

    case 'tools/list':
      return res.json({
        jsonrpc: '2.0', id,
        result: { tools: MY_TOOLS }
      });

    case 'tools/call': {
      const result = await executeToolCall(params);
      return res.json({
        jsonrpc: '2.0', id,
        result: { content: [{ type: 'text', text: JSON.stringify(result) }] }
      });
    }

    // Notifications carry no id and expect no body: 202 Accepted
    case 'notifications/initialized':
      return res.status(202).end();

    // Without a default, unknown methods would fall through and hang the
    // request; answer with a standard JSON-RPC "method not found" error
    default:
      return res.json({
        jsonrpc: '2.0', id,
        error: { code: -32601, message: `Method not found: ${method}` }
      });
  }
}
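The handler references MY_TOOLS and executeToolCall without defining them. Here's a minimal sketch of what they might look like; the echo tool, its schema, and the file name are hypothetical placeholders, not part of the original server:

```typescript
// tools.ts -- hypothetical tool registry (names and schemas are examples)

// Tool definitions in the shape MCP clients expect back from tools/list
export const MY_TOOLS = [
  {
    name: 'echo',
    description: 'Echoes the query back to the caller',
    inputSchema: {
      type: 'object',
      properties: { query: { type: 'string' } },
      required: ['query'],
    },
  },
];

// Dispatch a tools/call request to the matching implementation
export async function executeToolCall(params: {
  name: string;
  arguments: Record<string, unknown>;
}): Promise<unknown> {
  switch (params.name) {
    case 'echo':
      return { echo: params.arguments.query };
    default:
      throw new Error(`Unknown tool: ${params.name}`);
  }
}
```

Whatever executeToolCall returns gets stringified into the text content block the handler sends back, so plain JSON-serializable objects are enough.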

Session Management (or Lack Thereof)

The full MCP spec defines session management via the Mcp-Session-Id header. On Vercel Serverless, each invocation is stateless. My solution:

  • Generate a session ID on initialize
  • Return it in the response headers
  • Accept it on subsequent requests (but don't actually track state)

This works because my tools are stateless — each call is independent. If your tools need session state, you'll need a database or KV store.
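Sketched in code, the shortcut above might look like this. The Mcp-Session-Id header name comes from the MCP spec; the helper names are mine, and accepting a session ID without validating it is this post's stateless shortcut, not spec-compliant session tracking:

```typescript
import { randomUUID } from 'node:crypto';

// On initialize: mint a session ID and hand it back in the response header.
// We never persist it, because every tool call here is independent.
export function startSession(): { sessionId: string; headers: Record<string, string> } {
  const sessionId = randomUUID();
  return { sessionId, headers: { 'Mcp-Session-Id': sessionId } };
}

// On later requests: read the header if the client sent one. Absence is
// fine for stateless tools -- we accept the ID but track nothing.
export function readSession(
  headers: Record<string, string | string[] | undefined>
): string | undefined {
  const raw = headers['mcp-session-id']; // Node lowercases incoming header names
  return Array.isArray(raw) ? raw[0] : raw;
}
```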

Deploying to Vercel

Two pieces: a vercel.json that rewrites /mcp to the function and raises the timeout, and a single deploy command.

{
  "rewrites": [{ "source": "/mcp", "destination": "/api/mcp" }],
  "functions": { "api/mcp.ts": { "maxDuration": 30 } }
}
vercel --yes --prod

That's it. Free HTTPS endpoint, auto-scaling, zero maintenance.

Testing

# Health check
curl https://your-server.vercel.app/api/mcp

# Initialize
curl -X POST https://your-server.vercel.app/api/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"initialize","id":1,"params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'

# Call a tool
curl -X POST https://your-server.vercel.app/api/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/call","id":3,"params":{"name":"my_tool","arguments":{"query":"test"}}}'

Publishing to Smithery

Once your HTTP endpoint is live, publishing to Smithery is straightforward:

  1. Go to smithery.ai/servers/new
  2. Enter your server ID (e.g., username/server-name)
  3. Paste your HTTP endpoint URL
  4. Click Continue → Skip (if no connection params needed)

Smithery will probe your endpoint, detect your tools, and generate a quality score. All five of my server's tools were detected immediately.

Lessons Learned

  1. You don't need SSE for most use cases. Simple JSON-RPC request-response works fine for tool calls.
  2. Vercel's free tier is surprisingly capable for MCP servers. The 30-second function timeout is more than enough.
  3. Keep your tools stateless. It makes deployment dramatically simpler.
  4. The MCP ecosystem is moving fast. The Streamable HTTP transport was only standardized in March 2025, and platforms are already adopting it.

What's Next

The MCP ecosystem needs better discovery. Right now, finding useful MCP servers means searching GitHub, checking npm, or browsing directories like Smithery and Glama. As the number of servers grows, we'll need something more structured — think npm for AI tools.

If you're building MCP servers and want to see how this works in practice, check out the SkillFlow MCP Server — it implements everything described here.


Have questions about MCP transports or serverless deployment? Drop a comment below.
