I built a working MCP server. It connected to my database, returned tool results, and worked flawlessly in Claude Desktop locally.
Then I pushed to Vercel.
```
TypeError: Cannot read properties of undefined (reading 'addEventListener')
```
500 errors everywhere. The MCP adapter was trying to use persistent SSE connections inside ephemeral serverless functions. Everything broke — and it wasn't obvious why or how to fix it.
I wasn't alone. This is a known, documented problem across the community.
Why MCP and Serverless Don't Get Along
MCP was designed for long-lived processes. The original spec only supported two transports: stdio (local-only) and SSE (persistent server-sent events over HTTP). Both assume the server stays alive between calls.
Vercel Functions don't work that way. Each request can land on a different function instance. Memory is ephemeral. There's no persistent filesystem. And SSE connections stored in memory — poof, gone on the next cold start.
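To see why concretely, here is a minimal plain-TypeScript illustration (not adapter code) of the in-memory session pattern that SSE transports rely on, and why it fails when two requests land on different function instances:

```typescript
// Illustration only: each serverless instance gets its own module scope,
// so any Map of sessions is NOT shared between instances.
type Session = { id: string; createdAt: number };

function makeInstance() {
  const sessions = new Map<string, Session>();
  return {
    open(id: string) { sessions.set(id, { id, createdAt: Date.now() }); },
    lookup(id: string): Session | null { return sessions.get(id) ?? null; },
  };
}

const instanceA = makeInstance(); // request 1 lands here
const instanceB = makeInstance(); // request 2 lands here

instanceA.open("sess-123");        // SSE handshake succeeds on instance A
instanceA.lookup("sess-123");      // the session exists here...
instanceB.lookup("sess-123");      // ...but is null here: "session not found"
```

Every SSE-based MCP deployment on serverless hits some version of this.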
The result is a mess developers across Reddit, GitHub, and dev.to have been hitting for months:
- **SSE connections drop** — The session lives in-memory on instance A. The next request hits instance B. Session not found.
- **`autoDiscover()` fails silently** — It scans directories at boot. Vercel has no persistent filesystem.
- **Cold starts waste CPU** — Zod reflection, schema generation, and Presenter compilation run from scratch on every cold invocation.
- **Transport bridge breaks** — The official MCP SDK's `StreamableHTTPServerTransport` expects Node.js `http.IncomingMessage`. Vercel Edge Runtime uses Web Standard `Request`/`Response`. Manually bridging them is fragile and often breaks.
- **The adapter's `disableSSE: true`** — Doesn't even exist as a property in `ServerOptions`. You're stuck.
The MCP protocol spec itself acknowledges this: statelessness and horizontal scaling are on the official roadmap as unresolved challenges. A GitHub discussion from the core team literally says: "I'm building a hosting platform for deploying MCPs and SSE makes it hard to scale remote MCPs because we can't use serverless."
This isn't a niche edge case. It's the default experience for anyone who tries to ship an MCP server to a modern deployment platform.
The Deeper Problem: Even When It Deploys, the AI Still Guesses
Let's say you do get past the deployment wall. You still have a second, subtler problem.
Most MCP servers today look like this:
```javascript
handler: async ({ input }) => {
  return JSON.stringify(await db.query('SELECT * FROM invoices'));
}
```
Raw JSON. No context. No rules. No hints on what to do next.
The LLM receives `amount_cents: 45000` and has to guess — is that dollars? Cents? Yen? It receives 3,000 invoice rows and burns your entire context window. It receives a `stripe_payment_intent_id` and a `password_hash` that were never meant to leave your database.
These aren't prompt engineering problems you solve with longer system prompts. They're architecture problems. The handler is doing too much, and the AI is receiving too little structure.
The two problems are related. Developers struggling to deploy their MCP server often never get far enough to realize the data layer is also broken.
MCP Fusion: A New Architecture for Agentic MCP Servers
MCP Fusion is an architecture layer built on top of the MCP SDK. It introduces the MVA pattern (Model-View-Agent) — a deliberate separation of three concerns that raw MCP servers collapse into one function.
| Layer | Responsibility |
|---|---|
| Model | Handler returns raw data. Just a fetch or DB query. Nothing else. |
| View (Presenter) | Strips unauthorized fields, attaches domain rules, suggests next actions, truncates large collections. |
| Agent | Receives a Structured Perception Package — validated data + context + affordances — and acts deterministically. |
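To make the table concrete, here is a rough sketch of the kind of payload the Agent layer ends up with. The field names below are illustrative, not the library's actual wire format:

```typescript
// Hypothetical shape of a Structured Perception Package:
// validated data, plus domain rules, plus affordances.
const perception = {
  data: { id: "inv_1", customer: "Acme", amount_cents: 45000, status: "overdue" },
  rules: ["Invoice is overdue. Send a reminder before any other action."],
  suggestedActions: [{ tool: "billing.remind", args: { id: "inv_1" } }],
  meta: { truncated: false, totalAvailable: 1 },
};
```

The agent no longer has to guess what `amount_cents` means or what to do next: the rules and suggested actions travel with the data.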
The handler doesn't check auth. It doesn't filter fields. It doesn't format anything. It just returns data. The framework handles everything else through a deterministic pipeline:
```
contextFactory → middleware chain → Zod input validation
  → handler (raw data) → Presenter pipeline:
      1. agentLimit()      → truncate before context window overflow
      2. Zod schema        → strip undeclared fields (allowlist, not denylist)
      3. rules()           → attach contextual domain instructions
      4. suggestActions()  → HATEOAS affordances: valid next actions with pre-filled args
      5. uiBlocks()        → server-rendered charts/diagrams (ECharts, Mermaid)
  → Agent receives structured package
```
Here's what this looks like in practice:
```typescript
import type { PrismaClient } from '@prisma/client';
import { initFusion } from '@vinkius-core/mcp-fusion';
import { z } from 'zod';

interface AppContext { db: PrismaClient; user: { role: string; tenantId: string } }

const f = initFusion<AppContext>();

// Auth middleware — runs before EVERY tool that declares it.
// (verifyJWT and the module-level `prisma` client are assumed defined elsewhere.)
const auth = f.middleware(async (ctx) => {
  const payload = await verifyJWT((ctx as any).rawToken);
  const user = await prisma.user.findUniqueOrThrow({ where: { id: payload.sub } });
  return { db: prisma, user };
});

// Presenter — the perception layer
const InvoicePresenter = f.presenter({
  name: 'Invoice',
  schema: z.object({
    id: z.string(),
    customer: z.string(),
    amount_cents: z.number().describe('Amount in CENTS — divide by 100 for display'),
    status: z.enum(['draft', 'sent', 'paid', 'overdue']),
    // password_hash, stripe_secret, profit_margin → GONE before the wire
  }),
  rules: (inv) => [
    inv.status === 'overdue'
      ? 'Invoice is overdue. Send a reminder before any other action.'
      : null,
  ],
  suggestActions: (inv) => [
    inv.status === 'draft' ? { tool: 'billing.send', args: { id: inv.id } } : null,
    inv.status === 'overdue' ? { tool: 'billing.remind', args: { id: inv.id } } : null,
  ].filter(Boolean),
  agentLimit: { max: 50 }, // never send 3,000 rows to the LLM
});

// Tool — just fetch data. Everything else is handled.
const getInvoice = f.tool({
  name: 'billing.get',
  description: 'Retrieve an invoice by ID',
  input: z.object({ id: z.string() }),
  middleware: [auth],
  returns: InvoicePresenter, // ← one line wires the whole pipeline
  handler: async ({ input, ctx }) =>
    ctx.db.invoice.findUniqueOrThrow({ where: { id: input.id, tenantId: ctx.user.tenantId } }),
});
```
Notice: the handler is a single expression. Auth, field stripping, domain rules, affordances, and truncation happen automatically — not sprinkled across the handler with if statements you forget to copy.
A database migration that adds a column doesn't change what the agent sees. New fields are invisible by default unless you explicitly declare them in the Presenter schema.
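The mechanics are easy to sketch. mcp-fusion does this through the Presenter's Zod schema; the plain-TypeScript version below just illustrates the allowlist idea without any dependencies:

```typescript
// Allowlist-based field stripping, sketched without Zod.
// Only explicitly declared fields survive; everything else is dropped.
const ALLOWED = ["id", "customer", "amount_cents", "status"] as const;

function stripToAllowlist(row: Record<string, unknown>) {
  return Object.fromEntries(
    ALLOWED.filter((k) => k in row).map((k) => [k, row[k]])
  );
}

// A migration adds profit_margin — the agent never sees it:
const row = {
  id: "inv_1", customer: "Acme", amount_cents: 45000,
  status: "paid", profit_margin: 0.42, password_hash: "...",
};
const visible = stripToAllowlist(row);
console.log(Object.keys(visible)); // → ["id", "customer", "amount_cents", "status"]
```

A denylist would have the opposite failure mode: every new column leaks until someone remembers to block it.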
The Vercel Adapter: Solving the Deployment Wall
Now back to the original problem. You have a well-structured MCP server. How do you deploy it to Vercel without everything breaking?
MCP Fusion ships a dedicated Vercel Adapter that solves every serverless incompatibility at once:
```bash
npm install @vinkius-core/mcp-fusion-vercel
```
```typescript
// app/api/mcp/route.ts — the ENTIRE file
import { initFusion } from '@vinkius-core/mcp-fusion';
import { vercelAdapter } from '@vinkius-core/mcp-fusion-vercel';
import { z } from 'zod';

interface AppContext { tenantId: string; dbUrl: string }

const f = initFusion<AppContext>();

const listProjects = f.tool({
  name: 'projects.list',
  input: z.object({ limit: z.number().optional().default(20) }),
  readOnly: true,
  handler: async ({ input, ctx }) =>
    fetch(`${ctx.dbUrl}/projects?tenant=${ctx.tenantId}&limit=${input.limit}`).then(r => r.json()),
});

const registry = f.registry();
registry.register(listProjects);

export const POST = vercelAdapter<AppContext>({
  registry,
  serverName: 'my-mcp-server',
  contextFactory: async (req) => ({
    tenantId: req.headers.get('x-tenant-id') || 'public',
    dbUrl: process.env.DATABASE_URL!,
  }),
});

// Optional: global edge network, ~0ms cold start
export const runtime = 'edge';
```
```bash
vercel deploy
```
Done. Live at https://your-project.vercel.app/api/mcp.
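Because the endpoint is plain stateless JSON-RPC, you can smoke-test it from any HTTP client. A minimal sketch, assuming the placeholder URL above and the standard MCP `tools/list` method:

```typescript
// Smoke-test the deployed endpoint over plain JSON-RPC 2.0.
// ENDPOINT is a placeholder for your actual deployment URL.
const ENDPOINT = "https://your-project.vercel.app/api/mcp";

function jsonRpc(method: string, params?: object, id = 1) {
  return { jsonrpc: "2.0" as const, id, method, ...(params ? { params } : {}) };
}

async function listTools() {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "content-type": "application/json", accept: "application/json" },
    body: JSON.stringify(jsonRpc("tools/list")),
  });
  return res.json(); // a JSON-RPC response with the registered tools
}
```

No session handshake, no SSE stream to keep alive: one request, one response.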
How It Fixes Each Problem
| Serverless Problem | What the Adapter Does |
|---|---|
| SSE session loss | Uses `enableJsonResponse: true` — pure stateless JSON-RPC. No sessions, no SSE, no memory state to lose between instances. |
| No filesystem | Registry is built at module scope during cold start. No `autoDiscover()` needed — tools are registered explicitly. |
| Cold start CPU waste | Zod reflection, schema compilation, middleware resolution happen once and are cached. Warm requests only instantiate `McpServer` + Transport. |
| Transport incompatibility | Uses the MCP SDK's native `WebStandardStreamableHTTPServerTransport`, built for WinterCG runtimes — works on Edge and Node.js. |
| No `process.env` access | `contextFactory` receives the full `Request` object — headers, cookies, and all environment variables. |
The adapter splits work into two phases deliberately:
```
COLD START (once per instance)
  → Zod reflection     → cached
  → Presenter compile  → cached
  → Schema generation  → cached
  → Middleware resolve → cached

WARM REQUEST (per invocation)
  → new McpServer()     → ephemeral
  → contextFactory(req) → per-request
  → JSON-RPC dispatch   → your handler runs
  → server.close()      → cleanup
```
Cold start pays the cost once. Every subsequent request is just routing + your business logic.
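The same two-phase split can be sketched in plain TypeScript. The names below are hypothetical; the real adapter's internals will differ, but the module-scope caching pattern is the point:

```typescript
// Module-scope cache: expensive work runs once per instance (cold start),
// warm requests only do lookups.
type CompiledTool = { inputSchema: object };

function coldStartCompile(): Map<string, CompiledTool> {
  // Stand-in for the expensive phase: Zod reflection, Presenter
  // compilation, schema generation.
  return new Map([["billing.get", { inputSchema: { id: "string" } }]]);
}

const cache = coldStartCompile(); // module scope: executes on cold start only

function handleRequest(tool: string): CompiledTool | null {
  // Warm path: pure lookup, nothing is recompiled per request.
  return cache.get(tool) ?? null;
}
```

Anything built at module scope survives across warm invocations of the same instance; anything built inside the handler is paid for on every request.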
Edge vs. Node.js: Which Runtime?
| | Edge | Node.js |
|---|---|---|
| Cold start | ~0ms, global | ~250ms, regional |
| APIs available | Web APIs only | Full Node.js + native modules |
| Max duration | 5s free / 30s Pro | 10s free / 60s Pro |
| Best for | Fast lookups, simple tools | Database queries, heavy computation |
For tools querying Vercel Postgres or Vercel KV, use Node.js. For fast routing or API gateway tools, Edge is ideal. Switch with a single export line.
Native Vercel Services Work Out of the Box
```typescript
// Reuses the `f` builder and the zod import from the earlier examples.
import { sql } from '@vercel/postgres';
import { kv } from '@vercel/kv';

const getUser = f.tool({
  name: 'users.get',
  input: z.object({ id: z.string() }),
  readOnly: true,
  handler: async ({ input }) => {
    const { rows } = await sql`SELECT id, name, email FROM users WHERE id = ${input.id}`;
    return rows;
  },
});

const getCached = f.tool({
  name: 'cache.get',
  input: z.object({ key: z.string() }),
  readOnly: true,
  handler: async ({ input }) => ({ value: await kv.get(input.key) }),
});
```
No extra config. No adapter magic. Just import and use.
What's Fully Supported on Vercel
✅ Tools, groups, tags, exposition
✅ Middleware chains (auth, rate limiting, etc.)
✅ Presenters — field stripping, rules, affordances, agentLimit
✅ Governance Lockfile — pre-generated at build time
✅ Structured error recovery via toolError()
✅ Vercel Postgres, KV, and Blob
✅ Both Edge and Node.js runtimes
❌ autoDiscover() — no filesystem; register tools explicitly
❌ createDevServer() — use next dev or vercel dev
❌ State Sync notifications — stateless transport by design
Compatible With Any MCP Client
The stateless JSON-RPC endpoint works with everything:
- Claude Desktop — direct HTTP config or via proxy
- LangChain / LangGraph — HTTP transport
- Vercel AI SDK — direct JSON-RPC
- FusionClient — built-in type-safe client (tRPC-style)
- Any custom agent — standard POST with JSON-RPC payload
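For that last case, the payload a custom agent POSTs is just JSON-RPC 2.0 with MCP's standard `tools/call` method. A sketch using the `projects.list` tool from earlier (the endpoint URL is a placeholder):

```typescript
// The JSON-RPC 2.0 payload for invoking a tool via MCP's tools/call method.
const payload = {
  jsonrpc: "2.0",
  id: 42,
  method: "tools/call",
  params: { name: "projects.list", arguments: { limit: 5 } },
};

// Sent as an ordinary POST; no SDK required:
// await fetch("https://your-project.vercel.app/api/mcp", {
//   method: "POST",
//   headers: { "content-type": "application/json" },
//   body: JSON.stringify(payload),
// });
```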
The Takeaway
The reason deploying MCP servers to Vercel has been painful isn't a skill issue. The original protocol simply wasn't designed for stateless infrastructure. The combination of SSE transport, in-memory sessions, and filesystem assumptions made serverless deployment an exercise in workarounds — most of which don't work.
MCP Fusion resolves this at the framework level, not the patch level. The Vercel Adapter isn't a set of hacks; it's a first-class adapter that changes the transport model to match the deployment model. Pair it with the MVA architecture, and you stop writing MCP servers that make the AI guess — and start shipping servers that tell the AI exactly what to do next.
Have you hit this same wall trying to deploy MCP servers to serverless? What transport were you using? Drop a comment — the more real war stories, the better.