Nico Acosta for BrainGrid

Posted on Jun 18 • Originally published at braingrid.ai

From Local Hack to Production-Ready: How We Solved BrainGrid's MCP Multi-Tenant Authentication Problem

#ai #programming #webdev #softwaredevelopment

You've built an amazing MCP server. It works perfectly on your laptop. Your AI assistant can create Jira tickets, query your database, deploy to production - life is good. Then your teammate asks: "Hey, can I use this too?"

Even better, you want to ship your MCP server as a product to your customers. You now need to support multiple tenants, each with their own API keys and authentication.

Suddenly, you're in hell.

The Problem Nobody Talks About

Here's what happens when you try to share your MCP server with your customers:

Option 1: The "Just Install It" Approach

## Your instructions to teammates:
1. Clone the repo
2. Install dependencies
3. Set up your API keys
4. Configure your environment
5. Run the server locally
6. Oh, and update these keys when they expire...
7. And don't forget to pull the latest changes...
8. BTW, it might conflict with your other Node versions...

Result: 3 hours later, half your customers have given up and the other half are debugging npm issues.

Option 2: The "Let's Host It" Nightmare

You deploy to a serverless platform like Cloud Run or Vercel. Five minutes later:

customer: "It's asking me to authenticate... again"
you: "Yeah, just refresh and login again"
customer: "I just did. It's asking again."
you: "Oh, that's because Cloud Run scales to zero and..."
customer: "I don't care why. I just want to create a ticket."

The core issue? Serverless platforms don't do sessions. Every request could hit a different instance. Your carefully crafted auth flow becomes a game of authentication whack-a-mole.

Why This Matters More Than You Think

This isn't just an annoyance. It's the difference between:

A tool only you use vs A tool your entire customer base adopts
"Cool prototype" vs "Critical infrastructure"
Weekend project vs Production-ready product customers actually use

We learned this the hard way at BrainGrid. Our MCP server transformed how our team worked with AI - but only after we solved the authentication puzzle were we ready to ship it to our customers.

What You'll Learn

This guide shows you exactly how we transformed our MCP server from a local development tool into a production-ready service that:

Authenticates once, works everywhere - No more login fatigue
Scales from 1 to 1000 users - Same performance whether it's just you or the whole company
Costs pennies to run - Efficient caching means minimal cloud costs
Works with existing auth - Integrates with WorkOS, Auth0, or any OAuth provider
Deploys in minutes - One command to go from local to remote

We'll cover the exact architecture, the gotchas we discovered, and the code that makes it all work. No theory, no fluff - just battle-tested solutions from our production deployment serving hundreds of developers.

Ready to make your MCP server something your customers will actually want to use? Let's dive in.

Initial Setup: From Local to Remote

Step 1: Basic MCP Server Configuration

Start with a standard MCP server setup using FastMCP. The key is understanding the dual nature of MCP servers - they need to work both locally for development and remotely for customers to use.

import { FastMCP } from 'fastmcp';
import { z } from 'zod';

// Define your tool schemas
const CreateRequirementSchema = z.object({
  message: z.string().describe("The requirement description"),
  repositories: z.string().optional().describe("Comma-separated list of repos")
});

const server = new FastMCP({
  name: 'braingrid-server',
  version: '1.0.0'
});

// Add your tools
server.addTool({
  name: 'create_requirement',
  description: 'Create a new requirement in BrainGrid',
  parameters: CreateRequirementSchema,
  execute: async (args, context) => {
    // Tool implementation
    // Note: context.session contains user auth info when hosted
    const apiClient = new BrainGridApiClient(config, context?.session);
    return await apiClient.createRequirement(args);
  }
});

// Local development (stdio transport)
await server.start({ transportType: 'stdio' });

Step 2: Switching to httpStream for Remote Hosting

To deploy on Cloud Run or Vercel, switch to httpStream transport. This requires careful consideration of how your tools will handle authentication:

// Detect transport type from environment
const transportType = process.env.MCP_TRANSPORT || 'stdio';

// httpStream configuration for serverless
if (transportType === 'httpStream') {
  await server.start({
    transportType: 'httpStream',
    httpStream: {
      port: parseInt(process.env.PORT || '8080'),
      endpoint: '/mcp'
    }
  });
} else {
  // Local stdio transport
  await server.start({ transportType: 'stdio' });
}
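A misconfigured environment variable is easier to catch at startup than after deployment. The following is a minimal sketch, assuming a hypothetical `resolveTransport` helper (it is not part of FastMCP) that takes the environment as a plain object and either returns a validated transport configuration or throws:

```typescript
// Hypothetical helper: turns environment variables into a validated transport config.
type TransportConfig =
  | { transportType: 'stdio' }
  | { transportType: 'httpStream'; port: number; endpoint: string };

function resolveTransport(env: Record<string, string | undefined>): TransportConfig {
  const transport = env.MCP_TRANSPORT || 'stdio';
  if (transport === 'stdio') {
    return { transportType: 'stdio' };
  }
  if (transport !== 'httpStream') {
    throw new Error(`Unsupported MCP_TRANSPORT: ${transport}`);
  }

  // parseInt returns NaN for non-numeric input, which Number.isInteger rejects.
  const port = Number.parseInt(env.PORT || '8080', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return { transportType: 'httpStream', port, endpoint: '/mcp' };
}

console.log(resolveTransport({ MCP_TRANSPORT: 'httpStream', PORT: '9000' }));
```

In a real server you would pass `process.env` and feed the result into `server.start(...)`.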

Step 3: Implementing OAuth with WorkOS

MCP requires specific OAuth implementation patterns. The key insight is that MCP clients expect a particular discovery flow:

const serverOptions = {
  name: 'braingrid-server',
  version: '1.0.0',
  authenticate: authenticateRequest,
  oauth: {
    enabled: true,
    protectedResource: {
      resource: 'https://mcp.braingrid.ai',
      authorizationServers: ['https://auth.workos.com'],
      bearerMethodsSupported: ['header'],
    },
    // This is crucial for MCP client compatibility
    authorizationServer: {
      issuer: 'https://auth.workos.com',
      authorizationEndpoint: 'https://auth.workos.com/oauth2/authorize',
      tokenEndpoint: 'https://auth.workos.com/oauth2/token',
      jwksUri: 'https://auth.workos.com/oauth2/jwks', // Note: Not /.well-known/jwks.json
      responseTypesSupported: ['code'],
      grantTypesSupported: ['authorization_code', 'refresh_token'],
      codeChallengeMethodsSupported: ['S256'],
      tokenEndpointAuthMethodsSupported: ['none'],
      scopesSupported: ['email', 'offline_access', 'openid', 'profile'],
    }
  }
};
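The discovery flow starts when the client fetches `/.well-known/oauth-protected-resource`. The document it expects uses the snake_case field names from the OAuth 2.0 Protected Resource Metadata specification (RFC 9728). The sketch below only illustrates how the camelCase options above map to that document; `toProtectedResourceMetadata` is a hypothetical helper, and we assume your framework serves the real endpoint for you:

```typescript
// Shape of the camelCase options used in the server configuration.
interface ProtectedResourceConfig {
  resource: string;
  authorizationServers: string[];
  bearerMethodsSupported: string[];
}

// Shape of the JSON document the client fetches (RFC 9728 field names).
interface ProtectedResourceMetadata {
  resource: string;
  authorization_servers: string[];
  bearer_methods_supported: string[];
}

// Hypothetical helper: maps the config to the wire format.
function toProtectedResourceMetadata(config: ProtectedResourceConfig): ProtectedResourceMetadata {
  return {
    resource: config.resource,
    authorization_servers: config.authorizationServers,
    bearer_methods_supported: config.bearerMethodsSupported,
  };
}

const metadata = toProtectedResourceMetadata({
  resource: 'https://mcp.braingrid.ai',
  authorizationServers: ['https://auth.workos.com'],
  bearerMethodsSupported: ['header'],
});
console.log(JSON.stringify(metadata, null, 2));
```

The client reads `authorization_servers` from this document to find where to send the user for login.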

Key implementation detail: The WWW-Authenticate header must be properly formatted for MCP clients:

import { IncomingMessage } from 'http';

// MCP session structure - what gets passed to your tools
interface MCPSession {
  userId: string;
  email: string;
  organizationId: string;
  scopes: string[];
  token: string;
}

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const authHeader = request.headers.authorization;

  if (!authHeader) {
    // MCP clients expect this specific format
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Bearer error="unauthorized", ' +
          'error_description="Authorization needed", ' +
          'resource_metadata="https://mcp.braingrid.ai/.well-known/oauth-protected-resource"'
      }
    });
  }

  // Extract bearer token
  const bearerMatch = authHeader.match(/^Bearer (.+)$/);
  if (!bearerMatch) {
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Bearer error="invalid_token", ' +
          'error_description="Invalid authorization header format"'
      }
    });
  }

  const token = bearerMatch[1];
  // Validate JWT and return session
  return await validateAndCreateSession(token);
}
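Building the header by string concatenation gets fragile once a description contains a double quote. Here is a minimal sketch, assuming a hypothetical `buildBearerChallenge` helper that escapes each value as an HTTP quoted-string:

```typescript
// Escape backslashes and double quotes so the value is a valid quoted-string.
function quote(value: string): string {
  return `"${value.replace(/[\\"]/g, '\\$&')}"`;
}

// Hypothetical helper: builds the value of a WWW-Authenticate Bearer challenge.
function buildBearerChallenge(params: Record<string, string>): string {
  const parts = Object.entries(params).map(([key, value]) => `${key}=${quote(value)}`);
  return `Bearer ${parts.join(', ')}`;
}

console.log(
  buildBearerChallenge({
    error: 'unauthorized',
    error_description: 'Authorization needed',
    resource_metadata: 'https://mcp.braingrid.ai/.well-known/oauth-protected-resource',
  }),
);
```

This matters most where the description comes from an error message rather than a fixed string.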

Step 4: Handling Dual Transport Modes

Your MCP server needs to support both local and remote authentication patterns:

export class BrainGridApiClient {
  private auth?: AuthHandler;
  private session?: MCPSession;
  private readonly config: { apiUrl: string; organizationId?: string };

  constructor(config: { apiUrl: string; organizationId?: string }, session?: MCPSession) {
    this.config = config;
    this.session = session;

    // Only create AuthHandler for local mode
    if (!session) {
      this.auth = new AuthHandler(config);
    }
  }

  private async getHeaders(): Promise<Record<string, string>> {
    if (this.session) {
      // Remote mode - use session token
      return {
        'Authorization': `Bearer ${this.session.token}`,
        'X-Organization-Id': this.session.organizationId,
        'Content-Type': 'application/json',
      };
    } else if (this.auth) {
      // Local mode - use stored auth
      return this.auth.getOrganizationHeaders();
    }
    throw new Error('No authentication method available');
  }
}

The Serverless Challenge

Serverless platforms like Cloud Run and Vercel share fundamental characteristics that create unique challenges for stateful applications:

1. Instance Lifecycle Management

Serverless instances have unpredictable lifecycles:

Cold starts: New instances spin up on demand
Scale to zero: Instances terminate after inactivity
Horizontal scaling: Multiple instances serve concurrent requests
No sticky sessions: Requests can hit any instance

This creates specific challenges for MCP servers:

// This approach fails in serverless:
class NaiveMCPServer {
  private sessions = new Map<string, MCPSession>(); // ❌ Lost on instance restart

  async authenticate(token: string): Promise<MCPSession> {
    // Check memory cache
    if (this.sessions.has(token)) {
      return this.sessions.get(token)!;
    }

    // Validate and cache
    const session = await validateJWT(token);
    this.sessions.set(token, session); // ❌ Only exists on this instance
    return session;
  }
}
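The failure is easy to reproduce without any cloud platform. In this sketch, each `InstanceLocalCache` object (a name made up for illustration) stands in for one serverless container with its own memory:

```typescript
// Each object stands in for one serverless instance with private memory.
class InstanceLocalCache {
  private sessions = new Map<string, string>();

  remember(token: string, userId: string): void {
    this.sessions.set(token, userId);
  }

  lookup(token: string): string | undefined {
    return this.sessions.get(token);
  }
}

const instanceA = new InstanceLocalCache();
const instanceB = new InstanceLocalCache();

// The first request lands on instance A and is cached there.
instanceA.remember('token-123', 'user-1');

console.log(instanceA.lookup('token-123')); // 'user-1'
console.log(instanceB.lookup('token-123')); // undefined: B never saw this token
```

The second lookup is what happens whenever the platform routes a follow-up request to a different instance.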

2. JWT Validation Overhead

Without session persistence, your MCP server performs full JWT validation on every request:

async function validateJWT(token: string): Promise<MCPSession> {
  // Step 1: Fetch JWKS (Network call ~50ms)
  const jwks = await fetchJWKS('https://auth.workos.com/oauth2/jwks');

  // Step 2: Verify signature (CPU intensive ~10ms)
  const verified = await jose.jwtVerify(token, jwks);

  // Step 3: Check claims (CPU ~5ms)
  if (verified.payload.iss !== 'https://auth.workos.com') {
    throw new Error('Invalid issuer');
  }

  // Step 4: Extract session data
  return {
    userId: verified.payload.sub,
    email: verified.payload.email,
    organizationId: verified.payload.org_id,
    scopes: verified.payload.scopes,
    token: token
  };
}

This adds 50-100ms to every request and increases costs significantly.
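To put that in perspective, here is a back-of-the-envelope calculation using the rough per-step figures from the comments above. The 10,000 requests per day is an assumed number for illustration, not a measurement:

```typescript
// Approximate per-step costs taken from the comments in validateJWT (milliseconds).
const stepCostsMs = { fetchJwks: 50, verifySignature: 10, checkClaims: 5 };

const perRequestMs = Object.values(stepCostsMs).reduce((sum, ms) => sum + ms, 0);

// Total validation time per day, in seconds, for a given request volume.
function dailyOverheadSeconds(requestsPerDay: number): number {
  return (requestsPerDay * perRequestMs) / 1000;
}

console.log(perRequestMs); // 65
console.log(dailyOverheadSeconds(10_000)); // 650
```

At that assumed volume, validation alone accounts for roughly eleven minutes of compute per day.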

3. Re-authentication Fatigue

The user experience without session persistence:

Timeline of a frustrated developer:
0:00 - Connect to MCP server ✓
0:01 - Authenticate via WorkOS ✓
0:02 - Create requirement ✓
0:05 - (Cloud Run scales instance to zero)
0:10 - Try to update task ✗ "Please authenticate again"
0:11 - Re-authenticate 😤
0:12 - Update task ✓
0:15 - (New instance due to load)
0:16 - Try to commit ✗ "Please authenticate again"
0:17 - Rage quit

Technical Solution: Redis Session Store with Encryption

Architecture Overview

The solution implements a multi-tier caching strategy with security at its core:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Request   │────▶│   Memory    │────▶│    Redis    │
│             │     │   Cache     │     │    Cache    │
└─────────────┘     └─────────────┘     └─────────────┘
                            │                    │
                            ▼                    ▼
                    ┌─────────────┐     ┌─────────────┐
                    │     JWT     │     │     JWT     │
                    │ Validation  │     │ Validation  │
                    └─────────────┘     └─────────────┘
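The lookup order in the diagram can be sketched as a generic tiered read. This is an illustration under stated assumptions, not our production code: `SessionTier`, `lookupSession`, and `mapTier` are hypothetical names, and both tiers are in-memory stand-ins so the example runs without Redis:

```typescript
interface SessionTier<T> {
  tierName: string;
  get(key: string): Promise<T | null>;
  set(key: string, value: T): Promise<void>;
}

// Walks the tiers in order. On a miss everywhere, falls back to `validate`
// and writes the result to every tier.
async function lookupSession<T>(
  key: string,
  tiers: SessionTier<T>[],
  validate: () => Promise<T>,
): Promise<{ value: T; source: string }> {
  for (let i = 0; i < tiers.length; i++) {
    const hit = await tiers[i].get(key);
    if (hit !== null) {
      // Backfill the faster tiers that missed.
      await Promise.all(tiers.slice(0, i).map((tier) => tier.set(key, hit)));
      return { value: hit, source: tiers[i].tierName };
    }
  }
  const value = await validate();
  await Promise.all(tiers.map((tier) => tier.set(key, value)));
  return { value, source: 'validation' };
}

// In-memory stand-in used for both tiers in this sketch.
function mapTier<T>(tierName: string): SessionTier<T> {
  const store = new Map<string, T>();
  return {
    tierName,
    get: async (key) => store.get(key) ?? null,
    set: async (key, value) => {
      store.set(key, value);
    },
  };
}

const demoTiers = [mapTier<string>('memory'), mapTier<string>('redis')];
lookupSession('user-1', demoTiers, async () => 'validated-session')
  .then((first) => console.log(first.source)); // 'validation' on the first request
```

A second call with the same key would be answered by the first tier without running `validate` again.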

Implementation Details

Session Store with AES-256-GCM Encryption

The session store encrypts session data with AES-256-GCM, an authenticated cipher, so a stored value can be neither read nor silently modified without the key:

import { Redis } from 'ioredis';
import crypto from 'crypto';
import { MCPSession } from './types.js';
import { logger } from './logger.js';

export class SessionStore {
  private redis: Redis | null = null;
  private encryptionKey: Buffer | null = null;
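  // readonly keeps the literal type, so TypeScript selects the GCM overloads
  // of createCipheriv/createDecipheriv (getAuthTag/setAuthTag)
  private readonly algorithm = 'aes-256-gcm';
  // Default of 7 days; initialized here because the constructor can return early
  private readonly ttl: number = 604800;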
  private keyPrefix = 'mcp:session:';

  constructor() {
    // Only initialize for httpStream transport
    if (process.env.MCP_TRANSPORT !== 'httpStream') {
      logger.debug('SessionStore not initialized - stdio transport');
      return;
    }

    const redisUrl = process.env.REDIS_URL;
    const encryptionKeyHex = process.env.ENCRYPTION_KEY;

    if (!redisUrl || !encryptionKeyHex) {
      logger.warn('Session persistence disabled - missing configuration');
      return;
    }

    // Validate encryption key length
    if (encryptionKeyHex.length !== 64) {
      throw new Error('ENCRYPTION_KEY must be 32 bytes (64 hex characters)');
    }

    this.encryptionKey = Buffer.from(encryptionKeyHex, 'hex');
    this.ttl = parseInt(process.env.SESSION_CACHE_TTL || '604800', 10);

    // Configure Redis with production-ready settings
    this.redis = new Redis(redisUrl, {
      // Retry strategy with exponential backoff
      retryStrategy: (times) => {
        const delay = Math.min(times * 50, 2000);
        logger.debug(`Redis retry attempt ${times}, delay: ${delay}ms`);
        return delay;
      },
      // Reconnect on READONLY errors (Redis failover)
      reconnectOnError: (err) => {
        const shouldReconnect = err.message.includes('READONLY');
        if (shouldReconnect) {
          logger.warn('Redis READONLY error, reconnecting...');
        }
        return shouldReconnect;
      },
      // Connection settings
      connectTimeout: 10000,
      maxRetriesPerRequest: 3,
      enableReadyCheck: true,
      enableOfflineQueue: false, // Fail fast in production
    });

    // Monitor Redis connection health
    this.redis.on('connect', () => logger.info('Redis connected'));
    this.redis.on('ready', () => logger.info('Redis ready'));
    this.redis.on('error', (err) => logger.error({ err }, 'Redis error'));
    this.redis.on('close', () => logger.warn('Redis connection closed'));
  }

  /**
   * Check if session store is available
   */
  isAvailable(): boolean {
    return this.redis !== null &&
           this.redis.status === 'ready' &&
           this.encryptionKey !== null;
  }

  /**
   * Store encrypted session with automatic expiration
   */
  async storeSession(session: MCPSession): Promise<void> {
    if (!this.isAvailable()) {
      logger.debug('Session store unavailable, skipping storage');
      return;
    }

    try {
      // Generate unique IV for each encryption
      const iv = crypto.randomBytes(16);
      const cipher = crypto.createCipheriv(this.algorithm, this.encryptionKey!, iv);

      // Encrypt session data
      const sessionJson = JSON.stringify(session);
      const encrypted = Buffer.concat([
        cipher.update(sessionJson, 'utf8'),
        cipher.final()
      ]);

      // Get authentication tag for GCM
      const authTag = cipher.getAuthTag();

      // Combine components: IV (16) + AuthTag (16) + Encrypted Data
      const combined = Buffer.concat([iv, authTag, encrypted]);
      const encoded = combined.toString('base64');

      // Store with TTL
      const key = `${this.keyPrefix}${session.userId}`;
      await this.redis!.setex(key, this.ttl, encoded);

      logger.debug({
        userId: session.userId,
        keySize: encoded.length,
        ttl: this.ttl
      }, 'Session stored successfully');
    } catch (error) {
      logger.error({ error }, 'Failed to store session');
      // Don't throw - graceful degradation
    }
  }

  /**
   * Retrieve and decrypt session
   */
  async getSession(userId: string): Promise<MCPSession | null> {
    if (!this.isAvailable()) {
      return null;
    }

    const startTime = Date.now();
    try {
      const key = `${this.keyPrefix}${userId}`;
      const encoded = await this.redis!.get(key);

      if (!encoded) {
        logger.debug({ userId }, 'Session not found in cache');
        return null;
      }

      // Decode and extract components
      const combined = Buffer.from(encoded, 'base64');
      const iv = combined.subarray(0, 16);
      const authTag = combined.subarray(16, 32);
      const encrypted = combined.subarray(32);

      // Decrypt with authentication
      const decipher = crypto.createDecipheriv(this.algorithm, this.encryptionKey!, iv);
      decipher.setAuthTag(authTag);

      const decrypted = Buffer.concat([
        decipher.update(encrypted),
        decipher.final()
      ]);

      const session = JSON.parse(decrypted.toString('utf8')) as MCPSession;

      const elapsed = Date.now() - startTime;
      logger.debug({ userId, elapsed }, 'Session retrieved from cache');

      return session;
    } catch (error) {
      if (error instanceof Error && error.message.includes('Unsupported state or unable to authenticate data')) {
        logger.error({ userId }, 'Session decryption failed - possible tampering');
      } else {
        logger.error({ error, userId }, 'Failed to retrieve session');
      }
      return null;
    }
  }

  /**
   * Remove session (for logout)
   */
  async removeSession(userId: string): Promise<void> {
    if (!this.isAvailable()) return;

    try {
      const key = `${this.keyPrefix}${userId}`;
      await this.redis!.del(key);
      logger.debug({ userId }, 'Session removed');
    } catch (error) {
      logger.error({ error, userId }, 'Failed to remove session');
    }
  }

  /**
   * Clean shutdown
   */
  async close(): Promise<void> {
    if (this.redis) {
      await this.redis.quit();
      this.redis = null;
    }
  }
}

// Singleton instance
export const sessionStore = new SessionStore();
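The encryption layout is easier to reason about in isolation. The sketch below reproduces the same IV + auth tag + ciphertext format using only Node's built-in `crypto` module; `sealSession` and `openSession` are hypothetical names, and the key is generated on the spot where production would read `ENCRYPTION_KEY`:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const IV_LENGTH = 16; // same layout as the session store
const TAG_LENGTH = 16;

// Encrypts a string and returns base64(IV + auth tag + ciphertext).
function sealSession(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_LENGTH);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), encrypted]).toString('base64');
}

// Reverses sealSession; throws if the data or tag was modified.
function openSession(encoded: string, key: Buffer): string {
  const combined = Buffer.from(encoded, 'base64');
  const iv = combined.subarray(0, IV_LENGTH);
  const authTag = combined.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const encrypted = combined.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(encrypted), decipher.final()]).toString('utf8');
}

const demoKey = randomBytes(32); // 32 bytes = 64 hex characters
const sealed = sealSession(JSON.stringify({ userId: 'user-1' }), demoKey);
console.log(openSession(sealed, demoKey)); // {"userId":"user-1"}

// Flipping one byte of the ciphertext makes decryption fail.
const tampered = Buffer.from(sealed, 'base64');
tampered[tampered.length - 1] ^= 0xff;
try {
  openSession(tampered.toString('base64'), demoKey);
} catch {
  console.log('tampering detected');
}
```

A suitable key can be produced once with `randomBytes(32).toString('hex')` and stored in your secret manager.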

Optimized Authentication Middleware

The authentication middleware implements a fast-path/slow-path pattern:

import { IncomingMessage } from 'http';
import crypto from 'crypto';
import { decodeJwt, createRemoteJWKSet, jwtVerify, JWTPayload } from 'jose';
import { sessionStore } from './session-store.js';
import { logger } from './logger.js';
import { MCPSession } from './types.js';

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const requestId = crypto.randomUUID();
  const startTime = Date.now();

  logger.debug({
    requestId,
    method: request.method,
    url: request.url
  }, 'Authentication request started');

  try {
    // Extract bearer token
    const token = extractBearerToken(request);
    if (!token) {
      throw new UnauthorizedError('No bearer token provided');
    }

    // Fast path: Try to decode JWT for userId
    let userId: string | null = null;
    let tokenExp: number | null = null;

    try {
      const decoded = decodeJwt(token);
      userId = decoded.sub || null;
      tokenExp = decoded.exp || null;

      // Quick expiration check
      if (tokenExp && tokenExp < Date.now() / 1000) {
        logger.debug({ requestId, userId }, 'Token expired, skipping cache');
        userId = null; // Force validation
      }
    } catch (error) {
      logger.debug({ requestId }, 'Failed to decode JWT for cache lookup');
    }

    // Try cache if we have a userId
    if (userId && sessionStore.isAvailable()) {
      const cached = await sessionStore.getSession(userId);
      if (cached && cached.token === token) {
        const elapsed = Date.now() - startTime;
        logger.info({
          requestId,
          userId,
          elapsed,
          source: 'cache'
        }, 'Authentication successful (cached)');

        return cached;
      }
    }

    // Slow path: Full JWT validation
    logger.debug({ requestId }, 'Cache miss, performing JWT validation');
    const session = await validateJWTWithWorkOS(token);

    // Store for next time
    if (sessionStore.isAvailable()) {
      await sessionStore.storeSession(session);
    }

    const elapsed = Date.now() - startTime;
    logger.info({
      requestId,
      userId: session.userId,
      elapsed,
      source: 'jwt'
    }, 'Authentication successful (validated)');

    return session;
  } catch (error) {
    const elapsed = Date.now() - startTime;
    logger.error({
      requestId,
      error: error instanceof Error ? error.message : 'Unknown error',
      elapsed
    }, 'Authentication failed');

    // Return proper HTTP response for MCP
    if (error instanceof UnauthorizedError) {
      throw new Response(null, {
        status: 401,
        headers: {
          'WWW-Authenticate': `Bearer error="unauthorized", ` +
            `error_description="${error.message}", ` +
            `resource_metadata="${getResourceMetadataUrl()}"`
        }
      });
    }

    throw error;
  }
}

function extractBearerToken(request: IncomingMessage): string | null {
  const authHeader = request.headers.authorization;
  if (!authHeader) return null;

  const match = authHeader.match(/^Bearer (.+)$/);
  return match ? match[1] : null;
}

class UnauthorizedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'UnauthorizedError';
  }
}

function getResourceMetadataUrl(): string {
  const host = process.env.MCP_HOST || 'https://mcp.braingrid.ai';
  return `${host}/.well-known/oauth-protected-resource`;
}

// JWT validation with WorkOS
const jwksCache = new Map<string, ReturnType<typeof createRemoteJWKSet>>();

async function validateJWTWithWorkOS(token: string): Promise<MCPSession> {
  const issuer = process.env.WORKOS_ISSUER || 'https://auth.workos.com';

  try {
    // Get or create JWKS
    let jwks = jwksCache.get(issuer);
    if (!jwks) {
      jwks = createRemoteJWKSet(new URL(`${issuer}/oauth2/jwks`));
      jwksCache.set(issuer, jwks);
    }

    // Verify JWT with options
    const verifyOptions: any = {
      issuer,
      algorithms: ['RS256'],
    };

    // Only check audience if configured
    if (process.env.WORKOS_CLIENT_ID) {
      verifyOptions.audience = process.env.WORKOS_CLIENT_ID;
    }

    const { payload } = await jwtVerify(token, jwks, verifyOptions);

    // Validate required claims
    if (!payload.sub || !payload.email || !payload.org_id) {
      throw new Error('Missing required JWT claims');
    }

    // Create session from JWT claims
    return {
      userId: payload.sub,
      email: payload.email as string,
      organizationId: payload.org_id as string,
      scopes: Array.isArray(payload.scopes) ? payload.scopes : [],
      token,
    };
  } catch (error) {
    logger.error({ error: error instanceof Error ? error.message : 'Unknown error' }, 'JWT validation failed');
    throw error;
  }
}
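The fast path depends on reading claims from a token before its signature is checked. That is only safe because a cached session is returned solely when the stored token, which was fully verified earlier, matches the presented one exactly. A minimal sketch of the unverified read, using hypothetical helpers `peekClaims` and `isExpired` in place of jose's `decodeJwt`:

```typescript
import { Buffer } from 'node:buffer';

// Reads the claims of a JWT WITHOUT verifying its signature.
// Never trust the result on its own; use it only to pick a cache key.
function peekClaims(token: string): { sub?: string; exp?: number } | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  try {
    const json = Buffer.from(parts[1], 'base64url').toString('utf8');
    const claims = JSON.parse(json);
    return typeof claims === 'object' && claims !== null ? claims : null;
  } catch {
    return null; // payload was not valid JSON
  }
}

// exp is in seconds since the epoch; Date.now() is in milliseconds.
function isExpired(exp: number | undefined, nowMs: number = Date.now()): boolean {
  return exp !== undefined && exp < nowMs / 1000;
}

// Build an unsigned demo token to exercise the helpers.
const encodePart = (value: object) => Buffer.from(JSON.stringify(value)).toString('base64url');
const demoToken = [
  encodePart({ alg: 'none' }),
  encodePart({ sub: 'user-1', exp: 1_700_000_000 }),
  'signature',
].join('.');

console.log(peekClaims(demoToken)?.sub); // 'user-1'
console.log(isExpired(1_700_000_000, 1_800_000_000_000)); // true
```

An expired or malformed token skips the cache and goes straight to full validation, where it is rejected.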

Graceful Degradation

The implementation handles Redis failures by treating them as cache misses: the session store returns null and the request is re-authenticated through full JWT validation. This is intentional - in a serverless environment, there's no point in falling back to in-memory caching since each instance has its own memory. Better to fail fast and re-authenticate than to create inconsistent state.
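Treating a cache failure as a miss can be captured in one small wrapper. This is a sketch under assumptions: `cacheOrNull` is a hypothetical helper and the 200ms default timeout is an arbitrary illustrative value, not a tuned setting:

```typescript
// Hypothetical wrapper: any cache error or timeout is reported as a miss (null).
async function cacheOrNull<T>(
  read: () => Promise<T | null>,
  timeoutMs: number = 200,
): Promise<T | null> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });
  try {
    // Whichever settles first wins: the cache read or the timeout.
    return await Promise.race([read(), timeout]);
  } catch {
    return null; // the cache threw: behave as if nothing was cached
  } finally {
    clearTimeout(timer);
  }
}

cacheOrNull<string>(async () => {
  throw new Error('redis down');
}).then((result) => console.log(result)); // null
```

The caller then continues to the slow path exactly as it would for an ordinary miss.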

Production Deployment Strategies

Cloud Run Configuration

Create a comprehensive deployment configuration:

## Multi-stage build for optimization
FROM node:22-alpine AS builder

WORKDIR /app

## Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./

## Install dependencies
RUN npm install -g pnpm && pnpm install --frozen-lockfile

## Copy source code
COPY . .

## Build TypeScript
RUN pnpm run build

## Production stage
FROM node:22-alpine

WORKDIR /app

## Install production dependencies only
COPY package*.json ./
COPY pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --prod --frozen-lockfile

## Copy built application
COPY --from=builder /app/dist ./dist

## Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=httpStream

## Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:8080/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"

EXPOSE 8080

CMD ["node", "dist/server.js"]
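The health check above requests `/health` on port 8080, so the server has to answer it. If your framework does not already expose such a route, a handler can be as small as the sketch below. `handleHealth` is a hypothetical function, and the standalone server is only for illustration; how you mount the handler depends on your HTTP stack:

```typescript
import { createServer } from 'node:http';
import type { IncomingMessage, ServerResponse } from 'node:http';

// Answers GET /health with 200. Returns false for any other request
// so the caller can pass it on to the MCP endpoint.
function handleHealth(req: IncomingMessage, res: ServerResponse): boolean {
  if (req.method !== 'GET' || req.url !== '/health') return false;
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ status: 'ok' }));
  return true;
}

// Illustrative wiring only; the server is created but not started here.
const healthServer = createServer((req, res) => {
  if (!handleHealth(req, res)) {
    res.writeHead(404);
    res.end();
  }
});
// healthServer.listen(8080);
```

Keep the handler free of Redis or network calls so a slow dependency does not get the container restarted.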

Deploy with proper configuration:

#!/bin/bash
## deploy-cloud-run.sh

PROJECT_ID="your-project-id"
SERVICE_NAME="braingrid-mcp-server"
REGION="us-central1"
REDIS_URL="rediss://<your-redis-instance>"

## Build and push image
gcloud builds submit --tag gcr.io/${PROJECT_ID}/${SERVICE_NAME}

## Deploy to Cloud Run
gcloud run deploy ${SERVICE_NAME} \
  --image gcr.io/${PROJECT_ID}/${SERVICE_NAME} \
  --platform managed \
  --region ${REGION} \
  --allow-unauthenticated \
  --set-env-vars "MCP_TRANSPORT=httpStream" \
  --set-env-vars "BRAINGRID_ENV=production" \
  --set-env-vars "REDIS_URL=${REDIS_URL}" \
  --set-secrets "ENCRYPTION_KEY=mcp-encryption-key:latest" \
  --cpu 1 \
  --memory 512Mi \
  --min-instances 1 \
  --max-instances 100 \
  --concurrency 80 \
  --timeout 300

Vercel Configuration

For Vercel deployment, create vercel.json:

{
  "version": 2,
  "builds": [
    {
      "src": "dist/server.js",
      "use": "@vercel/node"
    }
  ],
  "routes": [
    {
      "src": "/health",
      "dest": "/dist/server.js"
    },
    {
      "src": "/mcp",
      "dest": "/dist/server.js"
    },
    {
      "src": "/.well-known/oauth-protected-resource",
      "dest": "/dist/server.js"
    }
  ],
  "env": {
    "MCP_TRANSPORT": "httpStream",
    "NODE_ENV": "production"
  }
}

Monitoring and Debugging

Structured Logging

Implement comprehensive logging for production debugging:

import pino from 'pino';

// Configure structured logging
export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'production' ? undefined : {
    target: 'pino-pretty',
    options: {
      colorize: true,
      translateTime: 'HH:MM:ss Z',
      ignore: 'pid,hostname'
    }
  },
  formatters: {
    level: (label) => {
      return { level: label };
    }
  },
  serializers: {
    req: (req) => ({
      method: req.method,
      url: req.url,
      headers: {
        ...req.headers,
        authorization: req.headers.authorization ? '[REDACTED]' : undefined
      }
    }),
    err: pino.stdSerializers.err
  }
});

// Request tracking middleware
import { IncomingMessage, ServerResponse } from 'http';

export function requestLogging() {
  return (req: IncomingMessage, res: ServerResponse, next: () => void) => {
    const start = Date.now();
    const requestId = crypto.randomUUID();

    // Attach to request
    (req as any).requestId = requestId;

    // Log request
    logger.info({
      requestId,
      req,
      type: 'request'
    }, 'Incoming request');

    // Log response
    res.on('finish', () => {
      const elapsed = Date.now() - start;
      logger.info({
        requestId,
        statusCode: res.statusCode,
        elapsed,
        type: 'response'
      }, 'Request completed');
    });

    next();
  };
}

Metrics Collection

For production deployments, export metrics to your observability platform:

// Example: Exporting MCP tool call metrics to DataDog
import { StatsD } from 'node-dogstatsd';

const dogstatsd = new StatsD({
  host: process.env.DD_AGENT_HOST || 'localhost',
  port: 8125,
  prefix: 'mcp.server.',
  tags: [`env:${process.env.BRAINGRID_ENV || 'development'}`]
});

// Track tool usage
export function recordToolCall(toolName: string, duration: number, success: boolean) {
  // Record timing metric
  dogstatsd.timing('tool.call.duration', duration, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);

  // Increment counter
  dogstatsd.increment('tool.call.count', 1, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);
}

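// In your tool implementation (description and parameters omitted for brevity):
server.addTool({
  name: 'create_requirement',
  execute: async (args, context) => {
    const startTime = Date.now();
    const apiClient = new BrainGridApiClient(config, context?.session);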
    try {
      const result = await apiClient.createRequirement(args);
      recordToolCall('create_requirement', Date.now() - startTime, true);
      return result;
    } catch (error) {
      recordToolCall('create_requirement', Date.now() - startTime, false);
      throw error;
    }
  }
});
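Repeating that try/catch in every tool gets tedious. One option is a higher-order function that wraps any async handler. This is a sketch with a hypothetical `withMetrics` helper; the recorder is passed in so the example runs without a metrics agent:

```typescript
type Recorder = (toolName: string, durationMs: number, success: boolean) => void;

// Wraps an async handler so every call reports its duration and outcome.
function withMetrics<Args extends unknown[], R>(
  toolName: string,
  record: Recorder,
  execute: (...args: Args) => Promise<R>,
): (...args: Args) => Promise<R> {
  return async (...args: Args) => {
    const startTime = Date.now();
    try {
      const result = await execute(...args);
      record(toolName, Date.now() - startTime, true);
      return result;
    } catch (error) {
      record(toolName, Date.now() - startTime, false);
      throw error; // the caller still sees the original failure
    }
  };
}

// Usage with a stand-in recorder that just logs.
const double = withMetrics(
  'double',
  (tool, _durationMs, ok) => console.log(tool, ok),
  async (n: number) => n * 2,
);
double(21).then((value) => console.log(value)); // logs "double true", then 42
```

In the setup above you would pass `recordToolCall` as the recorder and the wrapped function as the tool's `execute`.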

Performance Optimization

Connection Pooling

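Each ioredis client uses a single connection, so the "pool" here is one shared client that every instance creates lazily and reuses across requests:

// Shared Redis client for serverless
import { Redis } from 'ioredis';
import { logger } from './logger.js';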
export class RedisConnectionPool {
  private static instance: Redis | null = null;

  static getInstance(): Redis | null {
    if (!this.instance && process.env.REDIS_URL) {
      this.instance = new Redis(process.env.REDIS_URL, {
        // Connection pool settings
        maxRetriesPerRequest: 3,
        enableReadyCheck: true,
        lazyConnect: true, // Important for serverless

        // Serverless-optimized timeouts
        connectTimeout: 5000,
        commandTimeout: 5000,

        // Connection reuse
        keepAlive: 30000,
        noDelay: true,

        // Handle connection errors gracefully
        retryStrategy: (times) => {
          if (times > 3) return null; // Stop retrying
          return Math.min(times * 100, 3000);
        }
      });

      // Ensure connection is established
      this.instance.connect().catch((err: Error) => {
        logger.error({ err: err.message }, 'Redis connection failed');
        this.instance = null;
      });
    }

    return this.instance;
  }

  static async close(): Promise<void> {
    if (this.instance) {
      await this.instance.quit();
      this.instance = null;
    }
  }
}

Request Batching

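Despite the name, this is request coalescing rather than batching: concurrent requests carrying the same token share one in-flight validation instead of each starting their own: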

export class BatchedJWTValidator {
  private readonly pendingValidations = new Map<string, Promise<MCPSession>>();

  async validateToken(token: string): Promise<MCPSession> {
    // Check if validation is already in progress
    if (this.pendingValidations.has(token)) {
      logger.debug('Reusing pending validation');
      return this.pendingValidations.get(token)!;
    }

    // Start new validation
    const validationPromise = this.performValidation(token)
      .finally(() => {
        // Clean up after completion
        this.pendingValidations.delete(token);
      });

    this.pendingValidations.set(token, validationPromise);
    return validationPromise;
  }

  private async performValidation(token: string): Promise<MCPSession> {
    // Actual JWT validation logic
    return validateJWTWithWorkOS(token);
  }
}
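The effect is easiest to see with a counter. The sketch below is a generic version of the same idea, with a hypothetical `Coalescer` class and a fake validator so it runs on its own:

```typescript
// Shares one in-flight promise per key among concurrent callers.
class Coalescer<T> {
  private readonly pending = new Map<string, Promise<T>>();
  private readonly work: (key: string) => Promise<T>;

  constructor(work: (key: string) => Promise<T>) {
    this.work = work;
  }

  run(key: string): Promise<T> {
    const existing = this.pending.get(key);
    if (existing) return existing;

    // Remove the entry once the work settles, whether it succeeded or failed.
    const promise = this.work(key).finally(() => this.pending.delete(key));
    this.pending.set(key, promise);
    return promise;
  }
}

let validations = 0;
const coalescer = new Coalescer(async (token: string) => {
  validations += 1; // stands in for an expensive JWT validation
  return `session-for-${token}`;
});

Promise.all([coalescer.run('abc'), coalescer.run('abc'), coalescer.run('abc')]).then((results) => {
  console.log(results.length, validations); // 3 1
});
```

Note that this only deduplicates work within a single instance; requests that land on different instances still validate independently.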

Conclusion

Hosting MCP servers in serverless environments is challenging, but the patterns we've covered make it possible to build production-ready solutions that scale.

The key technical takeaways:

Session persistence is non-negotiable - Without Redis or similar external storage, your users face constant re-authentication
Security can't be an afterthought - Proper encryption (AES-256-GCM) and secure token handling are essential
Fast-path optimization matters - JWT validation is expensive; caching authenticated sessions dramatically improves performance
Graceful degradation over complex fallbacks - When Redis fails, force re-authentication rather than trying clever in-memory solutions
Observable systems are debuggable systems - Export metrics to DataDog or your platform of choice

By solving these challenges, we transformed our MCP server from a local development tool into infrastructure that our entire team relies on. The same patterns apply whether you're building tools for internal use or creating MCP servers for the broader community.

The future of development involves AI assistants that understand context and can take meaningful actions. Making that future accessible to teams - not just individual developers - requires solving the infrastructure challenges we've outlined here.

Originally published on the BrainGrid blog.

DEV Community