You've built an amazing MCP server. It works perfectly on your laptop. Your AI assistant can create Jira tickets, query your database, deploy to production - life is good. Then your teammate asks: "Hey, can I use this too?"
Even better, you want to ship your MCP server as a product for your customers. You now need to support multiple tenants, each with their own API keys and authentication.
Suddenly, you're in hell.
The Problem Nobody Talks About
Here's what happens when you try to share your MCP server with your customers:
Option 1: The "Just Install It" Approach
## Your instructions to teammates:
1. Clone the repo
2. Install dependencies
3. Set up your API keys
4. Configure your environment
5. Run the server locally
6. Oh, and update these keys when they expire...
7. And don't forget to pull the latest changes...
8. BTW, it might conflict with your other Node versions...
Result: 3 hours later, half your customers have given up and the other half are debugging npm issues.
Option 2: The "Let's Host It" Nightmare
You deploy to a serverless platform like Cloud Run or Vercel. Five minutes later:
customer: "It's asking me to authenticate... again"
you: "Yeah, just refresh and login again"
customer: "I just did. It's asking again."
you: "Oh, that's because Cloud Run scales to zero and..."
customer: "I don't care why. I just want to create a ticket."
The core issue? Serverless platforms don't do sessions. Every request could hit a different instance. Your carefully crafted auth flow becomes a game of authentication whack-a-mole.
Why This Matters More Than You Think
This isn't just an annoyance. It's the difference between:
- A tool only you use vs A tool your entire customer base adopts
- "Cool prototype" vs "Critical infrastructure"
- Weekend project vs Production-ready product customers actually use
We learned this the hard way at BrainGrid. Our MCP server transformed how our team worked with AI - but only after we solved the authentication puzzle were we ready to ship it to our customers.
What You'll Learn
This guide shows you exactly how we transformed our MCP server from a local development tool into a production-ready service that:
- Authenticates once, works everywhere - No more login fatigue
- Scales from 1 to 1000 users - Same performance whether it's just you or the whole company
- Costs pennies to run - Efficient caching means minimal cloud costs
- Works with existing auth - Integrates with WorkOS, Auth0, or any OAuth provider
- Deploys in minutes - One command to go from local to remote
We'll cover the exact architecture, the gotchas we discovered, and the code that makes it all work. No theory, no fluff - just battle-tested solutions from our production deployment serving hundreds of developers.
Ready to make your MCP server something your customers will actually want to use? Let's dive in.
How we got there
- Initial Setup: From Local to Remote
- The Serverless Challenge
- Technical Solution: Redis Session Store
- Production Deployment Strategies
- Monitoring and Debugging
- Performance Optimization
- Conclusion
Initial Setup: From Local to Remote
Step 1: Basic MCP Server Configuration
Start with a standard MCP server setup using FastMCP. The key is understanding the dual nature of MCP servers - they need to work both locally for development and remotely for customers to use.
import { FastMCP } from 'fastmcp';
import { z } from 'zod';
// Define your tool schemas
const CreateRequirementSchema = z.object({
message: z.string().describe("The requirement description"),
repositories: z.string().optional().describe("Comma-separated list of repos")
});
const server = new FastMCP({
name: 'braingrid-server',
version: '1.0.0'
});
// Add your tools
server.addTool({
name: 'create_requirement',
description: 'Create a new requirement in BrainGrid',
parameters: CreateRequirementSchema,
execute: async (args, context) => {
// Tool implementation
// Note: context.session contains user auth info when hosted
const apiClient = new BrainGridApiClient(config, context?.session);
return await apiClient.createRequirement(args);
}
});
// Local development (stdio transport)
await server.start({ transportType: 'stdio' });
Step 2: Switching to httpStream for Remote Hosting
To deploy on Cloud Run or Vercel, switch to httpStream transport. This requires careful consideration of how your tools will handle authentication:
// Detect transport type from environment
const transportType = process.env.MCP_TRANSPORT || 'stdio';
// httpStream configuration for serverless
if (transportType === 'httpStream') {
await server.start({
transportType: 'httpStream',
httpStream: {
port: parseInt(process.env.PORT || '8080', 10),
endpoint: '/mcp'
}
});
} else {
// Local stdio transport
await server.start({ transportType: 'stdio' });
}
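The environment switch above can also be factored into a small pure function, which makes the transport choice easy to unit test. This is a minimal sketch: `resolveTransport` and `TransportConfig` are hypothetical names that are not part of FastMCP, and the result would still need to be passed to `server.start`.

```typescript
// Hypothetical helper: derives the transport settings from environment variables.
type TransportConfig =
  | { transportType: 'stdio' }
  | { transportType: 'httpStream'; httpStream: { port: number; endpoint: string } };

function resolveTransport(env: Record<string, string | undefined>): TransportConfig {
  if (env.MCP_TRANSPORT !== 'httpStream') {
    // Anything other than an explicit 'httpStream' falls back to local stdio.
    return { transportType: 'stdio' };
  }
  const port = Number.parseInt(env.PORT ?? '8080', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${env.PORT}`);
  }
  return { transportType: 'httpStream', httpStream: { port, endpoint: '/mcp' } };
}

console.log(resolveTransport({ MCP_TRANSPORT: 'httpStream', PORT: '9090' }));
```

Rejecting a malformed `PORT` at startup is a design choice in this sketch: a misconfigured deployment fails immediately rather than listening on an unexpected port.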
Step 3: Implementing OAuth with WorkOS
MCP requires specific OAuth implementation patterns. The key insight is that MCP clients expect a particular discovery flow:
const serverOptions = {
name: 'braingrid-server',
version: '1.0.0',
authenticate: authenticateRequest,
oauth: {
enabled: true,
protectedResource: {
resource: 'https://mcp.braingrid.ai',
authorizationServers: ['https://auth.workos.com'],
bearerMethodsSupported: ['header'],
},
// This is crucial for MCP client compatibility
authorizationServer: {
issuer: 'https://auth.workos.com',
authorizationEndpoint: 'https://auth.workos.com/oauth2/authorize',
tokenEndpoint: 'https://auth.workos.com/oauth2/token',
jwksUri: 'https://auth.workos.com/oauth2/jwks', // Note: Not /.well-known/jwks.json
responseTypesSupported: ['code'],
grantTypesSupported: ['authorization_code', 'refresh_token'],
codeChallengeMethodsSupported: ['S256'],
tokenEndpointAuthMethodsSupported: ['none'],
scopesSupported: ['email', 'offline_access', 'openid', 'profile'],
}
}
};
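The discovery flow starts with the protected resource metadata document that clients fetch from `/.well-known/oauth-protected-resource`. The sketch below shows roughly what that JSON looks like for the configuration above, using the snake_case field names from RFC 9728. `toProtectedResourceMetadata` is a hypothetical helper for illustration; how your framework serializes its options may differ.

```typescript
// Hypothetical helper: maps the camelCase config to the snake_case JSON
// field names defined by RFC 9728 (OAuth 2.0 Protected Resource Metadata).
interface ProtectedResourceConfig {
  resource: string;
  authorizationServers: string[];
  bearerMethodsSupported: string[];
}

function toProtectedResourceMetadata(config: ProtectedResourceConfig): Record<string, unknown> {
  return {
    resource: config.resource,
    authorization_servers: config.authorizationServers,
    bearer_methods_supported: config.bearerMethodsSupported,
  };
}

const metadata = toProtectedResourceMetadata({
  resource: 'https://mcp.braingrid.ai',
  authorizationServers: ['https://auth.workos.com'],
  bearerMethodsSupported: ['header'],
});
console.log(JSON.stringify(metadata, null, 2));
```

A client that receives a 401 reads this document to learn which authorization server to use, then continues with that server's own metadata.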
Key implementation detail: The WWW-Authenticate header must be properly formatted for MCP clients:
// MCP session structure - what gets passed to your tools
interface MCPSession {
userId: string;
email: string;
organizationId: string;
scopes: string[];
token: string;
}
export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
const authHeader = request.headers.authorization;
if (!authHeader) {
// MCP clients expect this specific format
throw new Response(null, {
status: 401,
headers: {
'WWW-Authenticate': 'Bearer error="unauthorized", ' +
'error_description="Authorization needed", ' +
'resource_metadata="https://mcp.braingrid.ai/.well-known/oauth-protected-resource"'
}
});
}
// Extract bearer token
const bearerMatch = authHeader.match(/^Bearer (.+)$/);
if (!bearerMatch) {
throw new Response(null, {
status: 401,
headers: {
'WWW-Authenticate': 'Bearer error="invalid_token", ' +
'error_description="Invalid authorization header format"'
}
});
}
const token = bearerMatch[1];
// Validate JWT and return session
return await validateAndCreateSession(token);
}
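Because the challenge header is assembled by string concatenation in two places, it can help to build it in one spot. The following is a minimal sketch, assuming you want quoted auth-params with embedded quotes escaped; `buildWwwAuthenticate` and `BearerChallenge` are hypothetical names, not part of any MCP library.

```typescript
// Hypothetical helper: builds a Bearer challenge with quoted auth-params.
interface BearerChallenge {
  error?: string;
  errorDescription?: string;
  resourceMetadata?: string;
}

// Escape backslashes and double quotes so the value stays a valid quoted-string.
function quoteParam(value: string): string {
  return `"${value.replace(/(["\\])/g, '\\$1')}"`;
}

function buildWwwAuthenticate(challenge: BearerChallenge): string {
  const params: string[] = [];
  if (challenge.error) params.push(`error=${quoteParam(challenge.error)}`);
  if (challenge.errorDescription) {
    params.push(`error_description=${quoteParam(challenge.errorDescription)}`);
  }
  if (challenge.resourceMetadata) {
    params.push(`resource_metadata=${quoteParam(challenge.resourceMetadata)}`);
  }
  return params.length > 0 ? `Bearer ${params.join(', ')}` : 'Bearer';
}

console.log(buildWwwAuthenticate({
  error: 'invalid_token',
  errorDescription: 'Invalid authorization header format',
  resourceMetadata: 'https://mcp.braingrid.ai/.well-known/oauth-protected-resource',
}));
```

Escaping matters most when the description comes from an error message, since an unescaped quote would otherwise break the header for the client parsing it.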
Step 4: Handling Dual Transport Modes
Your MCP server needs to support both local and remote authentication patterns:
export class BrainGridApiClient {
private auth?: AuthHandler;
private session?: MCPSession;
private readonly config: { apiUrl: string; organizationId?: string };
constructor(config: { apiUrl: string; organizationId?: string }, session?: MCPSession) {
this.config = config;
this.session = session;
// Only create AuthHandler for local mode
if (!session) {
this.auth = new AuthHandler(config);
}
}
private async getHeaders(): Promise<Record<string, string>> {
if (this.session) {
// Remote mode - use session token
return {
'Authorization': `Bearer ${this.session.token}`,
'X-Organization-Id': this.session.organizationId,
'Content-Type': 'application/json',
};
} else if (this.auth) {
// Local mode - use stored auth
return this.auth.getOrganizationHeaders();
}
throw new Error('No authentication method available');
}
}
The Serverless Challenge
Serverless platforms like Cloud Run and Vercel share fundamental characteristics that create unique challenges for stateful applications:
1. Instance Lifecycle Management
Serverless instances have unpredictable lifecycles:
- Cold starts: New instances spin up on demand
- Scale to zero: Instances terminate after inactivity
- Horizontal scaling: Multiple instances serve concurrent requests
- No sticky sessions: Requests can hit any instance
This creates specific challenges for MCP servers:
// This approach fails in serverless:
class NaiveMCPServer {
private sessions = new Map<string, MCPSession>(); // ❌ Lost on instance restart
async authenticate(token: string): Promise<MCPSession> {
// Check memory cache
if (this.sessions.has(token)) {
return this.sessions.get(token)!;
}
// Validate and cache
const session = await validateJWT(token);
this.sessions.set(token, session); // ❌ Only exists on this instance
return session;
}
}
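To make the failure mode concrete, here is a small simulation in which each object stands in for one serverless instance. It is only an illustration of per-process memory; `InstanceMemory` is a hypothetical name and no real platform API is involved.

```typescript
// Each object stands in for one serverless instance with its own process memory.
class InstanceMemory {
  private readonly sessions = new Map<string, { userId: string }>();

  remember(token: string, userId: string): void {
    this.sessions.set(token, { userId });
  }

  lookup(token: string): { userId: string } | undefined {
    return this.sessions.get(token);
  }
}

const instanceA = new InstanceMemory();
const instanceB = new InstanceMemory(); // e.g. started by a scale-out or after a cold start

instanceA.remember('token-123', 'user_1'); // the first request authenticates on instance A

console.log(instanceA.lookup('token-123')); // { userId: 'user_1' }
console.log(instanceB.lookup('token-123')); // undefined: instance B never saw the token
```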
2. JWT Validation Overhead
Without session persistence, your MCP server performs full JWT validation on every request:
async function validateJWT(token: string): Promise<MCPSession> {
// Step 1: Fetch JWKS (Network call ~50ms)
const jwks = await fetchJWKS('https://auth.workos.com/oauth2/jwks');
// Step 2: Verify signature (CPU intensive ~10ms)
const verified = await jose.jwtVerify(token, jwks);
// Step 3: Check claims (CPU ~5ms)
if (verified.payload.iss !== 'https://auth.workos.com') {
throw new Error('Invalid issuer');
}
// Step 4: Extract session data
return {
userId: verified.payload.sub,
email: verified.payload.email,
organizationId: verified.payload.org_id,
scopes: verified.payload.scopes,
token: token
};
}
This adds 50-100ms to every request and increases costs significantly.
3. Re-authentication Fatigue
The user experience without session persistence:
Timeline of a frustrated developer:
0:00 - Connect to MCP server ✓
0:01 - Authenticate via WorkOS ✓
0:02 - Create requirement ✓
0:05 - (Cloud Run scales instance to zero)
0:10 - Try to update task ✗ "Please authenticate again"
0:11 - Re-authenticate 😤
0:12 - Update task ✓
0:15 - (New instance due to load)
0:16 - Try to commit ✗ "Please authenticate again"
0:17 - Rage quit
Technical Solution: Redis Session Store with Encryption
Architecture Overview
The solution implements a multi-tier caching strategy with security at its core:
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Request   │────▶│   Memory    │────▶│    Redis    │
│             │     │    Cache    │     │    Cache    │
└─────────────┘     └─────────────┘     └─────────────┘
                           │                   │
                           ▼                   ▼
                    ┌─────────────┐     ┌─────────────┐
                    │     JWT     │     │     JWT     │
                    │ Validation  │     │ Validation  │
                    └─────────────┘     └─────────────┘
Implementation Details
Session Store with AES-256-GCM Encryption
The session store encrypts sensitive session data with AES-256-GCM, an authenticated encryption mode that also detects tampering:
import { Redis } from 'ioredis';
import crypto from 'crypto';
import { MCPSession } from './types.js';
import { logger } from './logger.js';
export class SessionStore {
private redis: Redis | null = null;
private encryptionKey: Buffer | null = null;
private readonly algorithm = 'aes-256-gcm'; // readonly keeps the literal type, so createCipheriv returns a GCM cipher
private readonly ttl: number = 604800; // default: 7 days, overridden by SESSION_CACHE_TTL
private keyPrefix = 'mcp:session:';
constructor() {
// Only initialize for httpStream transport
if (process.env.MCP_TRANSPORT !== 'httpStream') {
logger.debug('SessionStore not initialized - stdio transport');
return;
}
const redisUrl = process.env.REDIS_URL;
const encryptionKeyHex = process.env.ENCRYPTION_KEY;
if (!redisUrl || !encryptionKeyHex) {
logger.warn('Session persistence disabled - missing configuration');
return;
}
// Validate encryption key length
if (encryptionKeyHex.length !== 64) {
throw new Error('ENCRYPTION_KEY must be 32 bytes (64 hex characters)');
}
this.encryptionKey = Buffer.from(encryptionKeyHex, 'hex');
this.ttl = parseInt(process.env.SESSION_CACHE_TTL || '604800', 10);
// Configure Redis with production-ready settings
this.redis = new Redis(redisUrl, {
// Retry strategy with linear backoff, capped at 2 seconds
retryStrategy: (times) => {
const delay = Math.min(times * 50, 2000);
logger.debug(`Redis retry attempt ${times}, delay: ${delay}ms`);
return delay;
},
// Reconnect on READONLY errors (Redis failover)
reconnectOnError: (err) => {
const shouldReconnect = err.message.includes('READONLY');
if (shouldReconnect) {
logger.warn('Redis READONLY error, reconnecting...');
}
return shouldReconnect;
},
// Connection settings
connectTimeout: 10000,
maxRetriesPerRequest: 3,
enableReadyCheck: true,
enableOfflineQueue: false, // Fail fast in production
});
// Monitor Redis connection health
this.redis.on('connect', () => logger.info('Redis connected'));
this.redis.on('ready', () => logger.info('Redis ready'));
this.redis.on('error', (err) => logger.error({ err }, 'Redis error'));
this.redis.on('close', () => logger.warn('Redis connection closed'));
}
/**
* Check if session store is available
*/
isAvailable(): boolean {
return this.redis !== null &&
this.redis.status === 'ready' &&
this.encryptionKey !== null;
}
/**
* Store encrypted session with automatic expiration
*/
async storeSession(session: MCPSession): Promise<void> {
if (!this.isAvailable()) {
logger.debug('Session store unavailable, skipping storage');
return;
}
try {
// Generate unique IV for each encryption
const iv = crypto.randomBytes(16);
const cipher = crypto.createCipheriv(this.algorithm, this.encryptionKey!, iv);
// Encrypt session data
const sessionJson = JSON.stringify(session);
const encrypted = Buffer.concat([
cipher.update(sessionJson, 'utf8'),
cipher.final()
]);
// Get authentication tag for GCM
const authTag = cipher.getAuthTag();
// Combine components: IV (16) + AuthTag (16) + Encrypted Data
const combined = Buffer.concat([iv, authTag, encrypted]);
const encoded = combined.toString('base64');
// Store with TTL
const key = `${this.keyPrefix}${session.userId}`;
await this.redis!.setex(key, this.ttl, encoded);
logger.debug({
userId: session.userId,
keySize: encoded.length,
ttl: this.ttl
}, 'Session stored successfully');
} catch (error) {
logger.error({ error }, 'Failed to store session');
// Don't throw - graceful degradation
}
}
/**
* Retrieve and decrypt session
*/
async getSession(userId: string): Promise<MCPSession | null> {
if (!this.isAvailable()) {
return null;
}
const startTime = Date.now();
try {
const key = `${this.keyPrefix}${userId}`;
const encoded = await this.redis!.get(key);
if (!encoded) {
logger.debug({ userId }, 'Session not found in cache');
return null;
}
// Decode and extract components
const combined = Buffer.from(encoded, 'base64');
const iv = combined.subarray(0, 16);
const authTag = combined.subarray(16, 32);
const encrypted = combined.subarray(32);
// Decrypt with authentication
const decipher = crypto.createDecipheriv(this.algorithm, this.encryptionKey!, iv);
decipher.setAuthTag(authTag);
const decrypted = Buffer.concat([
decipher.update(encrypted),
decipher.final()
]);
const session = JSON.parse(decrypted.toString('utf8')) as MCPSession;
const elapsed = Date.now() - startTime;
logger.debug({ userId, elapsed }, 'Session retrieved from cache');
return session;
} catch (error) {
if (error instanceof Error && error.message.includes('Unsupported state or unable to authenticate data')) {
logger.error({ userId }, 'Session decryption failed - possible tampering');
} else {
logger.error({ error, userId }, 'Failed to retrieve session');
}
return null;
}
}
/**
* Remove session (for logout)
*/
async removeSession(userId: string): Promise<void> {
if (!this.isAvailable()) return;
try {
const key = `${this.keyPrefix}${userId}`;
await this.redis!.del(key);
logger.debug({ userId }, 'Session removed');
} catch (error) {
logger.error({ error, userId }, 'Failed to remove session');
}
}
/**
* Clean shutdown
*/
async close(): Promise<void> {
if (this.redis) {
await this.redis.quit();
this.redis = null;
}
}
}
// Singleton instance
export const sessionStore = new SessionStore();
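The encryption layout used by the store can be tried in isolation, without Redis. The sketch below uses only Node's built-in `crypto` module and mirrors the `IV (16) + AuthTag (16) + Encrypted Data` layout described above; `encryptSession` and `decryptSession` are hypothetical names for this example. It also shows one way to generate a key in the 64-hex-character format that `ENCRYPTION_KEY` expects.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const IV_LENGTH = 16;  // matches the layout used by the session store above
const TAG_LENGTH = 16; // GCM authentication tag length in bytes

// Encrypts to base64(IV || authTag || ciphertext).
function encryptSession(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_LENGTH);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

// Reverses encryptSession(); throws if the key is wrong or the payload was modified.
function decryptSession(encoded: string, key: Buffer): string {
  const combined = Buffer.from(encoded, 'base64');
  const iv = combined.subarray(0, IV_LENGTH);
  const authTag = combined.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = combined.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}

// A 32-byte key is 64 hex characters.
const keyHex = randomBytes(32).toString('hex');
const key = Buffer.from(keyHex, 'hex');

const sealed = encryptSession(JSON.stringify({ userId: 'user_1' }), key);
console.log(decryptSession(sealed, key)); // {"userId":"user_1"}

// Flipping one bit of the ciphertext makes decryption fail the tag check.
const tampered = Buffer.from(sealed, 'base64');
tampered[tampered.length - 1] ^= 0x01;
try {
  decryptSession(tampered.toString('base64'), key);
} catch {
  console.log('tampering detected');
}
```

The 16-byte IV follows the store's existing layout. A 12-byte IV is the size usually recommended for GCM, so if you change it, update the offsets on both the write and read paths together.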
Optimized Authentication Middleware
The authentication middleware implements a fast-path/slow-path pattern:
import { IncomingMessage } from 'http';
import crypto from 'crypto';
import { decodeJwt, createRemoteJWKSet, jwtVerify, JWTPayload } from 'jose';
import { sessionStore } from './session-store.js';
import { logger } from './logger.js';
import { MCPSession } from './types.js';
export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
const requestId = crypto.randomUUID();
const startTime = Date.now();
logger.debug({
requestId,
method: request.method,
url: request.url
}, 'Authentication request started');
try {
// Extract bearer token
const token = extractBearerToken(request);
if (!token) {
throw new UnauthorizedError('No bearer token provided');
}
// Fast path: Try to decode JWT for userId
let userId: string | null = null;
let tokenExp: number | null = null;
try {
const decoded = decodeJwt(token);
userId = decoded.sub || null;
tokenExp = decoded.exp || null;
// Quick expiration check
if (tokenExp && tokenExp < Date.now() / 1000) {
logger.debug({ requestId, userId }, 'Token expired, skipping cache');
userId = null; // Force validation
}
} catch (error) {
logger.debug({ requestId }, 'Failed to decode JWT for cache lookup');
}
// Try cache if we have a userId
if (userId && sessionStore.isAvailable()) {
const cached = await sessionStore.getSession(userId);
if (cached && cached.token === token) {
const elapsed = Date.now() - startTime;
logger.info({
requestId,
userId,
elapsed,
source: 'cache'
}, 'Authentication successful (cached)');
return cached;
}
}
// Slow path: Full JWT validation
logger.debug({ requestId }, 'Cache miss, performing JWT validation');
const session = await validateJWTWithWorkOS(token);
// Store for next time
if (sessionStore.isAvailable()) {
await sessionStore.storeSession(session);
}
const elapsed = Date.now() - startTime;
logger.info({
requestId,
userId: session.userId,
elapsed,
source: 'jwt'
}, 'Authentication successful (validated)');
return session;
} catch (error) {
const elapsed = Date.now() - startTime;
logger.error({
requestId,
error: error instanceof Error ? error.message : 'Unknown error',
elapsed
}, 'Authentication failed');
// Return proper HTTP response for MCP
if (error instanceof UnauthorizedError) {
throw new Response(null, {
status: 401,
headers: {
'WWW-Authenticate': `Bearer error="unauthorized", ` +
`error_description="${error.message}", ` +
`resource_metadata="${getResourceMetadataUrl()}"`
}
});
}
throw error;
}
}
function extractBearerToken(request: IncomingMessage): string | null {
const authHeader = request.headers.authorization;
if (!authHeader) return null;
const match = authHeader.match(/^Bearer (.+)$/);
return match ? match[1] : null;
}
class UnauthorizedError extends Error {
constructor(message: string) {
super(message);
this.name = 'UnauthorizedError';
}
}
function getResourceMetadataUrl(): string {
const host = process.env.MCP_HOST || 'https://mcp.braingrid.ai';
return `${host}/.well-known/oauth-protected-resource`;
}
// JWT validation with WorkOS
const jwksCache = new Map<string, ReturnType<typeof createRemoteJWKSet>>();
async function validateJWTWithWorkOS(token: string): Promise<MCPSession> {
const issuer = process.env.WORKOS_ISSUER || 'https://auth.workos.com';
try {
// Get or create JWKS
let jwks = jwksCache.get(issuer);
if (!jwks) {
jwks = createRemoteJWKSet(new URL(`${issuer}/oauth2/jwks`));
jwksCache.set(issuer, jwks);
}
// Verify JWT with options
const verifyOptions: any = {
issuer,
algorithms: ['RS256'],
};
// Only check audience if configured
if (process.env.WORKOS_CLIENT_ID) {
verifyOptions.audience = process.env.WORKOS_CLIENT_ID;
}
const { payload } = await jwtVerify(token, jwks, verifyOptions);
// Validate required claims
if (!payload.sub || !payload.email || !payload.org_id) {
throw new Error('Missing required JWT claims');
}
// Create session from JWT claims
return {
userId: payload.sub,
email: payload.email as string,
organizationId: payload.org_id as string,
scopes: Array.isArray(payload.scopes) ? payload.scopes : [],
token,
};
} catch (error) {
logger.error({ error: error instanceof Error ? error.message : 'Unknown error' }, 'JWT validation failed');
throw error;
}
}
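The fast path depends on reading the `sub` and `exp` claims before any signature check. As a rough illustration of what that decode step involves (this is not jose's implementation), the sketch below parses the payload segment by hand. `peekClaims` and `isExpired` are hypothetical helpers, and the demo token is unsigned and exists only for this example.

```typescript
// Hypothetical helper: reads JWT claims WITHOUT verifying the signature.
// Only suitable for choosing a cache key, never for trusting the claims.
interface UnverifiedClaims {
  sub?: string;
  exp?: number;
}

function peekClaims(token: string): UnverifiedClaims | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  try {
    const json = Buffer.from(parts[1], 'base64url').toString('utf8');
    const claims: unknown = JSON.parse(json);
    return typeof claims === 'object' && claims !== null ? (claims as UnverifiedClaims) : null;
  } catch {
    return null; // payload was not valid JSON
  }
}

// exp is in seconds since the epoch; nowMs is in milliseconds.
function isExpired(claims: UnverifiedClaims, nowMs: number = Date.now()): boolean {
  return typeof claims.exp === 'number' && claims.exp < nowMs / 1000;
}

// Build an unsigned demo token (header.payload.signature) for illustration.
const encodeSegment = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString('base64url');
const demoToken = `${encodeSegment({ alg: 'none' })}.${encodeSegment({ sub: 'user_1', exp: 2000000000 })}.sig`;

console.log(peekClaims(demoToken));
console.log(isExpired({ exp: 1 })); // true
```

Using unverified claims here is acceptable only because the middleware returns a cached session when its stored token is identical to the presented one, and that stored token went through full verification before it was cached.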
Graceful Degradation
The implementation handles Redis failures gracefully: the session store returns null, the request falls through to full JWT validation, and a user whose token can no longer be validated simply re-authenticates. This is intentional - in a serverless environment, there's no point in falling back to in-memory caching since each instance has its own memory. Better to fail fast and have the user re-authenticate than to create inconsistent state.
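In the middleware shown earlier, a null from the session store sends the request down the slow path. The same "treat cache failures as cache misses" idea can be written as a small generic wrapper. This is a sketch under that assumption; `readThrough` is a hypothetical name.

```typescript
// Hypothetical wrapper: any cache failure is treated as a cache miss.
async function readThrough<T>(
  readCache: () => Promise<T | null>,
  loadFresh: () => Promise<T>,
): Promise<{ value: T; source: 'cache' | 'fresh' }> {
  try {
    const cached = await readCache();
    if (cached !== null) return { value: cached, source: 'cache' };
  } catch {
    // Cache outage: fall through instead of failing the request.
  }
  return { value: await loadFresh(), source: 'fresh' };
}

// Simulated outage: the cache read rejects, so the slow path runs.
readThrough<string>(
  async () => { throw new Error('ECONNREFUSED'); },
  async () => 'validated-session',
).then((result) => console.log(result));
```

Errors from `loadFresh` are deliberately not caught, so a failed validation still surfaces to the caller.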
Production Deployment Strategies
Cloud Run Configuration
Create a comprehensive deployment configuration:
## Multi-stage build for optimization
FROM node:22-alpine AS builder
WORKDIR /app
## Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./
## Install dependencies
RUN npm install -g pnpm && pnpm install --frozen-lockfile
## Copy source code
COPY . .
## Build TypeScript
RUN pnpm run build
## Production stage
FROM node:22-alpine
WORKDIR /app
## Install production dependencies only
COPY package*.json ./
COPY pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --prod --frozen-lockfile
## Copy built application
COPY --from=builder /app/dist ./dist
## Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=httpStream
## Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD node -e "require('http').get('http://localhost:8080/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"
EXPOSE 8080
CMD ["node", "dist/server.js"]
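The health check above polls `/health` on port 8080, so something in the server has to answer that path. If your framework does not already expose one, a handler could look roughly like the sketch below, which uses only Node's built-in `http` module. `healthResponse` and `healthServer` are hypothetical names, and the response body is an assumption.

```typescript
import { createServer } from 'node:http';

interface SimpleResponse {
  status: number;
  body: string;
}

// Hypothetical: decides the response for the path the HEALTHCHECK polls.
function healthResponse(method: string | undefined, url: string | undefined): SimpleResponse | null {
  if (method !== 'GET' || url !== '/health') return null;
  return { status: 200, body: JSON.stringify({ status: 'ok' }) };
}

// Wiring into a plain Node HTTP server; call healthServer.listen(8080) to serve.
const healthServer = createServer((req, res) => {
  const response = healthResponse(req.method, req.url);
  if (!response) {
    res.writeHead(404).end();
    return;
  }
  res.writeHead(response.status, { 'Content-Type': 'application/json' }).end(response.body);
});
```

Keeping the check free of Redis or upstream calls is a design choice here: the probe then reports only whether the process is up, and an outage in a dependency does not trigger restarts.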
Deploy with proper configuration:
#!/bin/bash
## deploy-cloud-run.sh
PROJECT_ID="your-project-id"
SERVICE_NAME="braingrid-mcp-server"
REGION="us-central1"
REDIS_URL="rediss://<your-redis-instance>"
## Build and push image
gcloud builds submit --tag gcr.io/${PROJECT_ID}/${SERVICE_NAME}
## Deploy to Cloud Run
gcloud run deploy ${SERVICE_NAME} \
--image gcr.io/${PROJECT_ID}/${SERVICE_NAME} \
--platform managed \
--region ${REGION} \
--allow-unauthenticated \
--set-env-vars "MCP_TRANSPORT=httpStream" \
--set-env-vars "BRAINGRID_ENV=production" \
--set-env-vars "REDIS_URL=${REDIS_URL}" \
--set-secrets "ENCRYPTION_KEY=mcp-encryption-key:latest" \
--cpu 1 \
--memory 512Mi \
--min-instances 1 \
--max-instances 100 \
--concurrency 80 \
--timeout 300
Vercel Configuration
For Vercel deployment, create vercel.json:
{
"version": 2,
"builds": [
{
"src": "dist/server.js",
"use": "@vercel/node"
}
],
"routes": [
{
"src": "/health",
"dest": "/dist/server.js"
},
{
"src": "/mcp",
"dest": "/dist/server.js"
},
{
"src": "/.well-known/oauth-protected-resource",
"dest": "/dist/server.js"
}
],
"env": {
"MCP_TRANSPORT": "httpStream",
"NODE_ENV": "production"
}
}
Monitoring and Debugging
Structured Logging
Implement comprehensive logging for production debugging:
import pino from 'pino';
// Configure structured logging
export const logger = pino({
level: process.env.LOG_LEVEL || 'info',
transport: process.env.NODE_ENV === 'production' ? undefined : {
target: 'pino-pretty',
options: {
colorize: true,
translateTime: 'HH:MM:ss Z',
ignore: 'pid,hostname'
}
},
formatters: {
level: (label) => {
return { level: label };
}
},
serializers: {
req: (req) => ({
method: req.method,
url: req.url,
headers: {
...req.headers,
authorization: req.headers.authorization ? '[REDACTED]' : undefined
}
}),
err: pino.stdSerializers.err
}
});
// Request tracking middleware
import { IncomingMessage, ServerResponse } from 'http';
export function requestLogging() {
return (req: IncomingMessage, res: ServerResponse, next: () => void) => {
const start = Date.now();
const requestId = crypto.randomUUID();
// Attach to request
(req as any).requestId = requestId;
// Log request
logger.info({
requestId,
req,
type: 'request'
}, 'Incoming request');
// Log response
res.on('finish', () => {
const elapsed = Date.now() - start;
logger.info({
requestId,
statusCode: res.statusCode,
elapsed,
type: 'response'
}, 'Request completed');
});
next();
};
}
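The serializer above masks only the `authorization` header. If requests may also carry cookies or API keys, the same idea extends to a set of header names. A minimal sketch follows; `redactHeaders` and the list of sensitive names are assumptions to adapt to your own traffic.

```typescript
// Hypothetical helper: masks credential-bearing headers before logging.
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'x-api-key']);

type HeaderMap = Record<string, string | string[] | undefined>;

function redactHeaders(headers: HeaderMap): HeaderMap {
  const safe: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    const sensitive = SENSITIVE_HEADERS.has(name.toLowerCase());
    safe[name] = sensitive && value !== undefined ? '[REDACTED]' : value;
  }
  return safe;
}

console.log(redactHeaders({ authorization: 'Bearer abc', accept: 'application/json' }));
// { authorization: '[REDACTED]', accept: 'application/json' }
```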
Metrics Collection
For production deployments, export metrics to your observability platform:
// Example: Exporting MCP tool call metrics to DataDog with the hot-shots DogStatsD client
import { StatsD } from 'hot-shots';
const dogstatsd = new StatsD({
host: process.env.DD_AGENT_HOST || 'localhost',
port: 8125,
prefix: 'mcp.server.',
globalTags: [`env:${process.env.BRAINGRID_ENV || 'development'}`]
});
// Track tool usage
export function recordToolCall(toolName: string, duration: number, success: boolean) {
// Record timing metric
dogstatsd.timing('tool.call.duration', duration, [
`tool:${toolName}`,
`status:${success ? 'success' : 'failure'}`
]);
// Increment counter
dogstatsd.increment('tool.call.count', 1, [
`tool:${toolName}`,
`status:${success ? 'success' : 'failure'}`
]);
}
// In your tool implementation:
server.addTool({
name: 'create_requirement',
execute: async (args, context) => {
const startTime = Date.now();
try {
const apiClient = new BrainGridApiClient(config, context?.session);
const result = await apiClient.createRequirement(args);
recordToolCall('create_requirement', Date.now() - startTime, true);
return result;
} catch (error) {
recordToolCall('create_requirement', Date.now() - startTime, false);
throw error;
}
}
});
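Repeating the try/catch timing code in every tool gets noisy. One option is a wrapper that times any async handler and reports the outcome. The sketch below takes the recorder as a parameter so it can be tried without a metrics agent; `withMetrics` and `Recorder` are hypothetical names, and in the server you would pass `recordToolCall` as the recorder.

```typescript
type Recorder = (toolName: string, durationMs: number, success: boolean) => void;

// Hypothetical wrapper: times any async tool handler and reports the outcome.
function withMetrics<A, R>(
  toolName: string,
  record: Recorder,
  handler: (args: A) => Promise<R>,
): (args: A) => Promise<R> {
  return async (args: A) => {
    const start = Date.now();
    try {
      const result = await handler(args);
      record(toolName, Date.now() - start, true);
      return result;
    } catch (error) {
      record(toolName, Date.now() - start, false);
      throw error; // the caller still sees the original failure
    }
  };
}

// Usage with an in-memory recorder standing in for recordToolCall.
const recordedCalls: Array<{ tool: string; success: boolean }> = [];
const memoryRecorder: Recorder = (tool, _durationMs, success) => {
  recordedCalls.push({ tool, success });
};
const echoTool = withMetrics('echo', memoryRecorder, async (args: { text: string }) => args.text);

echoTool({ text: 'hello' }).then((out) => console.log(out, recordedCalls));
```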
Performance Optimization
Connection Pooling
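Optimize Redis connections for serverless by reusing one lazily created connection per instance instead of opening a new one for every request: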
// Redis connection pool for serverless
export class RedisConnectionPool {
private static instance: Redis | null = null;
static getInstance(): Redis | null {
if (!this.instance && process.env.REDIS_URL) {
this.instance = new Redis(process.env.REDIS_URL, {
// Connection pool settings
maxRetriesPerRequest: 3,
enableReadyCheck: true,
lazyConnect: true, // Important for serverless
// Serverless-optimized timeouts
connectTimeout: 5000,
commandTimeout: 5000,
// Connection reuse
keepAlive: 30000,
noDelay: true,
// Handle connection errors gracefully
retryStrategy: (times) => {
if (times > 3) return null; // Stop retrying
return Math.min(times * 100, 3000);
}
});
// Ensure connection is established
this.instance.connect().catch((err: Error) => {
logger.error({ err: err.message }, 'Redis connection failed');
this.instance = null;
});
}
return this.instance;
}
static async close(): Promise<void> {
if (this.instance) {
await this.instance.quit();
this.instance = null;
}
}
}
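The two `retryStrategy` callbacks in this guide are easier to compare when written as pure functions. The sketch below copies their arithmetic and prints the delays for the first few attempts; the names `sessionStoreRetry` and `serverlessRetry` are hypothetical labels for this example.

```typescript
// The two retry policies from this guide as pure functions
// (delay in milliseconds, or null to stop retrying).
const sessionStoreRetry = (times: number): number => Math.min(times * 50, 2000);
const serverlessRetry = (times: number): number | null =>
  times > 3 ? null : Math.min(times * 100, 3000);

const attempts = [1, 2, 3, 4];
console.log(attempts.map(sessionStoreRetry)); // [ 50, 100, 150, 200 ]
console.log(attempts.map(serverlessRetry));   // [ 100, 200, 300, null ]
```

Both policies grow linearly with the attempt number. In the second one the 3000 ms cap is never reached, because retries stop after the third attempt, so a request on a short-lived instance spends at most about 600 ms waiting between retries.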
Request Batching
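Optimize for concurrent requests by coalescing them: when several requests arrive with the same token at once, they share a single in-flight validation instead of each running their own: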
export class BatchedJWTValidator {
private readonly pendingValidations = new Map<string, Promise<MCPSession>>();
async validateToken(token: string): Promise<MCPSession> {
// Check if validation is already in progress
if (this.pendingValidations.has(token)) {
logger.debug('Reusing pending validation');
return this.pendingValidations.get(token)!;
}
// Start new validation
const validationPromise = this.performValidation(token)
.finally(() => {
// Clean up after completion
this.pendingValidations.delete(token);
});
this.pendingValidations.set(token, validationPromise);
return validationPromise;
}
private async performValidation(token: string): Promise<MCPSession> {
// Actual JWT validation logic
return validateJWTWithWorkOS(token);
}
}
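The effect is easy to verify with a counter. The sketch below is a generic version of the same pattern with a stand-in for the slow validation; `Coalescer` and `slowValidate` are hypothetical names, and no real JWT work happens here.

```typescript
// Hypothetical generic version: concurrent calls with the same key share one promise.
class Coalescer<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(key: string, work: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing;
    // Remove the entry once settled so later calls start a fresh run.
    const promise = work().finally(() => this.inFlight.delete(key));
    this.inFlight.set(key, promise);
    return promise;
  }
}

let validations = 0;
const slowValidate = async (): Promise<string> => {
  validations += 1;
  await new Promise((resolve) => setTimeout(resolve, 10));
  return 'session';
};

const coalescer = new Coalescer<string>();
Promise.all([
  coalescer.run('token-abc', slowValidate),
  coalescer.run('token-abc', slowValidate),
]).then(() => console.log(validations)); // 1
```

Note that this only deduplicates work inside one instance. Requests that land on different instances still validate independently, which is where the Redis cache takes over.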
Conclusion
Hosting MCP servers in serverless environments is challenging, but the patterns we've covered make it possible to build production-ready solutions that scale.
The key technical takeaways:
- Session persistence is non-negotiable - Without Redis or similar external storage, your users face constant re-authentication
- Security can't be an afterthought - Proper encryption (AES-256-GCM) and secure token handling are essential
- Fast-path optimization matters - JWT validation is expensive; caching authenticated sessions dramatically improves performance
- Graceful degradation over complex fallbacks - When Redis fails, force re-authentication rather than trying clever in-memory solutions
- Observable systems are debuggable systems - Export metrics to DataDog or your platform of choice
By solving these challenges, we transformed our MCP server from a local development tool into infrastructure that our entire team relies on. The same patterns apply whether you're building tools for internal use or creating MCP servers for the broader community.
The future of development involves AI assistants that understand context and can take meaningful actions. Making that future accessible to teams - not just individual developers - requires solving the infrastructure challenges we've outlined here.
Originally published on the BrainGrid blog.