## The Problem

You started with everything in Next.js — API routes, UI, and background jobs (video processing, LLM calls, file transforms). It worked great at first, but now:

- **OOM crashes** — background jobs eat memory that Vercel/your host needs for serving pages
- **Cold starts** — heavy deps (ffmpeg, Chromium for rendering) inflate your bundle
- **Scaling mismatch** — you want to scale compute independently from your frontend

The fix: pull your background functions into a standalone Express worker. Here's the practical playbook.
## Architecture Before & After

**Before:** One Next.js app does everything

```
Next.js (Vercel)
├── pages / app router (UI)
├── API routes (REST)
└── Background jobs (Inngest functions)
    ├── process-video
    ├── generate-clips
    └── export
```

**After:** Next.js serves UI + API; Worker handles compute

```
Next.js (Vercel)           Worker (Railway/Fly/EC2)
├── UI                     ├── Express server
├── API routes             ├── POST /api/inngest
└── Inngest client         └── GET /api/health
    (dispatches jobs)          (runs the jobs)
```
## Step 1: Create the Worker Directory

```
worker/
├── src/
│   ├── server.ts
│   ├── lib/         # shared utilities (copied from src/lib)
│   └── functions/   # Inngest function handlers
├── package.json
├── tsconfig.json
└── Dockerfile
```

The worker has its own `package.json` with only the deps it needs — no React, no Next.js, no UI libraries.
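A minimal worker `package.json` might look like the sketch below — the exact version numbers and any deps beyond `express` and `inngest` are assumptions to adjust for your jobs:

```json
{
  "name": "worker",
  "private": true,
  "scripts": {
    "build": "tsc",
    "start": "node dist/server.js"
  },
  "dependencies": {
    "express": "^4.19.0",
    "inngest": "^3.0.0"
  },
  "devDependencies": {
    "@types/express": "^4.17.21",
    "@types/node": "^20.11.0",
    "typescript": "^5.4.0"
  }
}
```

Note that `typescript` and the `@types/*` packages sit in `devDependencies` — that's what makes the install → build → prune flow in Step 4 pay off.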
## Step 2: The Express Server (~30 lines)

```ts
import express from 'express';
import { serve } from 'inngest/express';
import { inngest } from './lib/inngest-client';
import { processVideo } from './functions/process-video';
import { generateClips } from './functions/generate-clips';
import { clipExport } from './functions/clip-export';

const app = express();

app.use(
  '/api/inngest',
  serve({
    client: inngest,
    functions: [processVideo, generateClips, clipExport],
  })
);

app.get('/api/health', (_req, res) => {
  res.json({
    status: 'ok',
    service: 'worker',
    version: process.env.GIT_COMMIT_SHA?.slice(0, 7) ?? 'local',
  });
});

const PORT = parseInt(process.env.PORT ?? '3001', 10);
app.listen(PORT, '0.0.0.0', () => {
  console.log(`[worker] listening on 0.0.0.0:${PORT}`);
});
```

That's it. No framework magic. Inngest's `serve()` adapter handles all the webhook routing.
## Step 3: Sync Shared Code with a Script

The tricky part: your worker needs some shared utilities (`supabase.ts`, `r2.ts`, `llm.ts`) that live in `src/lib/`. But they use Next.js path aliases like `@/lib/...`.

Solution: a sync script that copies and rewrites imports:
```bash
#!/usr/bin/env bash
# sync-worker.sh — copy shared source, rewrite @/ aliases
set -e

ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
SRC="$ROOT/src"
WORKER="$ROOT/worker/src"

rewrite_lib_imports() {
  local file="$1"
  sed \
    -e 's|from "@/lib/\([^"]*\)"|from "./\1"|g' \
    -e 's|await import("@/lib/\([^"]*\)")|await import("./\1")|g' \
    "$file"
}

mkdir -p "$WORKER/lib"

WORKER_LIBS=(supabase r2 llm transcribe)
for f in "${WORKER_LIBS[@]}"; do
  rewrite_lib_imports "$SRC/lib/$f.ts" > "$WORKER/lib/$f.ts"
  echo "  ✓ lib/$f.ts"
done
```
Watch out for name clashes! If you have `src/lib/inngest.ts` and the `inngest` npm package, Node's module resolution gets confused. Rename your client file: `inngest.ts` → `inngest-client.ts` in the worker.
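If you'd rather drive the sync from Node instead of bash, the alias rewrite itself is just two regex substitutions mirroring the `sed` expressions above — a sketch:

```typescript
// Rewrite "@/lib/foo" imports into relative "./foo" imports,
// mirroring the two sed expressions in sync-worker.sh.
export function rewriteLibImports(source: string): string {
  return source
    .replace(/from "@\/lib\/([^"]*)"/g, 'from "./$1"')
    .replace(/await import\("@\/lib\/([^"]*)"\)/g, 'await import("./$1")');
}
```

Same caveat as the bash version: this only handles `@/lib/` aliases with double-quoted specifiers; extend the patterns if your codebase uses single quotes or other alias roots.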
## Step 4: The Dockerfile

```dockerfile
FROM node:20-slim

# System deps your jobs need
RUN apt-get update && apt-get install -y \
    ffmpeg python3 curl \
    # Chromium deps for headless rendering
    fonts-liberation libnss3 libatk-bridge2.0-0 \
    libdrm2 libxkbcommon0 libgbm1 libasound2 \
    --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install ALL deps (devDeps needed for tsc)
COPY worker/package*.json ./
RUN npm ci

COPY worker/ .

# Build TypeScript
RUN npm run build

# Prune devDeps after build
RUN npm prune --omit=dev

EXPOSE 3001

# Cap heap — this is a lean worker, not a Next.js server
CMD ["node", "--max-old-space-size=200", "dist/server.js"]
```

**Key insight:** install all deps → build → prune. You need TypeScript and type packages during `tsc`, but not at runtime. Pruning after build keeps the image small.
## Step 5: Deploy Config

For Railway, point to the worker Dockerfile:

```toml
# railway.toml
[build]
dockerfilePath = "worker/Dockerfile"

[deploy]
healthcheckPath = "/api/health"
healthcheckTimeout = 120
```

⚠️ **Gotcha:** If you also have a `railway.json`, it can override `railway.toml`. Pick one and delete the other.
## Step 6: Memory Tuning

Different phases need different memory limits:

```dockerfile
# Build phase — tsc needs more memory
# Set via ENV or railway build config
ENV NODE_OPTIONS="--max-old-space-size=512"

# Runtime — lean worker
CMD ["node", "--max-old-space-size=200", "dist/server.js"]
```

`--max-old-space-size` in `CMD` overrides `NODE_OPTIONS` for the server process. But child processes (like spawned ffmpeg wrappers via `execFileAsync`) inherit `NODE_OPTIONS`, so set that separately if needed.
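One way to keep the inherited `NODE_OPTIONS` from leaking into spawned children is to strip it from the child's env explicitly. A sketch, assuming the `execFileAsync` mentioned above is a promisified `execFile`:

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Build a child env without NODE_OPTIONS, so the parent's Node heap
// settings don't leak into spawned Node-based tool wrappers.
export function childEnv(
  base: Record<string, string | undefined> = process.env
): Record<string, string | undefined> {
  const { NODE_OPTIONS, ...rest } = base;
  return rest;
}

// Usage sketch (paths and args are illustrative):
// await execFileAsync('ffmpeg', ['-i', input, output], { env: childEnv() });
```

Note that `env` replaces the child's environment entirely, which is exactly why we spread the rest of the parent env back in.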
## Step 7: Remove AI/Compute Keys from Vercel

Once the worker handles all compute, strip those env vars from Vercel:

- `GEMINI_API_KEY`
- `GROQ_API_KEY`
- `R2_SECRET_ACCESS_KEY` (if only the worker uploads)

Smaller attack surface, cleaner separation.
## Step 8: Health Check from Next.js

Your Next.js app can proxy a health check to the worker:

```ts
// src/app/api/health/route.ts
export async function GET() {
  const workerUrl = process.env.WORKER_URL;

  const [dbCheck, workerCheck] = await Promise.all([
    checkSupabase(),
    fetch(`${workerUrl}/api/health`, { signal: AbortSignal.timeout(5000) })
      .then(r => r.json())
      .catch(() => ({ status: 'down' })),
  ]);

  const overall = [dbCheck, workerCheck].every(c => c.status === 'ok')
    ? 'ok' : 'degraded';

  return Response.json({ status: overall, db: dbCheck, worker: workerCheck });
}
```
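The route above assumes a `checkSupabase` helper. One possible sketch, pinging Supabase's REST endpoint with the same timeout-and-fallback pattern as the worker check (the `SUPABASE_URL`/`SUPABASE_ANON_KEY` env var names are assumptions — use whatever your project defines):

```typescript
// Hypothetical helper: normalize a Supabase reachability probe
// to the same { status } shape the health route aggregates.
export async function checkSupabase(): Promise<{ status: string }> {
  const url = process.env.SUPABASE_URL;
  if (!url) return { status: 'down' };
  try {
    const res = await fetch(`${url}/rest/v1/`, {
      headers: { apikey: process.env.SUPABASE_ANON_KEY ?? '' },
      signal: AbortSignal.timeout(5000),
    });
    return { status: res.ok ? 'ok' : 'degraded' };
  } catch {
    return { status: 'down' };
  }
}
```

Catching everything and reporting `down` keeps the aggregate route from throwing when a dependency is unreachable — the same design choice as the `.catch()` on the worker fetch.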
## Results

After the split:

- **Vercel stays under 256MB** — no more OOM from background jobs
- **Worker scales independently** (bump Railway instance when processing spikes)
- **Deploy times drop** — worker image only rebuilds when compute code changes
- **Cold starts vanish** for the frontend — no heavy deps in the Next.js bundle
## Checklist

- [ ] Create `worker/` with its own `package.json` and `tsconfig.json`
- [ ] Write a sync script for shared code (rewrite path aliases)
- [ ] Watch for module name clashes (`inngest.ts` vs `inngest` package)
- [ ] Dockerfile: install all → build → prune devDeps
- [ ] Cap heap separately for build vs runtime
- [ ] Pick `railway.toml` OR `railway.json`, not both
- [ ] Bind health check to `0.0.0.0`, not `localhost`
- [ ] Remove compute-only env vars from your frontend host
- [ ] Add worker health to your aggregated `/api/health`
This came from actually doing the migration on a production app today. Every gotcha listed above cost real debugging time.