DEV Community

Sumit Dey

Posted on • Originally published at videocaptions.ai

I Migrated My SaaS from Vercel to Cloudflare Workers — Here's Everything That Broke (and How I Fixed It)

I run VideoCaptions.AI — a free AI video caption generator: upload a video, get word-level transcription, style animated captions with effects, and export an MP4. It supports 30+ languages including Hinglish captions and works well for Instagram Reels, TikTok, and YouTube Shorts. It's built with React Router v7 (SSR) and uses Clerk for auth, Convex for the backend, Remotion for video rendering, and multiple AI services for speech-to-text.

It was running on Vercel. Then one weekend, I got more traffic than I expected.


The trigger

I hit Vercel's CPU limit. Not gradually — it just stopped working during a traffic spike. The free tier gives you 100GB bandwidth and limited CPU hours, and apparently my server-side rendered SEO pages plus API routes (which call AI services) were burning through compute faster than I realized.

The irony? This was a good sign. The app was getting traction. But I needed a hosting solution that wouldn't punish me for it.

Cloudflare was already in my stack — my domain was registered there, I was using R2 for file storage, and my existing upload Worker was already running on their platform. Moving the rest felt like the natural next step rather than paying $20/month for Vercel Pro.


What I was migrating

This wasn't a landing page. The app has:

  • React Router v7 in framework mode with SSR
  • 80+ prerendered SEO pages (platform pages, competitor comparisons, language pages)
  • 5 API routes that call ElevenLabs for speech-to-text, OpenAI for LLM clip segmentation, and Groq as a fallback transcription provider
  • Auth with JWT verification on both client and server
  • Convex as the database (billing, credits, project storage)
  • Remotion for frame-accurate video composition and MP4 export
  • Upstash Redis for API rate limiting

This is a monorepo with 12 apps and 23 shared packages. Only the captions app was moving — everything else stayed on Vercel.


The strategy: don't flip everything at once

I split the migration into three phases, each independently reversible:

  1. Phase 1: Move API routes to a Cloudflare Worker (frontend stays on Vercel)
  2. Phase 2: Move the full site (SSR + static assets) to a separate Worker
  3. Phase 3: DNS cutover — point the domain from Vercel to Cloudflare

If Phase 1 breaks, flip one env var and traffic goes back to Vercel. If Phase 2 breaks, remove the DNS route and Vercel takes over again. No all-or-nothing gambles.


Phase 1: API routes → Cloudflare Workers

I already had a Hono-based Worker handling R2 file uploads. Instead of creating a new Worker, I extended it with the API routes.

The AWS SDK doesn't work in Workers

First surprise: @aws-sdk/client-s3 (which I used for R2 presigned URLs) relies on Node.js internals that don't exist in the Workers runtime. The fix was swapping to aws4fetch — a lightweight S3-compatible signing library built for Workers.

// Before (Vercel — Node.js)
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "auto", endpoint: r2Endpoint, credentials });
const url = await getSignedUrl(s3, new PutObjectCommand({ Bucket, Key }), { expiresIn: 600 });

// After (Cloudflare Workers)
import { AwsClient } from "aws4fetch";

const aws = new AwsClient({ accessKeyId, secretAccessKey, service: "s3", region: "auto" });
const putUrl = new URL(objectUrl);
putUrl.searchParams.set("X-Amz-Expires", "600");
const signed = await aws.sign(new Request(putUrl.toString(), { method: "PUT" }), { aws: { signQuery: true } });

Porting one route at a time

I migrated routes in order of complexity:

  1. /api/r2/presign — simplest, easy to verify (upload a file, check the URL works)
  2. /api/transliterate — single LLM call, straightforward
  3. /api/ai-clips — LLM with structured output, more validation logic
  4. /api/transcribe — the big one: 3 STT providers, multi-language pipeline, FormData handling

Each route got its own commit, its own test, its own deploy. If something broke, I knew exactly which route caused it.
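The per-route pattern looks roughly like this — a framework-free sketch (the real Worker uses Hono, and the handler body here is a stand-in, not the actual transliteration pipeline):

```typescript
// Route table: each migrated endpoint is one isolated entry, so a bad
// port is confined to a single handler (and a single commit).
type Handler = (req: Request) => Promise<Response>;

const routes: Record<string, Handler> = {
  // Illustrative stand-in for the real /api/transliterate handler.
  "/api/transliterate": async (req) => {
    const { text } = (await req.json()) as { text: string };
    // ...call the LLM provider here...
    return Response.json({ transliterated: text });
  },
};

export async function route(req: Request): Promise<Response> {
  const handler = routes[new URL(req.url).pathname];
  return handler ? handler(req) : new Response("Not found", { status: 404 });
}
```

Because every route is one entry in the table, "migrate a route" means "add one key" — easy to commit, test, and revert in isolation.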

The frontend toggle

I added a single constant to the frontend:

// src/lib/api-base.ts
export const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || "";

Every `fetch("/api/...")` became `` fetch(`${API_BASE_URL}/api/...`) ``. When the env var is empty, calls go to Vercel (same-origin). When set, they go to the Worker. One Vercel env var change = instant traffic switch, zero code changes.
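The joining logic is trivial but worth pinning down: an empty base must yield a same-origin relative path, and a set base must yield an absolute Worker URL. A hypothetical helper (the original code just inlines the template literal):

```typescript
// Build the final request URL from the toggle.
// base = ""                               → same-origin relative path (Vercel)
// base = "https://api.videocaptions.ai"   → absolute Worker URL
export function apiUrl(base: string, path: string): string {
  return base ? new URL(path, base).toString() : path;
}
```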

Custom domain

I set up api.videocaptions.ai as a Worker custom domain. One line in wrangler.toml:

routes = [
  { pattern = "api.videocaptions.ai", custom_domain = true }
]

Deploy, and Cloudflare auto-creates the DNS record + SSL certificate. The API was live at a clean URL in seconds.


Phase 2a: Swapping the build system

This is where I replaced Vercel's build pipeline with Cloudflare's.

Config changes

The core swap is three files:

vite.config.ts — replace the Vercel middleware with Cloudflare's plugin:

import { defineConfig } from "vite";
import { cloudflare } from "@cloudflare/vite-plugin";
import { reactRouter } from "@react-router/dev/vite";
import tailwindcss from "@tailwindcss/vite";
import tsconfigPaths from "vite-tsconfig-paths";

export default defineConfig(({ mode }) => ({
  plugins: [
    cloudflare({ viteEnvironment: { name: "ssr" } }),
    tailwindcss(),
    reactRouter(),
    tsconfigPaths(),
  ],
  // ... rest of config
}));

react-router.config.ts — remove vercelPreset(), add the Cloudflare future flag:

export default {
  appDirectory: "src",
  ssr: true,
  future: {
    v8_viteEnvironmentApi: true,
  },
} satisfies Config;

workers/app.ts — the Worker entry point that handles every request:

import { createRequestHandler } from "react-router";

declare module "react-router" {
  export interface AppLoadContext {
    cloudflare: { env: Env; ctx: ExecutionContext };
  }
}

const requestHandler = createRequestHandler(
  () => import("virtual:react-router/server-build"),
  import.meta.env.MODE
);

export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);

    // www → non-www redirect
    if (url.hostname === "www.videocaptions.ai") {
      url.hostname = "videocaptions.ai";
      return Response.redirect(url.toString(), 301);
    }

    const response = await requestHandler(request, {
      cloudflare: { env, ctx },
    });

    // Security headers
    const headers = new Headers(response.headers);
    headers.set("X-Content-Type-Options", "nosniff");
    headers.set("X-Frame-Options", "SAMEORIGIN");
    headers.set("Referrer-Policy", "strict-origin-when-cross-origin");
    headers.set("Strict-Transport-Security", "max-age=31536000; includeSubDomains");
    headers.set("Permissions-Policy", "camera=(), microphone=(self), geolocation=()");

    // Cache immutable assets
    if (url.pathname.startsWith("/assets/") || url.pathname.startsWith("/fonts/")) {
      headers.set("Cache-Control", "public, max-age=31536000, immutable");
    }

    return new Response(response.body, {
      status: response.status, statusText: response.statusText, headers,
    });
  },
} satisfies ExportedHandler<Env>;

The SSR entry point

Workers use Web Streams, not Node.js streams. I had to create entry.server.tsx with renderToReadableStream instead of renderToPipeableStream:

import type { EntryContext } from "react-router";
import { renderToReadableStream } from "react-dom/server";
import { ServerRouter } from "react-router";
import { isbot } from "isbot";

export default async function handleRequest(
  request: Request,
  statusCode: number,
  headers: Headers,
  routerContext: EntryContext,
) {
  const body = await renderToReadableStream(
    <ServerRouter context={routerContext} url={request.url} />
  );

  // Bots get fully-rendered HTML for SEO
  if (isbot(request.headers.get("user-agent"))) {
    await body.allReady;
  }

  headers.set("Content-Type", "text/html");
  return new Response(body, { headers, status: statusCode });
}

If you skip this file, the default React Router entry uses Node.js streams and your Worker crashes on every request.


Phase 2b: The bugs that only show up in Workers

The build passed. TypeScript was clean. All 1122 tests passed. Then I deployed and got Error 1101: Worker threw exception.

Bug 1: setTimeout in global scope

The error message was cryptic:

Disallowed operation called within global scope.
Asynchronous I/O (ex: fetch() or connect()), setting a timeout,
and generating random values are not allowed within global scope.

Workers have a strict rule: no I/O operations outside of the fetch() handler. My font loading module had this:

// fonts.ts — runs at import time
loadCriticalFonts();
if (typeof setTimeout !== "undefined") {
  setTimeout(loadDeferredFonts, 0);  // THIS KILLS THE WORKER
}

The guard typeof setTimeout !== "undefined" doesn't help — setTimeout IS defined in Workers, it's just disallowed during module initialization. Fix:

if (typeof document !== "undefined") {
  loadCriticalFonts();
  setTimeout(loadDeferredFonts, 0);
}

This only runs in the browser, never during SSR.

Bug 2: 32MB server bundle

The deploy failed with Worker exceeded the size limit of 3 MiB. The server build was 32MB. Why?

ort-wasm-simd-threaded.wasm    21 MB   ← ONNX Runtime (ML inference)
server-build.js                 7.5 MB  ← actual SSR bundle
mediabunny-*-encoder.js         2 MB    ← audio encoders

Client-only libraries were leaking into the server bundle through import chains. The 21MB WASM file came from an ML feature that only runs in the browser. I disabled the route that pulled it in, and the bundle dropped to 10MB uncompressed, 2.2MB gzipped.
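The same guard that fixed the fonts bug generalizes to heavy client-only dependencies. A hypothetical wrapper (not from the original code) that refuses to evaluate a browser-only module during SSR — pair it with a dynamic import() so the bundler can split the chunk away from the Worker build:

```typescript
// Refuse to load a browser-only module when there is no DOM (i.e. during
// SSR in a Worker). The caller passes a lazy loader, typically a dynamic
// import, so nothing heavy is evaluated until the check passes.
export async function loadClientOnly<T>(load: () => Promise<T>): Promise<T> {
  if (typeof document === "undefined") {
    throw new Error("client-only module requested during SSR");
  }
  return load();
}

// usage (browser code): const ml = await loadClientOnly(() => import("./local-ml"));
```

Whether the chunk actually stays out of the server bundle still depends on your bundler config — checking the build output (as above) is the real test.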

Bug 3: Black screen in dev

After all the config changes, pnpm dev showed a blank page. The HTML was being served, but it was the wrong HTML — a stale index.html from the old Vite SPA setup was sitting in the project root. The Cloudflare plugin saw it and served it as a static asset, completely bypassing React Router's SSR.

Deleted the file. Everything worked.

Thirty minutes of debugging for a file that shouldn't have existed.


Phase 2c: Getting auth to work

The app worked on the workers.dev test URL — pages rendered, navigation worked, static assets loaded. But auth returned 401 on every request.

Test key vs production key

My local .env had a test auth key (pk_test_...). When I built the app, Vite baked this into the client bundle. But the API Worker had the production secret key (sk_live_...). Test tokens can't be verified by a production secret — they're different key pairs.

The fix was creating .env.production with production-only vars:

VITE_CLERK_PUBLISHABLE_KEY=pk_live_...
VITE_CONVEX_URL=https://prod-deployment.convex.cloud
VITE_API_BASE_URL=https://api.videocaptions.ai
VITE_POSTHOG_KEY=phc_...

Vite loads .env.production during production builds, overriding .env. But there's a gotcha: .env.local has HIGHER priority than .env.production. If you have test values in .env.local, they win. I verified the correct key was in the bundle by grepping the output:

grep -o "pk_live_[A-Za-z0-9_-]*" build/client/assets/*.js

Domain restriction

Auth providers restrict which domains can use your production keys. My workers.dev test URL wasn't in the allowed list. This isn't a bug — it's a security feature. It resolved itself once I switched to the real videocaptions.ai domain.


Phase 3: The DNS cutover

My domain was already registered on Cloudflare, but DNS was pointing to Vercel. The cutover was surprisingly simple.

Step 1: Delete Vercel's DNS records

In the Cloudflare DNS dashboard, I deleted the CNAME record pointing to cname.vercel-dns.com. There was also a stale A record from an old parking page that I had to find and remove.

Step 2: Add the custom domain to the Worker

// wrangler.jsonc
{
  "routes": [
    { "pattern": "videocaptions.ai", "custom_domain": true },
    { "pattern": "www.videocaptions.ai", "custom_domain": true }
  ]
}

Step 3: Deploy

pnpm run deploy

Output confirmed the cutover:

Deployed videocaptions-ai triggers
  https://videocaptions-ai.dev-sumitdey.workers.dev
  videocaptions.ai (custom domain)
  www.videocaptions.ai (custom domain)

That's it. DNS propagated within minutes. Zero downtime.

The rollback plan

If something broke, the rollback was: remove the routes from wrangler.jsonc, deploy, then re-add Vercel's CNAME in the Cloudflare DNS dashboard. Five minutes max. I kept the Vercel project alive for a week just in case.


Environments: mapping Vercel's DX to Cloudflare

One thing Vercel does incredibly well is environment management. Push to a branch → preview deploy. Push to main → production. Zero config.

Here's how I replicated it on Cloudflare:

Preview environment

In wrangler.toml, I added a preview environment:

[env.preview]
name = "captions-r2-worker-preview"

[env.preview.vars]
R2_ACCOUNT_ID = "..."
R2_BUCKET_NAME = "user-assets"

Deploy to preview:

pnpm exec wrangler deploy --env preview

Set preview-specific secrets:

pnpm exec wrangler secret put CLERK_SECRET_KEY --env preview

This gives you a separate Worker at captions-r2-worker-preview.workers.dev with its own secrets — completely isolated from production.

Production

Deploy without --env:

pnpm exec wrangler deploy --env=""

Separate secrets, separate URL, separate everything.

Non-secret config

Values that aren't sensitive go in wrangler.toml as vars — they're committed to git and don't need to be set per-environment:

[vars]
R2_ACCOUNT_ID = "03f3b78d..."
R2_BUCKET_NAME = "user-assets"

Automatic deploys (CI/CD)

Cloudflare Workers Builds connects your GitHub repo:

  1. Dashboard → Workers & Pages → Your Worker → Settings → Builds
  2. Set the root directory (monorepo support)
  3. Configure build watch paths (only rebuild when your app's files change, not when other apps in the monorepo change)
  4. Push to main → automatic deploy

For more control, GitHub Actions with cloudflare/wrangler-action@v3:

- name: Deploy
  uses: cloudflare/wrangler-action@v3
  with:
    apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
    workingDirectory: apps/captions

What Cloudflare gives you for free

Moving from Vercel wasn't just about hosting. I gained an entire platform:

  • Web Application Firewall — OWASP managed rulesets, custom rules, credential leak detection
  • Automatic DDoS protection — L3/L4/L7, no configuration needed
  • Turnstile — free CAPTCHA alternative I can add to sign-up flows
  • Web Analytics — privacy-first, no cookies, Core Web Vitals
  • Workers Traces — real-time log streaming with wrangler tail
  • Zero egress fees — R2 storage + Workers have no bandwidth costs
  • Custom domains — auto-provisioned SSL, one line of config

On Vercel, some of these require Enterprise. On Cloudflare, they're on the free tier.


What Vercel does better (honestly)

I'd be lying if I said Cloudflare is better at everything:

  • Preview deploys — Vercel's branch-based preview URLs are truly zero-config. Cloudflare requires setting up Workers Builds or GitHub Actions.
  • Headers — vercel.json is simpler than coding security headers in a Worker's fetch() handler.
  • Monorepo detection — Vercel auto-detects which app to build. Cloudflare needs explicit build watch paths.
  • Dashboard polish — Vercel's UI is more refined. Cloudflare's dashboard is functional but busier.
  • Prerendering — it just works on Vercel. On Cloudflare, there's a manifest bug with the Vite plugin that breaks it for large apps (more below).

The prerendering problem

My app had 80+ SEO pages prerendered at build time — pages like captions for Instagram Reels, VideoCaptions vs CapCut, and how to add captions to video. Static HTML for fast TTFB and search engine crawlability. This was one line in the React Router config:

async prerender() {
  return ["/", "/terms-of-service", "/privacy",
    ...PLATFORMS.map(p => `/captions-for/${p.slug}`),
    ...COMPETITORS.map(c => `/vs/${c.slug}`),
    // ... 80+ pages
  ];
}

On Cloudflare, this fails with:

[react-router] Server build file not found in manifest

The root cause: the Cloudflare Vite plugin replaces virtual:react-router/server-build in the SSR manifest with virtual:cloudflare/worker-entry. React Router's prerender hook looks for the former and can't find it. Small apps work (the official template prerenders fine), but large bundles that get code-split differently hit this bug.

For now, all pages are SSR'd instead of prerendered. Cloudflare edge SSR is fast enough (~15ms cold start) that the performance difference is minimal. I'm tracking the upstream issue and the community is working on fixes.
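One interim option — a sketch, not what's currently deployed — is to hand the SSR'd SEO responses a cache header so the edge can serve repeat hits without re-rendering. The prefixes below mirror the prerender list:

```typescript
// Paths matching these prefixes get an edge-cacheable response; everything
// else (the editor, authed pages) stays uncached. Prefixes are illustrative.
const SEO_PREFIXES = ["/captions-for/", "/vs/", "/terms-of-service", "/privacy"];

export function cacheControlFor(pathname: string): string | null {
  return SEO_PREFIXES.some((p) => pathname.startsWith(p))
    ? "public, max-age=300, s-maxage=3600" // 5 min browser, 1 h edge
    : null;
}
```

In the Worker's fetch handler, set the returned value on the response headers when it isn't null; combined with a Cloudflare Cache Rule, that gets most of prerendering's TTFB benefit back.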


What I'm paying now

| | Before (Vercel) | After (Cloudflare) |
| --- | --- | --- |
| Hosting | Free → hit limits | Workers Paid: $5/month |
| Storage | R2: ~$0.50/month | R2: ~$0.50/month (same) |
| Rate limiting | Upstash: Free | Upstash: Free (same) |
| DDoS protection | Not included | Free |
| WAF | Enterprise only | Free |
| Analytics | Included | Free (Web Analytics) |
| Bandwidth | 100GB cap | Unlimited, $0 |
| Total | $0 → needed $20/month | $5.50/month |

Verify your migration

Once you've deployed, here's how to verify everything is working. These commands saved me hours of guessing.

Check DNS resolution

# Should show Cloudflare IPs (104.21.x.x, 172.67.x.x), NOT Vercel (76.76.x.x)
dig yourdomain.com A +short

# If your local DNS is cached, query Cloudflare's resolver directly
dig yourdomain.com A +short @1.1.1.1

Confirm the response comes from Cloudflare

# Look for "server: cloudflare" and a cf-ray header
curl -sI https://yourdomain.com | grep -iE "server|cf-ray"

If DNS hasn't propagated yet, force it through Cloudflare's IP:

curl -sI --resolve yourdomain.com:443:104.21.2.24 https://yourdomain.com | grep -iE "server|cf-ray"

Verify security headers

curl -sI https://yourdomain.com | grep -iE "X-Content-Type|X-Frame|Strict-Transport|Referrer-Policy|Permissions-Policy"

Expected output:

x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000; includeSubDomains
referrer-policy: strict-origin-when-cross-origin
permissions-policy: camera=(), microphone=(self), geolocation=()

Check static asset caching

# Vite-hashed assets should have immutable caching
curl -sI https://yourdomain.com/assets/entry.client-BDONYgeu.js | grep -i cache-control
# Expected: cache-control: public, max-age=31536000, immutable

Test www → non-www redirect

curl -sI https://www.yourdomain.com | grep -i location
# Expected: location: https://yourdomain.com/

Verify API routes

# Should return 401 (auth required, meaning the route exists and works)
curl -s -X POST https://api.yourdomain.com/api/r2/presign
# Expected: {"error":"Authentication required."}

# Should return 200 (public endpoint)
curl -s -X POST https://api.yourdomain.com/api/r2/refresh-url \
  -H "Content-Type: application/json" \
  -d '{"key":"uploads/test/file.wav"}'

Stream real-time logs

# Watch every request to your Worker in real-time
npx wrangler tail

# Filter errors only
npx wrangler tail --status error

# Search for specific routes
npx wrangler tail --search "transcribe"

Deploy commands reference

# Site Worker
cd apps/captions
pnpm run deploy                              # build + deploy to production

# API Worker
cd apps/captions-worker
pnpm exec wrangler deploy --env=""           # production
pnpm exec wrangler deploy --env preview      # preview environment

# Set secrets (one-time per environment)
pnpm exec wrangler secret put API_KEY              # production
pnpm exec wrangler secret put API_KEY --env preview # preview

# List secrets
pnpm exec wrangler secret list --env=""

# Check deployed versions
pnpm exec wrangler deployments list --env=""

Flush local DNS cache (macOS)

sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder

What's next

The migration is done, but there's more to build:

  • CI/CD pipelines — set up Workers Builds or GitHub Actions so deploys are automatic on push to main, with preview deploys for PRs
  • Merge Workers — combine the API Worker and site Worker into one. Everything same-origin, no CORS, no separate deploys
  • Prerendering — re-enable once the Cloudflare Vite plugin fixes the manifest bug. The community is actively working on it.
  • Edge caching — use Cloudflare Cache Rules to cache SSR responses at the edge for SEO pages that don't change often
  • Better framework — evaluating TanStack Start for more flexibility and better Cloudflare integration

Would I do it again?

Yes. But with two lessons:

First, upgrade your framework version as a separate PR before touching the hosting. I upgraded React Router 7.7 → 7.14 and migrated to Cloudflare in the same branch. Every bug was harder to diagnose because I couldn't tell if it was a version issue or a platform issue.

Second, check for stale files. That index.html from a year-old Vite SPA setup cost me 30 minutes of debugging a black screen. git clean -fdx before testing would have caught it.

The Vercel DX is genuinely great. But when you hit their limits — and if you're building anything real, you will — Cloudflare gives you the same capabilities at a fraction of the cost, with an entire security and performance platform included for free.

The migration took one focused day. The savings are permanent.


The biggest gotchas aren't in any documentation — they're in the gap between "it works locally" and "it works in production." If you're planning a similar migration, feel free to reach out. And if you need captions for your videos, try VideoCaptions.AI — it's free, no watermark, and now running on Cloudflare Workers.
