Carlos Oliva Pascual

Posted on • Originally published at stacknotice.com

The Production Deployment Checklist Senior Devs Never Skip (2026)

Most outages aren't caused by bad code. They're caused by good code deployed in the wrong order.

Senior developers don't rely on memory before a deploy. They run a checklist — every single time, even for a one-line change.

Here's the exact checklist, and why each step exists.

Why checklists exist

Pilots don't skip the pre-flight checklist because they've flown 10,000 hours. They do it because they've flown 10,000 hours — enough to know exactly what happens when you skip a step.

The same principle applies to production deploys. Every step in this checklist exists because someone, somewhere, had an outage from skipping it.

The 12-step checklist

| # | Check | Why it matters |
|---|-------|----------------|
| 1 | Env vars validate at build | Silent undefined in prod = 3 AM alert |
| 2 | Migrations run BEFORE deploy | New code breaks on the old schema |
| 3 | No drizzle-kit push in prod | Applies changes without migration files |
| 4 | Feature flag OFF for new features | Ship code off, turn on after smoke test |
| 5 | Error monitoring configured | First error hits Sentry, not a user |
| 6 | Health check endpoint responds | Load balancer needs /api/health |
| 7 | Rate limiting on auth endpoints | Login brute-force = account takeover |
| 8 | Secrets in env manager, not code | Rotating a secret ≠ a code change |
| 9 | Stripe webhooks tested | Webhook signature fails silently |
| 10 | Rollback plan ready | Know the previous deploy hash |
| 11 | Smoke test the critical path | Log in → do the main action → verify |
| 12 | Alert channel exists | Errors go somewhere humans actually see |

Step 1 — Env vars validate at build time

If you're using process.env.THING directly, your app will start and fail at runtime when THING is undefined. The error happens in production, at 2 AM, in front of your first real user.

With t3-env, the build fails — which is exactly what you want:

// src/lib/env.ts
import { createEnv } from '@t3-oss/env-nextjs'
import { z } from 'zod'

export const env = createEnv({
  server: {
    DATABASE_URL: z.string().url(),
    CLERK_SECRET_KEY: z.string().min(1),
    STRIPE_SECRET_KEY: z.string().min(1),
    STRIPE_WEBHOOK_SECRET: z.string().min(1),
    SENTRY_DSN: z.string().url(),
  },
  client: {
    NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY: z.string().min(1),
  },
  runtimeEnv: {
    DATABASE_URL: process.env.DATABASE_URL,
    CLERK_SECRET_KEY: process.env.CLERK_SECRET_KEY,
    STRIPE_SECRET_KEY: process.env.STRIPE_SECRET_KEY,
    STRIPE_WEBHOOK_SECRET: process.env.STRIPE_WEBHOOK_SECRET,
    SENTRY_DSN: process.env.SENTRY_DSN,
    NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY: process.env.NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY,
  },
})

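If STRIPE_WEBHOOK_SECRET is missing from Vercel, next build fails, provided env.ts is imported somewhere the build evaluates (the t3-env docs suggest importing it from your Next.js config file). You catch it before a single user sees anything.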

Pro tip: Add every new env var to env.ts the same moment you add it to .env.local. Never add one without the other.
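To make the mechanism concrete, here is a dependency-free sketch of the idea behind fail-fast validation. It is not how t3-env is implemented; validateEnv and REQUIRED_KEYS are hypothetical names used only for illustration, and the key list is an assumption.

```typescript
// Hypothetical helper: fail fast if any required variable is missing or empty.
type EnvSource = Record<string, string | undefined>

const REQUIRED_KEYS = ['DATABASE_URL', 'STRIPE_SECRET_KEY', 'STRIPE_WEBHOOK_SECRET'] as const
type RequiredKey = (typeof REQUIRED_KEYS)[number]

export function validateEnv(source: EnvSource): Record<RequiredKey, string> {
  // Collect every missing key so one failed build reports all of them.
  const missing = REQUIRED_KEYS.filter((key) => !source[key])
  if (missing.length > 0) {
    // Throwing when the module is loaded is what turns a runtime surprise
    // into a failed build or a failed boot.
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  const result = {} as Record<RequiredKey, string>
  for (const key of REQUIRED_KEYS) {
    result[key] = source[key] as string
  }
  return result
}
```

Calling validateEnv(process.env) at module load would give the same fail-early behavior in miniature. In a real project the schema-based approach above is the better choice, since it also checks formats such as URLs.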

Step 2 — Migrations run before deploy, always

This is the most important rule in production database management.

❌ WRONG:   Deploy code → Run migrations
✅ CORRECT: Run migrations → Deploy code

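Why: during a Vercel deployment, both the old and new versions of your app run simultaneously for a few seconds. The new code expects the new schema. If you deploy code first, new code breaks on the old schema during that window. Migrating first has its own constraint: the old code keeps serving traffic against the new schema until the deploy finishes, so every migration has to be backward compatible with the previous release (add columns and tables first, drop or rename only in a later deploy).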

With Drizzle:

# Never in production
npx drizzle-kit push

# In development: create the migration file and commit it with the code
npx drizzle-kit generate

# In production (or CI): apply the committed migration files
npx drizzle-kit migrate

Step 3 — drizzle-kit push is banned in production

push applies your schema changes directly, without generating migration files. It's designed for development — fast iteration, no noise.

In production, it means:

  • No audit trail of what changed
  • No ability to roll back a migration
  • Risk of accidental data loss with no undo

Add this rule to your CLAUDE.md and your team's internal docs:

## Database rules
- Never use `drizzle-kit push` in production
- Always `generate` then `migrate`
- Migration files are committed alongside the code that requires them

Step 4 — Feature flags for every new feature

The classic failure mode:

❌ Ship → Users see broken feature → Emergency rollback
✅ Ship (flag OFF) → Smoke test in production → Turn flag ON → Gradual rollout

With Vercel Edge Config:

import { get } from '@vercel/edge-config'

export async function isNewDashboardEnabled(userId: string) {
  const config = await get<{ enabledUserIds: string[] }>('new-dashboard')
  return config?.enabledUserIds.includes(userId) ?? false
}

New feature ships disabled. You test it in production with your own account. When it works, you enable it for 5% of users. If something breaks at 5%, you turn the flag off — no rollback, no deploy, 10 seconds to fix.
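The Edge Config example above gates on an explicit list of user IDs. For the "5% of users" stage, one common approach is a deterministic hash bucket. The sketch below is an assumption about how that could look, not part of the original setup; isInRollout and hashToBucket are hypothetical names.

```typescript
// Hypothetical helper: deterministic percentage rollout.
// FNV-1a style 32-bit hash over UTF-16 code units, so the same user
// always lands in the same bucket for a given flag.
function hashToBucket(input: string): number {
  let hash = 0x811c9dc5
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0 // keep it an unsigned 32-bit value
  }
  return hash % 100 // bucket in [0, 99]
}

export function isInRollout(flagName: string, userId: string, percentage: number): boolean {
  if (percentage <= 0) return false
  if (percentage >= 100) return true
  // Including the flag name means different flags pick different 5% groups.
  return hashToBucket(`${flagName}:${userId}`) < percentage
}
```

Because the bucket is stable, raising the percentage from 5 to 25 only adds users; nobody who already had the feature loses it. The percentage itself could live in the same Edge Config item as the allow-list.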

Step 5 — Error monitoring before go-live

The key word is before. Your error monitoring must be live and verified before you ship the code that might error.

// sentry.client.config.ts
import * as Sentry from '@sentry/nextjs'

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  environment: process.env.NODE_ENV,
  tracesSampleRate: process.env.NODE_ENV === 'production' ? 0.1 : 1.0,
  beforeSend(event) {
    if (process.env.NODE_ENV === 'development') return null
    return event
  },
})

Verify it works before deploying: throw a test error manually, confirm it shows up in your Sentry dashboard.

Step 6 — Health check endpoint

// src/app/api/health/route.ts
import { db } from '@/lib/db'
import { sql } from 'drizzle-orm'

export const runtime = 'nodejs'

export async function GET() {
  try {
    await db.execute(sql`SELECT 1`)
    return Response.json(
      { status: 'ok', db: 'connected', ts: Date.now() },
      { headers: { 'Cache-Control': 'no-store' } }
    )
  } catch {
    return Response.json(
      { status: 'error', db: 'disconnected' },
      { status: 503 }
    )
  }
}

This checks the actual database connection, not just that Next.js started. Set up an uptime monitor (BetterStack, UptimeRobot, Checkly) to hit /api/health every 60 seconds.
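One gap worth considering: if the database connection hangs instead of failing, the health check can hang too, and the monitor sees a timeout rather than a clean 503. A small timeout wrapper is one way to bound that. This is a sketch under that assumption; withTimeout is a hypothetical helper, not something the original route uses.

```typescript
// Hypothetical helper: reject if `work` does not settle within `ms` milliseconds.
export function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms)
  })
  // Whichever settles first wins; always clear the timer so it cannot
  // keep the process alive after the work has finished.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer)
  })
}
```

In the route above, the SELECT 1 call could be wrapped as withTimeout(db.execute(...), 2000), where the 2000 ms budget is an arbitrary example value. A timeout then falls into the existing catch branch and returns the 503.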

Step 7 — Rate limiting on auth endpoints

Auth endpoints are the most targeted on any public app. Without rate limiting, a brute-force attack on your login endpoint is trivial.

// src/app/api/auth/login/route.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(5, '15 m'),
  analytics: true,
})

export async function POST(request: Request) {
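  // x-forwarded-for can hold a comma-separated chain; the first entry is the client
  const ip = request.headers.get('x-forwarded-for')?.split(',')[0]?.trim() || 'unknown'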
  const { success, reset } = await ratelimit.limit(`login:${ip}`)

  if (!success) {
    return Response.json(
      { error: 'Too many attempts. Try again later.' },
      {
        status: 429,
        headers: { 'Retry-After': String(Math.ceil((reset - Date.now()) / 1000)) },
      }
    )
  }

  // proceed with auth logic
}
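For intuition about what a sliding window of 5 requests per 15 minutes means, here is a minimal in-memory sketch. It is an illustration only: it is not how @upstash/ratelimit is implemented, and in-memory state does not work across serverless instances, which is why the example above uses Redis. createSlidingWindowLimiter is a hypothetical name.

```typescript
// Hypothetical in-memory limiter: remembers recent hit timestamps per key.
interface LimitResult {
  success: boolean
  reset: number // epoch ms at which the oldest counted hit leaves the window
}

export function createSlidingWindowLimiter(maxRequests: number, windowMs: number) {
  const hits = new Map<string, number[]>()

  return function limit(key: string, now: number = Date.now()): LimitResult {
    const windowStart = now - windowMs
    // Keep only timestamps that are still inside the window.
    const recent = (hits.get(key) ?? []).filter((t) => t > windowStart)
    const allowed = recent.length < maxRequests
    if (allowed) recent.push(now)
    hits.set(key, recent)
    const oldest = recent[0] ?? now
    return { success: allowed, reset: oldest + windowMs }
  }
}
```

With createSlidingWindowLimiter(5, 15 * 60 * 1000), the sixth login attempt from one key inside 15 minutes is rejected, and reset tells the caller when a slot frees up, which is the value the Retry-After header above is derived from.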

Step 8 — Secrets in your env manager, not in code

Three rules:

  1. Never in code: not even encrypted, not even in a comment
  2. Never in git: .env.local is gitignored for a reason
  3. Rotate without a code change: update the secret in Vercel's env dashboard and redeploy, with no commit (on Vercel, changed env vars only apply to new deployments)

// Wrong: rotating this requires a code change and a commit
const stripe = new Stripe('sk_live_abc123')

// Right: rotating means updating the var in Vercel and redeploying, with no code change
import { env } from '@/lib/env'
const stripe = new Stripe(env.STRIPE_SECRET_KEY)

Step 9 — Stripe webhook signature verification

If you don't verify the signature, anyone can POST to your webhook endpoint and trigger fake payment events.

// src/app/api/webhooks/stripe/route.ts
import Stripe from 'stripe'
import { env } from '@/lib/env'

const stripe = new Stripe(env.STRIPE_SECRET_KEY)

export async function POST(request: Request) {
  const body = await request.text() // Must be raw text — JSON.parse() breaks the signature
  const signature = request.headers.get('stripe-signature')!

  let event: Stripe.Event
  try {
    event = stripe.webhooks.constructEvent(body, signature, env.STRIPE_WEBHOOK_SECRET)
  } catch {
    return new Response('Invalid signature', { status: 400 })
  }

  switch (event.type) {
    case 'customer.subscription.updated':
      // handle...
      break
  }

  return new Response(null, { status: 200 })
}
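To see why the raw body matters, it helps to look at what the signature check does conceptually. The sketch below follows Stripe's documented scheme (an HMAC-SHA256 over the timestamp, a dot, and the raw payload, compared against the v1 values in the header). It is a simplified illustration, and verifySignature is a hypothetical name; in real code keep using stripe.webhooks.constructEvent.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Simplified illustration of the signature check; not a replacement for the SDK.
export function verifySignature(
  payload: string,
  header: string, // e.g. "t=1700000000,v1=<hex hmac>"
  secret: string,
  toleranceSec: number = 300,
  nowSec: number = Math.floor(Date.now() / 1000),
): boolean {
  const parts = header.split(',').map((part) => part.split('='))
  const timestamp = parts.find(([key]) => key === 't')?.[1]
  const candidates = parts.filter(([key]) => key === 'v1').map(([, value]) => value ?? '')
  if (!timestamp || candidates.length === 0) return false

  // Reject stale events to limit replay of a captured request.
  const sentAt = Number(timestamp)
  if (!Number.isFinite(sentAt) || Math.abs(nowSec - sentAt) > toleranceSec) return false

  // The signed string is built from the exact bytes received, so any
  // re-serialization of the body changes the HMAC.
  const expected = createHmac('sha256', secret).update(`${timestamp}.${payload}`).digest()
  return candidates.some((candidate) => {
    const given = Buffer.from(candidate, 'hex')
    return given.length === expected.length && timingSafeEqual(given, expected)
  })
}
```

Parsing the body with JSON.parse and serializing it again can change whitespace or key order, so the HMAC no longer matches. That is the reason the route reads request.text().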

Test before every deploy that touches webhook logic:

stripe listen --forward-to localhost:3000/api/webhooks/stripe
stripe trigger customer.subscription.updated

Step 10 — Know your rollback plan before you deploy

Before clicking deploy: if this breaks, what's the first step?

On Vercel:

  1. Dashboard → Deployments
  2. Find the last working deployment
  3. Click "..." → "Promote to Production"

This takes 30 seconds. But you need to know where it is before you're in panic mode at midnight.

Warning: Rolling back code doesn't roll back the database. If your deploy included a migration, rolling back the code leaves the new schema in place. This is why every migration must be backward compatible with the previous version of your code.
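One way to keep that rule from depending on memory is a crude check over the generated migration SQL before it is applied. The sketch below is a deliberately simple heuristic, assuming migration files are plain SQL; findRiskyStatements and the pattern list are hypothetical, and a match means "review this", not "this is definitely unsafe".

```typescript
// Hypothetical heuristic: flag statements the previous release of the code
// is unlikely to survive if you have to roll back.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\bdrop\s+(table|column)\b/i,
  /\brename\s+(to|column)\b/i,
  /\balter\s+column\b.*\bset\s+not\s+null\b/i,
]

export function findRiskyStatements(migrationSql: string): string[] {
  return migrationSql
    .split(';') // naive split; fine for a sketch, wrong for semicolons inside strings
    .map((statement) => statement.trim())
    .filter((statement) => statement.length > 0)
    .filter((statement) => DESTRUCTIVE_PATTERNS.some((pattern) => pattern.test(statement)))
}
```

Additive changes such as ADD COLUMN pass through untouched, while drops and renames are returned for a human to look at. Those are the changes that usually belong in a later deploy, after no running code depends on the old shape.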

Step 11 — Smoke test the critical path

After every deploy, manually run through the one flow that would destroy you if it broke:

  1. Sign up or log in
  2. Do the core action (create a project, submit a form, process a payment)
  3. Verify the outcome (data saved, email sent, webhook fired, UI updated)

This takes 2 minutes. Skip it once and you'll spend 2 hours recovering from the deploy you didn't check.
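The manual walkthrough is the point of this step, but the cheapest part of it can be scripted so it never gets skipped. Here is a sketch of that idea. The paths and expected status codes are assumptions, runSmokeChecks is a hypothetical helper, and the fetcher is injected so the function can be exercised without a network.

```typescript
// Hypothetical smoke check runner: hits a few URLs and reports mismatches.
type Fetcher = (url: string) => Promise<{ status: number }>

interface SmokeCheck {
  path: string
  expectStatus: number
}

export async function runSmokeChecks(
  baseUrl: string,
  checks: SmokeCheck[],
  fetcher: Fetcher,
): Promise<string[]> {
  const failures: string[] = []
  for (const check of checks) {
    try {
      const res = await fetcher(new URL(check.path, baseUrl).toString())
      if (res.status !== check.expectStatus) {
        failures.push(`${check.path}: expected ${check.expectStatus}, got ${res.status}`)
      }
    } catch (err) {
      // Network errors count as failures instead of aborting the whole run.
      failures.push(`${check.path}: request failed (${String(err)})`)
    }
  }
  return failures
}
```

Against a real deployment the fetcher can simply be (url) => fetch(url), with a check such as { path: '/api/health', expectStatus: 200 }. An empty result only says the endpoints answered; it does not replace logging in and doing the core action yourself.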

Step 12 — Alert channel that humans actually see

"Errors go to Sentry" is not an alert strategy if nobody checks Sentry.

Sentry error         → Slack #alerts (immediate)
503 health check     → PagerDuty or email (immediate)
Stripe webhook fail  → Slack #payments (immediate)
Daily summary        → Slack #ops (every morning)
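The routing above can also live in code as a typed map, so that adding a new alert kind without deciding where it goes is a compile error. This is a sketch only: the kind names and the routeAlert helper are hypothetical, and the channel strings simply mirror the list above.

```typescript
// Hypothetical typed routing table mirroring the list above.
type AlertKind = 'sentry_error' | 'health_check_503' | 'stripe_webhook_failure' | 'daily_summary'

interface AlertRoute {
  channel: string
  urgency: 'immediate' | 'daily'
}

// Record<AlertKind, ...> forces every kind to have a destination.
const ALERT_ROUTES: Record<AlertKind, AlertRoute> = {
  sentry_error: { channel: 'Slack #alerts', urgency: 'immediate' },
  health_check_503: { channel: 'PagerDuty or email', urgency: 'immediate' },
  stripe_webhook_failure: { channel: 'Slack #payments', urgency: 'immediate' },
  daily_summary: { channel: 'Slack #ops', urgency: 'daily' },
}

export function routeAlert(kind: AlertKind): AlertRoute {
  return ALERT_ROUTES[kind]
}
```

The actual delivery (Slack webhook, PagerDuty integration) is left out on purpose; the table is the part that tends to rot when it only exists in someone's head.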

The full deploy sequence

1.  Merge PR to main
2.  CI: lint → typecheck → build (validates env vars)
3.  CI: database migrations
4.  Vercel auto-deploys
5.  Smoke test critical path (2 minutes)
6.  Check Sentry for new errors (first 10 minutes)
7.  Feature flag ON → 5% of users
8.  Monitor 30 minutes
9.  Roll out to 100% — or rollback

Automate the checklist

# .github/workflows/deploy.yml
name: Deploy

on:
  push:
    branches: [main]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
      - run: npm ci
      - run: npm run typecheck
      - run: npm run lint
      - run: npm run build

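  migrate:
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Use the same Node version as the check job
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
      - run: npm ci
      - run: npx drizzle-kit migrate
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}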

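migrate runs only after check passes. One caveat: with Vercel's Git integration, the Vercel build starts on the same push and runs in parallel with this workflow, so the workflow by itself does not guarantee that migrations finish before the new code goes live. To enforce the order, turn off automatic production deploys and trigger the deploy from a job that depends on migrate (for example with a Vercel deploy hook or the Vercel CLI), or run the migration as part of the build command.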

What juniors skip (and why it hurts)

| Skip | Consequence |
|------|-------------|
| Env validation | undefined is read silently, then crashes at runtime |
| Migration order | New code breaks on old schema during deploy window |
| Feature flags | Real users are your QA team |
| Health check | Outages discovered by users, not monitors |
| Rate limiting on auth | Login brute-forced while you sleep |
| Stripe signature | Anyone can fire fake payment events |
| Rollback plan | Panic decisions under pressure |
| Smoke test | Broken flow discovered by your best customer |

This checklist is 5 minutes before a deploy that saves 5 hours after one. Seniors run it on every push — even the "it's just a typo fix" ones. Especially those.

Full guide with t3-env setup, Drizzle migrations, and GitHub Actions workflow:
https://stacknotice.com/blog/senior-dev-production-checklist-2026
