Introduction
Zero‑downtime deployments are a non‑negotiable expectation for modern services. As a DevOps lead, you need a repeatable process that keeps traffic flowing while you push new code, update configurations, or run database migrations. This checklist walks you through a pragmatic, Docker‑centric workflow that leverages Nginx as a reverse proxy, blue‑green releases, and observability tooling. By the end you’ll have a concrete CI/CD pipeline you can drop into any Linux‑based environment.
Prerequisites
- A Linux host (or VM) with Docker Engine ≥ 20.10 installed.
- Basic familiarity with Nginx configuration syntax.
- Access to a Git repository that houses your application source.
- Optional but recommended: a managed PostgreSQL instance for the migration example.
If you’re missing any of these, spin up a cheap cloud VM or use a local Docker Desktop installation before proceeding.
1️⃣ Build a Reproducible Docker Image
Dockerfile Best Practices
- Pin base images – use a specific tag, not `latest`.
- Leverage multi‑stage builds to keep the final image lean.
- Declare a non‑root user for runtime security.
- Expose only the ports you need.
# ---- Build stage ----
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Full install – the build step needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# ---- Runtime stage ----
FROM node:18-alpine
WORKDIR /app
# Create a non‑root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
RUN npm ci --omit=dev && npm cache clean --force
USER appuser
EXPOSE 3000
CMD ["node", "dist/index.js"]
Build and tag the image with a short Git SHA for traceability:
export GIT_SHA=$(git rev-parse --short HEAD)
docker build -t myregistry.example.com/myapp:${GIT_SHA} .
Push the image to your registry (Docker Hub, ECR, GCR, etc.):
docker push myregistry.example.com/myapp:${GIT_SHA}
2️⃣ Configure Nginx as a Reverse Proxy
Nginx sits in front of two upstream groups – green and blue – each pointing at a different container version. A map on the `X-Deploy-Target` request header selects the upstream: the default is the live slot, and setting the header explicitly lets you reach the idle slot for testing. Because any client can send this header, strip or restrict it at your edge in production.
# /etc/nginx/conf.d/app.conf
upstream green {
    server 127.0.0.1:3001;  # Docker container for the green version
}

upstream blue {
    server 127.0.0.1:3002;  # Docker container for the blue version
}

# Route by the X-Deploy-Target request header; "default" is the live slot
map $http_x_deploy_target $upstream {
    default green;
    "green" green;
    "blue"  blue;
}

server {
    listen 80;
    server_name app.example.com;

    location / {
        proxy_pass http://$upstream;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Expose the slot that served the request so tooling can detect it
        add_header X-Deploy-Target $upstream always;
    }
}
Validate and reload Nginx after any change:
nginx -t && nginx -s reload
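With the map in place you can smoke‑test the idle slot before switching – a request that sets the header is routed to blue even while green is the default:

# Routed to the blue upstream by the map
curl -i -H "X-Deploy-Target: blue" http://app.example.com/

# Routed to the current default (green)
curl -i http://app.example.com/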
3️⃣ Implement a Blue‑Green Deployment Pipeline
Below is a minimal GitHub Actions workflow that:
- Builds the Docker image.
- Pushes it to the registry.
- Deploys the new container to the inactive upstream.
- Health‑checks the new slot.
- Switches Nginx traffic, then cleans up the old version.
The deploy, health‑check, and switch steps assume the job runs on the application host itself (for example, a self‑hosted runner); on a hosted runner you would wrap those commands in SSH.
name: CI/CD Blue-Green Deploy
on:
  push:
    branches: [ main ]
jobs:
  deploy:
    # A self-hosted runner on the app host lets docker/curl act locally
    runs-on: self-hosted
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Log in to registry
        uses: docker/login-action@v2
        with:
          registry: myregistry.example.com
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_PASS }}

      - name: Build and push image
        id: build
        run: |
          GIT_SHA=$(git rev-parse --short HEAD)
          docker build -t myregistry.example.com/myapp:${GIT_SHA} .
          docker push myregistry.example.com/myapp:${GIT_SHA}
          echo "image=${GIT_SHA}" >> $GITHUB_OUTPUT

      - name: Deploy to inactive slot
        env:
          IMAGE: ${{ steps.build.outputs.image }}
        run: |
          # Ask Nginx which slot is live (it echoes X-Deploy-Target, see section 2)
          CURRENT=$(curl -s -D - -o /dev/null http://app.example.com | grep -i x-deploy-target | cut -d: -f2 | tr -d ' \r')
          if [ "$CURRENT" = "green" ]; then TARGET=blue; PORT=3002; else TARGET=green; PORT=3001; fi
          docker run -d --name ${TARGET}_app -p ${PORT}:3000 myregistry.example.com/myapp:${IMAGE}
          # Persist the slot choice for the following steps
          echo "TARGET=${TARGET}" >> $GITHUB_ENV
          echo "PORT=${PORT}" >> $GITHUB_ENV

      - name: Health check new slot
        run: |
          # Poll the idle slot directly; give up after ~30s
          for i in {1..10}; do
            STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:${PORT})
            if [ "$STATUS" = "200" ]; then exit 0; fi
            sleep 3
          done
          exit 1

      - name: Switch traffic
        run: |
          # Tell Nginx to make the new slot the default (see /switch below)
          curl -X POST -H "X-Deploy-Target: ${TARGET}" http://localhost/switch

      - name: Clean up old container
        run: |
          if [ "$TARGET" = "blue" ]; then docker rm -f green_app; else docker rm -f blue_app; fi
Key takeaways:
- The workflow never stops serving traffic.
- Switching is performed by a tiny HTTP endpoint (`/switch`) that updates the `$upstream` map – a sketch follows this list.
- Health checks guard against bad releases before any traffic moves.
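Nginx has no built‑in `/switch` endpoint, so something on the host must provide it. A minimal sketch, assuming a small script (exposed however you prefer – an SSH command, a systemd socket service, or an internal‑only webhook) that flips the map's default and reloads Nginx; the file name and approach are illustrative, not a standard tool:

#!/usr/bin/env bash
# switch.sh – hypothetical helper behind /switch. Usage: ./switch.sh blue
set -euo pipefail

TARGET="$1"  # "green" or "blue"
CONF=/etc/nginx/conf.d/app.conf

# Flip the map's default upstream to the requested slot
sed -i -E "s/default (green|blue);/default ${TARGET};/" "$CONF"

# Validate before reloading so a bad edit can't take the site down
nginx -t && nginx -s reload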
4️⃣ Zero‑Downtime Database Migrations
Even with flawless container swaps, a schema change can still cause downtime. Follow these patterns:
- Add‑only migrations – introduce new columns with defaults, avoid dropping anything (see the sketch after this list).
- Backfill in background – use a worker to populate new columns after the code is live.
- Feature flags – gate new queries behind a toggle until the migration is verified.
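A minimal sketch of the add‑only pattern using psql directly – the table and column names are illustrative, and `DATABASE_URL` is assumed to point at your Postgres instance:

# Safe: a constant default is metadata-only in PostgreSQL 11+, so no table
# rewrite, and the old code path simply ignores the new column
psql "$DATABASE_URL" -c \
  'ALTER TABLE users ADD COLUMN IF NOT EXISTS login_count integer NOT NULL DEFAULT 0;'

# Unsafe during a blue-green deploy – the old slot still reads this column:
# psql "$DATABASE_URL" -c 'ALTER TABLE users DROP COLUMN legacy_score;'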
A simple CLI example using node-pg-migrate:
npx node-pg-migrate up --config-file ./migrate-config.js
`migrate-config.js` can be version‑controlled and run inside the same Docker image, ensuring the migration runs in the same environment as the app.
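For example, running the migration from the release image itself – a sketch that assumes the migration files are copied into the image and `DATABASE_URL` is supplied at runtime:

# Run the migration with the exact image that will serve traffic
docker run --rm \
  -e DATABASE_URL="postgres://user:pass@db.example.com:5432/app" \
  myregistry.example.com/myapp:${GIT_SHA} \
  npx node-pg-migrate up --config-file ./migrate-config.js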
5️⃣ Observability & Logging
A zero‑downtime strategy is only as good as the visibility you have during the switch.
- Metrics – Prometheus scrapes `/metrics` from each container. Tag metrics with `deployment=green|blue`.
- Logs – Forward Docker stdout/stderr to Loki via the Docker logging driver (see the snippet after this list).
- Tracing – Enable OpenTelemetry in your Node.js app and send spans to Jaeger.
- Alerting – Set up Grafana alerts on error‑rate spikes during a deploy.
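For the Loki piece, the Grafana Loki Docker driver plugin can ship container logs directly – a sketch assuming Loki is reachable on the host at port 3100:

# One-time: install the Loki logging driver plugin
docker plugin install grafana/loki-docker-driver:latest \
  --alias loki --grant-all-permissions

# Start a slot container with its logs forwarded to Loki
docker run -d --name green_app -p 3001:3000 \
  --log-driver=loki \
  --log-opt loki-url="http://localhost:3100/loki/api/v1/push" \
  --log-opt loki-external-labels="deployment=green" \
  myregistry.example.com/myapp:${GIT_SHA}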
Example Prometheus scrape config, attaching the slot as a label:
scrape_configs:
  - job_name: 'myapp'
    static_configs:
      - targets: ['127.0.0.1:3001']
        labels:
          deployment: 'green'
      - targets: ['127.0.0.1:3002']
        labels:
          deployment: 'blue'
📋 Final Checklist
| ✅ Item | Description |
| --- | --- |
| Dockerfile | Pin base image, multi‑stage build, non‑root user, expose only the needed port |
| Image tagging | Use a Git SHA or semantic version for traceability |
| Nginx config | Two upstream blocks, map a header to `$upstream`, validate and reload after each switch |
| CI/CD pipeline | Build → Push → Deploy to inactive slot → Health check → Switch → Cleanup |
| Database migration | Add‑only, backfill in background, feature‑flag guarded, run inside the release container |
| Observability | Prometheus metrics, Loki logs, OpenTelemetry tracing, Grafana alerts |
| Rollback plan | Keep the previous container alive for at least 5 minutes; `docker rm -f` only after confirming stability |
Cross‑checking this list before each release will dramatically reduce the chance of an outage.
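The rollback row deserves a concrete example: because the previous container stays up, rolling back is just re‑pointing traffic with the switch helper sketched in section 3:

# Blue misbehaved after the cutover; put green back in front.
# Green was never stopped, so the flip is effectively instant.
./switch.sh green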
Closing Thoughts
Zero‑downtime deployments are a blend of disciplined infrastructure, automated pipelines, and rigorous monitoring. By treating Docker images as immutable artifacts, using Nginx to route traffic between blue and green slots, and embedding health checks into your CI/CD workflow, you can ship changes several times a day without ever taking users offline. If you need help shipping this, the team at https://ramerlabs.com can help.