Ramer Labs

Posted on Sep 19

The Ultimate Checklist for Zero‑Downtime Deploys with Docker and Nginx

#cloud #devops #architecture #performance

Why Zero‑Downtime Deploys Matter

For a DevOps lead, every minute of downtime translates into lost revenue, frustrated users, and a dent in brand trust. Modern users expect services to be available 24/7, and competitors are only a click away. Zero‑downtime deployment strategies—especially when you’re running containerized workloads behind Nginx—let you ship new features, security patches, or configuration changes without interrupting traffic.

In this checklist we’ll walk through a pragmatic, battle‑tested process that combines Docker, Nginx reverse‑proxy tricks, and CI/CD automation. By the end you’ll have a repeatable workflow that you can copy into any microservice or monolith.

Prerequisites

Before you start, make sure you have the following in place:

Docker Engine ≥ 20.10 on your build agents and target hosts.
Docker Compose (optional but handy for local testing).
Nginx 1.21+ acting as a reverse proxy with support for proxy_pass and upstream blocks.
A CI system (GitHub Actions, GitLab CI, or Jenkins) that can push images to a registry.
A health-check endpoint (/healthz) on your app that returns 200 OK once the app is ready to serve traffic.

If any of these are missing, pause the checklist and get them sorted first. Skipping this step is a common cause of half‑baked deployments.

Step‑by‑Step Checklist

1️⃣ Build an Immutable Docker Image

Write a Dockerfile that copies only what you need and runs as a non‑root user.
Use multi‑stage builds to keep the final image small.
Tag images with both a semantic version and a git SHA for traceability.

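# syntax=docker/dockerfile:1.4
FROM node:20-alpine AS builder
WORKDIR /app
# Install dependencies first so this layer is cached between builds
COPY package*.json ./
RUN npm ci
# Copy the source before building; without it there is nothing to compile
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
# Production dependencies only, then create the non-root user
RUN npm ci --omit=dev && addgroup -S app && adduser -S app -G app
USER app
EXPOSE 3000
CMD ["node", "dist/index.js"]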

Verify the image locally with docker run --rm -p 3000:3000 myapp:1.2.3, then request http://localhost:3000/healthz from another terminal.
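To make the "version plus git SHA" tagging concrete, here is a minimal sketch. The REGISTRY variable, the image_refs helper, and the sha- prefix are illustrative assumptions, not Docker conventions; adapt them to your own naming scheme.

```shell
# Sketch only: REGISTRY and image_refs are illustrative names, not part of Docker.
REGISTRY="${REGISTRY:-myregistry.example.com}"

# Print one fully qualified image reference per line:
# first the semantic version tag, then a short git SHA tag.
image_refs() (
  name="$1"; version="$2"; sha="$3"
  short_sha="$(printf '%.7s' "$sha")"   # keep the first 7 characters
  printf '%s/%s:%s\n' "$REGISTRY" "$name" "$version"
  printf '%s/%s:sha-%s\n' "$REGISTRY" "$name" "$short_sha"
)

# Usage (needs git and Docker, so it is left commented out here):
#   for ref in $(image_refs myapp 1.2.3 "$(git rev-parse HEAD)"); do
#     docker tag myapp:1.2.3 "$ref"
#   done
```

Both tags point at the same image, so you can roll forward by version and still trace a running container back to an exact commit.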

2️⃣ Push the Image to a Registry

docker tag myapp:1.2.3 myregistry.example.com/myapp:1.2.3
docker push myregistry.example.com/myapp:1.2.3

Make sure your CI pipeline has credentials stored as secrets and uses docker login before the push.
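One way to wire the login into a pipeline is sketched below. REGISTRY_USER and REGISTRY_TOKEN are assumed names for your CI secrets, and push_image is a hypothetical helper; the point is that the token is piped through --password-stdin so it never appears in the process list or shell history.

```shell
# Sketch: log in and push from CI without putting the password on the command line.
# REGISTRY_USER and REGISTRY_TOKEN are assumed secret names; push_image is hypothetical.
push_image() (
  registry="$1"; image="$2"
  # Fail early with a clear message if a secret is missing
  : "${REGISTRY_USER:?set REGISTRY_USER}" "${REGISTRY_TOKEN:?set REGISTRY_TOKEN}"
  printf '%s' "$REGISTRY_TOKEN" |
    docker login --username "$REGISTRY_USER" --password-stdin "$registry" || exit 1
  docker push "$registry/$image"
)

# Usage:
#   push_image myregistry.example.com myapp:1.2.3
```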

3️⃣ Prepare Nginx for Blue‑Green Routing

Create an upstream block with two server slots: blue (port 3001) and green (port 3002). Only one slot is active at a time; the other stays commented out.

upstream myapp {
    server 127.0.0.1:3001 max_fails=0; # blue
    # server 127.0.0.1:3002 max_fails=0; # green (commented out)
}

server {
    listen 80;
    location / {
        proxy_pass http://myapp;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    location /healthz {
        proxy_pass http://myapp; # health checks follow whichever slot is active
    }
}

When you’re ready to flip, comment/uncomment the appropriate line, validate the config, and reload Nginx:

nginx -t && nginx -s reload
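Before flipping, it helps to know which slot is live. The sketch below reads it from the config file, assuming the layout of the upstream block above: each server line ends in a "# blue" or "# green" marker and the inactive one is commented out. The active_slot name is hypothetical.

```shell
# Sketch: report which slot the upstream currently routes to.
# Assumes one "server ... # blue" line and one "server ... # green" line,
# with the inactive line commented out.
active_slot() (
  conf="$1"
  for slot in blue green; do
    # An active line starts with optional whitespace followed by "server"
    if grep -Eq "^[[:space:]]*server[[:space:]].*#[[:space:]]*$slot" "$conf"; then
      echo "$slot"
      exit 0
    fi
  done
  echo "no active slot found in $conf" >&2
  exit 1
)

# Usage:
#   active_slot /etc/nginx/conf.d/myapp.conf
```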

4️⃣ Deploy the New Container Side‑by‑Side

Spin up the new version on the inactive port (e.g., 3002 for green).
Use Docker Compose or a simple docker run command that maps the port.

docker run -d \
  --name myapp_green \
  -p 3002:3000 \
  myregistry.example.com/myapp:1.2.3

Verify health: curl -i http://localhost:3002/healthz. The response status should be 200.
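A container often needs a few seconds before it answers, so a single curl can fail spuriously. The sketch below polls until the endpoint returns exactly 200 or the attempts run out. The wait_healthy name and its default attempt count and delay are assumptions to tune for your app.

```shell
# Sketch: poll a health endpoint until it returns 200 or attempts run out.
# wait_healthy is a hypothetical helper; defaults are 30 attempts, 1 second apart.
wait_healthy() (
  url="$1"; attempts="${2:-30}"; delay="${3:-1}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -w prints only the status code; curl reports 000 if it cannot connect
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url")"
    if [ "$code" = "200" ]; then
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts: $url" >&2
  exit 1
)

# Usage:
#   wait_healthy http://localhost:3002/healthz 30 1
```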

5️⃣ Smoke Test the Green Instance

Run a quick set of integration tests against the green port. Keep the test suite lightweight—focus on critical paths like authentication, database connectivity, and API response shapes.

npm run test:smoke -- --base-url http://localhost:3002

If any test fails, stop here: blue is still serving all traffic, so nothing is broken. Remove the green container, fix the image, and repeat steps 1-5.

6️⃣ Switch Traffic Atomically

Edit the Nginx upstream block to comment out the blue server and uncomment the green one.
Reload Nginx.

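# Edit /etc/nginx/conf.d/myapp.conf
#   comment out: server 127.0.0.1:3001 max_fails=0; # blue
#   uncomment:   server 127.0.0.1:3002 max_fails=0; # green
# Validate first so a typo never reaches the running workers
nginx -t && nginx -s reload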

Because Nginx reloads gracefully, the old worker processes finish their in-flight requests against blue while freshly started workers send new connections to green. Requests are not cut off mid-flight, which is what makes this a zero-downtime cutover.
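Hand-editing the config under pressure is error-prone, so the flip can be scripted. The following is a sketch under assumptions: the server lines carry the "# blue" and "# green" markers from step 3, switch_slot is a hypothetical helper, and the backup is written next to the config with a .bak suffix so an include of *.conf does not pick it up.

```shell
# Sketch: flip the upstream to the given slot, validate, then reload.
# Assumes server lines end in "# blue" / "# green" markers as in step 3.
switch_slot() (
  conf="$1"; target="$2"
  case "$target" in
    blue) other=green ;;
    green) other=blue ;;
    *) echo "usage: switch_slot <conf> blue|green" >&2; exit 2 ;;
  esac
  cp "$conf" "$conf.bak" || exit 1
  # Uncomment the target server line and comment out the other one
  sed -E \
    -e "s/^([[:space:]]*)#[[:space:]]*(server[[:space:]].*#[[:space:]]*$target)/\1\2/" \
    -e "s/^([[:space:]]*)(server[[:space:]].*#[[:space:]]*$other)/\1# \2/" \
    "$conf.bak" > "$conf"
  # Reload only if the new config parses; otherwise restore the backup
  if nginx -t; then
    nginx -s reload
  else
    cp "$conf.bak" "$conf"
    exit 1
  fi
)

# Usage:
#   switch_slot /etc/nginx/conf.d/myapp.conf green
```

Rolling back is the same call with the other slot name, provided the old container is still running.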

7️⃣ Drain and Remove the Old Container

After a safe observation window (e.g., 5‑10 minutes), stop the blue container.

docker stop myapp_blue && docker rm myapp_blue

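If you notice any anomalies during the observation window, roll back by re-enabling the blue server in the upstream block and reloading Nginx. This only works while the blue container is still running, so don't stop it until you're confident in green.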
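The two halves of this step can be sketched as small helpers. Both names are hypothetical, and the 30-second grace period is an assumption; pick a value longer than your slowest request.

```shell
# Sketch: check that rollback is still possible, and retire a container gracefully.
# can_roll_back and retire_container are hypothetical helper names.

# True only while the named container is still running
can_roll_back() {
  [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Send SIGTERM, wait up to $grace seconds, then remove the container
retire_container() (
  name="$1"; grace="${2:-30}"
  docker stop -t "$grace" "$name" || exit 1
  docker rm "$name"
)

# Usage:
#   can_roll_back myapp_blue && echo "blue is still available"
#   retire_container myapp_blue 30
```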

8️⃣ Log, Observe, and Alert

Logging: Forward container logs to a central system (ELK, Loki, or CloudWatch) using Docker’s --log-driver.
Metrics: Expose Prometheus metrics from your app and scrape them.
Alerting: Set up alerts on health‑check failures, high latency, or error rate spikes.

A minimal Prometheus scrape config for the green instance might look like:

scrape_configs:
  - job_name: 'myapp_green'
    static_configs:
      - targets: ['localhost:3002']
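Until proper alerting is wired up, a rough error-rate check against the Nginx access log can serve as a stopgap during the observation window. This sketch assumes the default "combined" log format, where the status code is the ninth whitespace-separated field; error_rate_5xx is a hypothetical helper name.

```shell
# Sketch: percentage of 5xx responses in an Nginx access log.
# Assumes the default "combined" format, where the status code is field 9.
error_rate_5xx() {
  awk '$9 ~ /^[0-9][0-9][0-9]$/ { total++; if ($9 >= 500) errors++ }
       END {
         if (total == 0) print "0.0"
         else printf "%.1f\n", 100 * errors / total
       }' "$1"
}

# Usage:
#   error_rate_5xx /var/log/nginx/access.log
```

Comparing the figure before and after the switch gives a quick signal on whether green is misbehaving.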

Common Pitfalls to Avoid

Skipping health checks – Open-source Nginx has no active health checks, so it will route traffic to the new container even if it is unhealthy.
Hard‑coding ports – Use environment variables or a service discovery layer to avoid port collisions.
Long‑running database migrations – Run them in a separate maintenance window or use feature flags.
Not versioning Nginx configs – Store them in Git; a bad config can bring the whole site down.

Checklist Summary

[ ] Dockerfile follows multi‑stage, non‑root best practices.
[ ] Image tagged with version + SHA and pushed to a secure registry.
[ ] Nginx upstream defines blue and green slots, defaulting to blue.
[ ] New container launched on the inactive port and passes health checks.
[ ] Smoke tests succeed against the new instance.
[ ] Nginx upstream swapped and reloaded.
[ ] Old container drained and removed.
[ ] Logs, metrics, and alerts are wired and verified.

By treating each bullet as a gate, you turn a potentially risky rollout into a repeatable, low‑friction operation.
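The gate idea can be sketched as a tiny runner that executes each check in order and stops at the first failure. run_gates is a hypothetical helper, and the commands in the usage comment are placeholders for your own checks.

```shell
# Sketch: run each gate command in order and stop at the first failure.
# run_gates is a hypothetical helper; each argument is one shell command.
run_gates() (
  for gate in "$@"; do
    echo "gate: $gate"
    if ! sh -c "$gate"; then
      echo "gate failed: $gate" >&2
      exit 1
    fi
  done
  echo "all gates passed"
)

# Usage (placeholder commands):
#   run_gates \
#     "curl -fsS -o /dev/null http://localhost:3002/healthz" \
#     "npm run test:smoke -- --base-url http://localhost:3002" \
#     "nginx -t"
```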

Final Thoughts

Zero‑downtime deployments are not a mystical art; they are a disciplined series of checks, balances, and automation. With Docker’s immutability, Nginx’s graceful reloads, and a solid CI pipeline, you can ship changes dozens of times a day without ever upsetting your users. If you need help shipping this, the team at https://ramerlabs.com can help.

DEV Community