Ramer Labs
The Ultimate Checklist for Zero‑Downtime Deploys with Docker & Nginx

Introduction

When a service goes down for a deploy, users feel the impact instantly—slow pages, broken flows, or outright errors. For modern SaaS products, even a few seconds of unavailability can erode trust and hurt revenue. This checklist walks a DevOps lead through a practical, Docker‑centric workflow for zero‑downtime deployments behind an Nginx reverse proxy. All steps are actionable, version‑controlled, and observable.


Why Zero‑Downtime Matters

  • User experience: Seamless upgrades keep session state alive.
  • Business continuity: SLAs often demand 99.9%+ uptime.
  • Rollback safety: If something goes wrong, you can instantly revert without affecting live traffic.

Achieving this requires a combination of container orchestration, smart proxying, and robust health‑checking. The checklist below assumes you already run Docker Engine (or Docker Desktop) on a Linux host and use Nginx as a front‑end load balancer.


Prerequisites

Docker Engine

  • Docker ≥ 20.10 installed.
  • docker compose (v2) available on the PATH.
  • A private registry (Docker Hub, GitHub Packages, or self‑hosted) for versioned images.

Nginx Reverse Proxy

  • Nginx ≥ 1.21 with the ngx_http_ssl_module and ngx_http_v2_module enabled (standard in most distro packages).
  • TLS certificates in place (Let’s Encrypt or internal PKI).
  • Access to edit /etc/nginx/conf.d/ and reload the service without dropping connections.

The Checklist

  1. Versioned Docker images: guarantees repeatable rollouts; you can pin a specific SHA.
  2. Immutable infrastructure: containers never mutate at runtime; configuration lives in code.
  3. Blue‑green service definitions: two parallel stacks (blue & green) let you switch traffic atomically.
  4. Health‑check endpoints: Nginx only routes to containers that report healthy.
  5. Graceful shutdown hooks: allows in‑flight requests to finish before containers stop.
  6. Zero‑downtime Nginx reload: nginx -s reload swaps configs without closing sockets.
  7. Observability stack: metrics & logs confirm the new version is healthy before cut‑over.
  8. Rollback plan: a quick, scripted revert if health checks fail after the traffic switch.

Below each item is a concrete action you can copy‑paste into your repo.


1. Build and Tag a Release‑Ready Image

# Build the app image and tag it with the git SHA
GIT_SHA=$(git rev-parse --short HEAD)
docker build -t myorg/api:$GIT_SHA .
# Push to registry for later pull by the green stack
docker push myorg/api:$GIT_SHA

Tip: Automate this in your CI pipeline and store the SHA as an environment variable for the next steps.


2. Define Blue & Green Stacks in Docker Compose

version: "3.9"
services:
  api-blue:
    image: myorg/api:${BLUE_SHA}
    restart: always
    ports: [ "8001:80" ]
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost/health" ]
      interval: 10s
      timeout: 2s
      retries: 3
  api-green:
    image: myorg/api:${GREEN_SHA}
    restart: always
    ports: [ "8002:80" ]
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost/health" ]
      interval: 10s
      timeout: 2s
      retries: 3

Only one stack is active at a time. The other stays idle (or runs a previous version) ready for a quick switch.
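Deploy scripts stay simpler if they never hard-code which colour is live. A tiny helper sketch, assuming you track the active colour yourself in a CI variable or state file (the `ACTIVE` variable below is hypothetical):

```shell
# next_color: given the colour currently serving traffic, print the idle
# colour to deploy to. Any other input is rejected with a non-zero status.
next_color() {
  case "$1" in
    blue)  echo green ;;
    green) echo blue ;;
    *)     echo "unknown colour: $1" >&2; return 1 ;;
  esac
}

# Example: docker compose up -d "api-$(next_color "$ACTIVE")"
```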


3. Wire Nginx to Both Stacks

Create /etc/nginx/conf.d/api_upstream.conf:

upstream api_backend {
    # Blue stack – default traffic
    server 127.0.0.1:8001 max_fails=3 fail_timeout=30s;
    # Green stack – initially marked as backup
    server 127.0.0.1:8002 backup;
}

server {
    listen 443 ssl http2;
    server_name api.example.com;
    ssl_certificate /etc/ssl/certs/example.crt;
    ssl_certificate_key /etc/ssl/private/example.key;

    location / {
        proxy_pass http://api_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
  • The backup flag tells Nginx to use the green server only when the blue server is marked unhealthy.
  • After a successful green deployment you will edit the file to make green primary and reload Nginx.

4. Deploy the Green Stack

# Export the new SHA (set by CI)
export GREEN_SHA=abcdef1
# Bring up the green stack without touching the blue one
docker compose -f docker-compose.yml up -d api-green

Verify health:

docker exec $(docker ps -q -f name=api-green) curl -sf http://localhost/health

If the endpoint returns 200 OK, the green containers are ready.
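In CI it is safer to poll rather than check once, since containers may need a few seconds to warm up. A minimal retry sketch (the container name and endpoint are the ones assumed above):

```shell
# wait_healthy: run a health-check command repeatedly until it succeeds
# or the attempts run out. Returns 0 on success, 1 on timeout.
# Usage: wait_healthy <attempts> <sleep_seconds> <command...>
wait_healthy() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0          # check passed
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1              # never became healthy
}

# Example:
# wait_healthy 6 5 docker exec "$(docker ps -q -f name=api-green)" \
#   curl -sf http://localhost/health
```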


5. Switch Traffic Atomically

  1. Edit the upstream config – remove backup from the green server and add it to blue.
  2. Test the new config, then reload Nginx without dropping connections:
sudo nginx -t && sudo nginx -s reload

Nginx now routes new connections to the green stack while existing sessions on blue finish gracefully.
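The flag swap itself can be scripted so the cut-over is a single reviewed command. A minimal sketch, assuming the upstream file and ports from step 3 and GNU sed on a Linux host:

```shell
# swap_upstream: make the green server (8002) primary and demote the
# blue server (8001) to backup in the given Nginx upstream config file.
swap_upstream() {
  conf=$1
  sed -i \
    -e 's/server 127.0.0.1:8001.*;/server 127.0.0.1:8001 backup;/' \
    -e 's/server 127.0.0.1:8002.*;/server 127.0.0.1:8002 max_fails=3 fail_timeout=30s;/' \
    "$conf"
}

# Usage (always validate before reloading, so a typo never takes the proxy down):
#   swap_upstream /etc/nginx/conf.d/api_upstream.conf
#   sudo nginx -t && sudo nginx -s reload
```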


6. Graceful Shutdown of the Blue Stack

# Send SIGTERM and give the containers up to 30 seconds to drain
docker compose -f docker-compose.yml stop -t 30 api-blue
# Remove the stopped containers
docker compose -f docker-compose.yml rm -f api-blue

The -t 30 flag gives in‑flight requests a 30‑second window to complete after SIGTERM; containers still running after that are force‑killed, so the application itself must handle SIGTERM gracefully.
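The grace period only helps if the server process actually receives SIGTERM; a shell entrypoint that does not forward signals wastes the whole window. A minimal PID‑1 wrapper sketch (the server command is a placeholder):

```shell
# run_draining: start the real server as a child, forward SIGTERM/SIGINT
# to it, and exit with the child's status once it has drained.
run_draining() {
  "$@" &
  child=$!
  trap 'kill -TERM "$child" 2>/dev/null' TERM INT
  wait "$child"
}

# In an entrypoint script:  run_draining ./api-server --port 80
```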


7. Verify Observability

  • Metrics: Hook Prometheus to scrape /metrics from both stacks. Compare latency and error rates before and after the switch.
  • Logs: Ship container logs to Loki or Elastic via the Docker logging driver. Search for ERROR or panic after the cut‑over.
  • Alerting: Configure an alert that fires if the green stack health check fails for more than two consecutive intervals.
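The alerting bullet can be expressed as a Prometheus rule. A sketch, assuming a scrape job named `api-green` exists in your Prometheus config (the job name, rule name, and labels are all assumptions):

```yaml
# Sketch of a Prometheus alerting rule for the post-cut-over window.
groups:
  - name: deploy
    rules:
      - alert: GreenStackUnhealthy
        # `up` is 0 when Prometheus cannot scrape the target; with a 10s
        # scrape interval, `for: 20s` ≈ two consecutive failed intervals.
        expr: up{job="api-green"} == 0
        for: 20s
        labels:
          severity: page
        annotations:
          summary: "Green stack failed health checks after cut-over"
```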

8. Rollback (If Needed)

If the green stack shows anomalies, revert with a short, scriptable sequence:

# Make blue primary again and demote green to backup
sudo sed -i 's/server 127.0.0.1:8001.*;/server 127.0.0.1:8001 max_fails=3 fail_timeout=30s;/' /etc/nginx/conf.d/api_upstream.conf
sudo sed -i 's/server 127.0.0.1:8002.*;/server 127.0.0.1:8002 backup;/' /etc/nginx/conf.d/api_upstream.conf
sudo nginx -s reload

The traffic instantly flows back to the known‑good version, and you can investigate the green release offline.


Wrap‑Up

Zero‑downtime deployments are not a magic button; they are a disciplined set of practices that you can codify and version. By keeping Docker images immutable, using a blue‑green pattern, and letting Nginx handle graceful traffic switches, you eliminate the most common sources of downtime. Pair this workflow with health checks, observability, and a clear rollback path, and you’ll meet even the strictest SLA requirements.

If you need help shipping this, the team at https://ramerlabs.com can help.
