Introduction
When a service goes down for a deploy, users feel the impact instantly: slow pages, broken flows, or outright errors. For modern SaaS products, even a few seconds of unavailability can erode trust and hurt revenue. This checklist walks a DevOps lead through a practical, Docker‑centric workflow for zero‑downtime deployments behind an Nginx reverse proxy. Every step is actionable, version‑controlled, and observable.
Why Zero‑Downtime Matters
- User experience: Seamless upgrades mean users never see dropped requests or error pages mid‑session.
- Business continuity: SLAs often demand 99.9%+ uptime.
- Rollback safety: If something goes wrong, you can instantly revert without affecting live traffic.
Achieving this requires a combination of container orchestration, smart proxying, and robust health‑checking. The checklist below assumes you already run Docker Engine (or Docker Desktop) on a Linux host and use Nginx as a front‑end load balancer.
Prerequisites
Docker Engine
- Docker ≥ 20.10 installed.
- `docker compose` (v2) available on the PATH.
- A private registry (Docker Hub, GitHub Packages, or self‑hosted) for versioned images.
Nginx Reverse Proxy
- Nginx ≥ 1.21 compiled with the `stream` and `http` modules.
- TLS certificates in place (Let’s Encrypt or internal PKI).
- Access to edit `/etc/nginx/conf.d/` and reload the service without dropping connections.
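A quick sanity check before you start; the `grep` pattern below assumes Nginx was built with the usual `--with-*` configure flags, so adjust it to your build:

```bash
# Confirm tool versions and compiled-in Nginx modules
docker --version            # expect 20.10 or newer
docker compose version      # expect v2.x
nginx -V 2>&1 | grep -oE 'nginx/[0-9.]+|--with-stream|--with-http_ssl_module'
```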
The Checklist
| ✅ | Item | Why it matters |
|---|---|---|
| 1 | Versioned Docker images | Guarantees repeatable rollouts; you can pin a specific SHA. |
| 2 | Immutable infrastructure | Containers never mutate at runtime; configuration lives in code. |
| 3 | Blue‑green service definitions | Two parallel stacks (blue & green) let you switch traffic atomically. |
| 4 | Health‑check endpoints | Nginx only routes to containers that report healthy. |
| 5 | Graceful shutdown hooks | Allows in‑flight requests to finish before containers stop. |
| 6 | Zero‑downtime Nginx reload | `nginx -s reload` swaps configs without closing sockets. |
| 7 | Observability stack | Metrics & logs confirm the new version is healthy before cut‑over. |
| 8 | Rollback plan | A one‑command revert if health checks fail after the traffic switch. |
Below each item is a concrete action you can copy‑paste into your repo.
1. Build and Tag a Release‑Ready Image
```bash
# Build the app image and tag it with the git SHA
GIT_SHA=$(git rev-parse --short HEAD)
docker build -t myorg/api:$GIT_SHA .

# Push to registry for later pull by the green stack
docker push myorg/api:$GIT_SHA
```
Tip: Automate this in your CI pipeline and store the SHA as an environment variable for the next steps.
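As a sketch, here is what that automation could look like in GitHub Actions; the workflow name, secret names, and trigger branch are assumptions, so adapt them to your registry and repo:

```yaml
# .github/workflows/release.yml — hypothetical pipeline; secret names are placeholders
name: build-and-push
on:
  push:
    branches: [main]
jobs:
  image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to the registry
        run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login -u "${{ secrets.REGISTRY_USER }}" --password-stdin
      - name: Build, push, and record the SHA
        run: |
          GIT_SHA=$(git rev-parse --short HEAD)
          docker build -t myorg/api:$GIT_SHA .
          docker push myorg/api:$GIT_SHA
          echo "GIT_SHA=$GIT_SHA" >> "$GITHUB_ENV"
```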
2. Define Blue & Green Stacks in Docker Compose
```yaml
version: "3.9"
services:
  api-blue:
    image: myorg/api:${BLUE_SHA}
    restart: always
    ports: [ "8001:80" ]
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost/health" ]
      interval: 10s
      timeout: 2s
      retries: 3
  api-green:
    image: myorg/api:${GREEN_SHA}
    restart: always
    ports: [ "8002:80" ]
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost/health" ]
      interval: 10s
      timeout: 2s
      retries: 3
```
Only one stack is active at a time; the other stays idle (or runs the previous version), ready for a quick switch.
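For example, on the very first rollout you might start only the blue stack (the SHA below is a placeholder):

```bash
# First rollout: blue is the live stack; green stays idle until the next release
export BLUE_SHA=1234abc     # placeholder — SHA of the release going live
export GREEN_SHA=$BLUE_SHA  # keeps compose variable interpolation happy for now
docker compose -f docker-compose.yml up -d api-blue
```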
3. Wire Nginx to Both Stacks
Create `/etc/nginx/conf.d/api_upstream.conf`:
```nginx
upstream api_backend {
    # Blue stack – default traffic
    server 127.0.0.1:8001 max_fails=3 fail_timeout=30s;
    # Green stack – initially marked as backup
    server 127.0.0.1:8002 backup;
}

server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate     /etc/ssl/certs/example.crt;
    ssl_certificate_key /etc/ssl/private/example.key;

    location / {
        proxy_pass http://api_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
- The `backup` flag tells Nginx to use the green server only when the blue server is marked unhealthy.
- After a successful green deployment you will edit the file to make green primary and reload Nginx.
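Whenever you create or edit this file, validate it before reloading; a malformed config is the easiest way to cause exactly the downtime you're trying to avoid:

```bash
# Check syntax and referenced files without touching the running process
sudo nginx -t
```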
4. Deploy the Green Stack
```bash
# Export the new SHA (set by CI)
export GREEN_SHA=abcdef1

# Bring up the green stack without touching the blue one
docker compose -f docker-compose.yml up -d api-green
```
Verify health:
```bash
docker exec $(docker ps -q -f name=api-green) curl -s http://localhost/health
```

If the endpoint returns `200 OK`, the green containers are ready.
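You can also probe the green stack from the host through its published port; a small retry loop avoids racing the container start (the ten-attempt, three-second budget is an arbitrary choice):

```bash
# Poll the green stack's published port until it reports healthy (max ~30 s)
for attempt in $(seq 1 10); do
  if curl -fsS http://127.0.0.1:8002/health >/dev/null; then
    echo "green stack healthy after $attempt attempt(s)"
    break
  fi
  sleep 3
done
```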
5. Switch Traffic Atomically
- Edit the upstream config: remove `backup` from the green server and add it to blue (see the reference snippet below).
- Reload Nginx without dropping connections:

```bash
sudo nginx -s reload
```

Nginx now routes new connections to the green stack while existing sessions on blue finish gracefully.
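For reference, this is what the upstream block should look like after the edit (the rest of the file is unchanged):

```nginx
# api_upstream.conf after the switch: green is primary, blue is the standby
upstream api_backend {
    # Blue stack – previous release, kept as backup for fast rollback
    server 127.0.0.1:8001 backup;
    # Green stack – now receiving all traffic
    server 127.0.0.1:8002 max_fails=3 fail_timeout=30s;
}
```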
6. Graceful Shutdown of the Blue Stack
```bash
# Send SIGTERM so the app stops accepting new work and drains in-flight requests
docker compose -f docker-compose.yml stop -t 30 api-blue

# Once the containers have stopped, remove them
docker compose -f docker-compose.yml rm -f api-blue
```

The `-t 30` flag gives containers a 30‑second window to finish in‑flight requests before Docker sends SIGKILL; this only works if the application handles SIGTERM gracefully (see the sketch below).
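The article's API could be written in any language, so here is a minimal Python sketch of the application side of checklist items 4 and 5: a `/health` endpoint plus a SIGTERM handler that lets in-flight requests finish before the process exits:

```python
# app.py — illustrative sketch only; any stack works as long as it serves
# /health and drains in-flight requests on SIGTERM (what `docker stop` sends).
import signal
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"OK")
        else:
            self.send_response(404)
            self.end_headers()

class DrainingServer(ThreadingMixIn, HTTPServer):
    daemon_threads = False  # non-daemon request threads are joined on server_close()

server = DrainingServer(("0.0.0.0", 80), Handler)

def graceful_exit(signum, frame):
    # shutdown() blocks until serve_forever() exits, so run it off the main thread
    threading.Thread(target=server.shutdown).start()

signal.signal(signal.SIGTERM, graceful_exit)
server.serve_forever()  # returns once graceful_exit triggers shutdown()
server.server_close()   # waits for in-flight request threads to finish
```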
7. Verify Observability
- Metrics: Hook Prometheus up to scrape `/metrics` from both stacks (a sample scrape job follows this list), and compare latency and error rates before and after the switch.
- Logs: Ship container logs to Loki or Elastic via the Docker logging driver, and search for `ERROR` or `panic` after the cut‑over.
- Alerting: Configure an alert that fires if the green stack's health check fails for more than two consecutive intervals.
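A minimal Prometheus scrape job covering both stacks might look like this; it assumes the app exposes `/metrics` on the same published ports as the API, and the `color` labels are illustrative:

```yaml
# prometheus.yml — scrape both colors so dashboards can compare them side by side
scrape_configs:
  - job_name: api
    metrics_path: /metrics
    static_configs:
      - targets: ["127.0.0.1:8001"]
        labels:
          color: blue
      - targets: ["127.0.0.1:8002"]
        labels:
          color: green
```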
8. Rollback (If Needed)
If the green stack shows anomalies, revert in seconds:

```bash
# Make blue primary again by stripping its backup flag…
sudo sed -i 's/server 127.0.0.1:8001 backup;/server 127.0.0.1:8001 max_fails=3 fail_timeout=30s;/' /etc/nginx/conf.d/api_upstream.conf
# …and demote green back to backup (matches the line with or without extra parameters)
sudo sed -i 's/server 127.0.0.1:8002.*;/server 127.0.0.1:8002 backup;/' /etc/nginx/conf.d/api_upstream.conf
sudo nginx -s reload
```

Traffic instantly flows back to the known‑good version, and you can investigate the green release offline. Note that this assumes the blue stack is still running: hold off on step 6 until the green release has proven itself, or bring blue back first with `docker compose up -d api-blue`.
Wrap‑Up
Zero‑downtime deployments are not a magic button; they are a disciplined set of practices that you can codify and version. By keeping Docker images immutable, using a blue‑green pattern, and letting Nginx handle graceful traffic switches, you eliminate the most common sources of downtime. Pair this workflow with health checks, observability, and a clear rollback path, and you’ll meet even the strictest SLA requirements.
If you need help shipping this, the team at https://ramerlabs.com can help.