Introduction
Deploying new versions of a web service without interrupting users is a classic challenge for any DevOps lead. With Docker handling containerization and Nginx acting as a reliable reverse proxy, you can achieve true zero‑downtime releases. This checklist walks you through the essential steps—from image building to traffic shifting—so you can ship features confidently.
1. Pre‑flight Planning
- Define a versioning strategy – semantic versioning (`v1.2.3`) works well with Docker tags.
- Identify health‑check endpoints – `/healthz` should return `200 OK` only when the app is ready (a quick way to verify this is sketched after this list).
- Set up a separate staging environment – mirror the production config but isolate traffic.
- Document rollback criteria – e.g., an error rate above 2% over 5 minutes triggers a revert.
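During planning it helps to pin down the readiness contract concretely. A minimal check, assuming the app listens on port 3000 as in the Dockerfile below:

```bash
# Should print 200 only once the service is actually ready to take traffic
curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:3000/healthz
```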
2. Build a Reproducible Docker Image
A deterministic Dockerfile eliminates “it works on my machine” surprises.
```dockerfile
# Dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
```
- Pin base image versions (`node:20-alpine`).
- Leverage multi‑stage builds to keep the final image small.
- Run `docker build` with `--pull` to ensure you have the latest base.

```bash
docker build -t myservice:1.2.3 --pull .
```
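If you publish the image to a registry, pushing the exact tag you built keeps staging and production on identical bytes. A sketch, with the registry host as a placeholder:

```bash
# Re-tag for the registry and push; registry.example.com is illustrative
docker tag myservice:1.2.3 registry.example.com/myservice:1.2.3
docker push registry.example.com/myservice:1.2.3
```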
3. Nginx as a Smart Load Balancer
Configure Nginx to route traffic to two upstream groups – `blue` (current) and `green` (new).
```nginx
# /etc/nginx/conf.d/myservice.conf
upstream blue {
    server 127.0.0.1:3001;
}

upstream green {
    server 127.0.0.1:3002;
}

server {
    listen 80;

    location / {
        proxy_pass http://blue;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /healthz {
        proxy_pass http://green/healthz;
    }
}
```
- Keep `proxy_pass` pointing to `blue` initially.
- Expose a separate health endpoint that checks the `green` container.
- Reload Nginx gracefully with `nginx -s reload` – no dropped connections (a guarded reload is sketched below).
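A malformed config file is the easiest way to turn a graceful reload into an outage, so it is worth validating first. A minimal guard:

```bash
# Only reload if the new configuration parses cleanly
nginx -t && nginx -s reload
```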
4. Blue‑Green Deployment Workflow
| Step | Action |
|------|--------|
| 1 | Deploy the new container on a different port (e.g., `3002`). |
| 2 | Run health checks until `/healthz` reports success. |
| 3 | Update the Nginx upstream from `blue` to `green`. |
| 4 | Monitor metrics for a short stabilization window. |
| 5 | Decommission the old container (`blue`). |
4.1 Deploy the Green Container
```bash
docker run -d --name myservice-green -p 3002:3000 \
  -e NODE_ENV=production \
  myservice:1.2.3
```

- Use `--restart unless-stopped` for resilience.
- Attach a health‑check script that polls `/healthz` every 5 seconds (a sketch follows below).
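A minimal polling script, assuming the green container is published on port 3002 as above and `curl` is available on the host (the two‑minute timeout is an arbitrary choice):

```bash
#!/usr/bin/env bash
# healthcheck-green.sh - poll the green container until it reports ready
URL="http://127.0.0.1:3002/healthz"
for i in $(seq 1 24); do              # ~2 minutes at 5-second intervals
  if curl -fsS -o /dev/null "$URL"; then
    echo "green is healthy"
    exit 0
  fi
  echo "waiting for green ($i/24)..."
  sleep 5
done
echo "green never became healthy" >&2
exit 1
```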
4.2 Switch Traffic
```bash
# Update the upstream in the running config (you can use envsubst or a templating tool)
# Target only the proxy_pass line so the upstream block names stay intact
sed -i 's|proxy_pass http://blue;|proxy_pass http://green;|' /etc/nginx/conf.d/myservice.conf
nginx -s reload
```
Because Nginx reloads workers gracefully, existing connections finish on the old upstream while new requests flow to `green`.
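Wrapping the switch in a small script makes it repeatable and adds a sanity check after the reload. A sketch under the same assumptions (config path and ports as above):

```bash
#!/usr/bin/env bash
# switch-to-green.sh - illustrative wrapper around the steps above
set -euo pipefail

CONF=/etc/nginx/conf.d/myservice.conf

# Point the main location block at the green upstream
sed -i 's|proxy_pass http://blue;|proxy_pass http://green;|' "$CONF"

# Validate, then reload gracefully; old workers finish in-flight requests
nginx -t
nginx -s reload

# Confirm the service still answers through Nginx
curl -fsS -o /dev/null http://127.0.0.1/healthz && echo "switch complete"
```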
5. Observability & Alerting
- Metrics: Export Prometheus counters for request latency, error rates, and container restarts.
- Logs: Centralize Docker logs with Loki or Elasticsearch; tag them with `service=myservice` and `deployment=green`.
- Alert thresholds:
  - 5xx rate > 1% for 2 minutes.
  - Container restart count > 3 within 5 minutes.
Example Prometheus rule:
```yaml
# alerts.yml
groups:
  - name: myservice
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5..",service="myservice"}[1m]))
            / sum(rate(http_requests_total{service="myservice"}[1m])) > 0.01
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High 5xx error rate on myservice"
          description: "Error rate exceeded 1% for the last 2 minutes."
```
6. Automated Rollback Plan
Even with thorough testing, things can go sideways. Keep a one‑click rollback script ready:
```bash
#!/usr/bin/env bash
# rollback.sh – revert to the previous blue deployment
# Target only the proxy_pass line so the upstream block names stay intact
sed -i 's|proxy_pass http://green;|proxy_pass http://blue;|' /etc/nginx/conf.d/myservice.conf
nginx -s reload

# Stop and remove the green container
docker stop myservice-green && docker rm myservice-green

# Restart blue if it was stopped
docker start myservice-blue
```
- Store the script in version control alongside your deployment repo.
- Pair it with a PagerDuty or OpsGenie trigger for rapid manual execution.
7. Security Hardening Checklist
- Run containers as non‑root – add `USER node` in the Dockerfile.
- Limit capabilities – `docker run --cap-drop ALL` (a combined hardened run command is sketched after this list).
- TLS termination – let Nginx handle HTTPS with a strong cipher suite.
- Secret management – inject API keys via Docker secrets or Kubernetes `Secret` objects, never hard‑code them.
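Pulling several of these flags together, a hardened version of the green deployment from section 4.1 might look like the sketch below. The `--read-only` and `--tmpfs` flags are extra assumptions (they only work if the app writes nothing outside `/tmp`), and running as non‑root relies on `USER node` being present in the Dockerfile:

```bash
# Illustrative hardened run command; adjust flags to your app's needs
docker run -d --name myservice-green \
  -p 3002:3000 \
  --restart unless-stopped \
  --cap-drop ALL \
  --read-only \
  --tmpfs /tmp \
  -e NODE_ENV=production \
  myservice:1.2.3
```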
8. Final Verification Checklist
- [ ] Docker image built with an immutable tag (`myservice:1.2.3`).
- [ ] Health endpoint returns `200` within 30 seconds.
- [ ] Nginx config points to `blue` before the switch.
- [ ] Green container runs on an isolated port and logs to the central store.
- [ ] Traffic switched via Nginx reload; no 502/504 observed.
- [ ] Prometheus alerts are silent for 5 minutes post‑switch.
- [ ] Rollback script tested in staging.
- [ ] All secrets loaded from secure store.
Cross‑checking each bullet reduces the chance of a silent failure slipping into production.
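Several of these checks are easy to script. A minimal post‑switch smoke test, assuming the service is reachable through Nginx on port 80:

```bash
#!/usr/bin/env bash
# smoke-test.sh - illustrative post-switch verification
set -euo pipefail

BASE_URL="http://127.0.0.1"

# Health endpoint must answer 200
code=$(curl -s -o /dev/null -w '%{http_code}' "$BASE_URL/healthz")
[ "$code" = "200" ] || { echo "health check failed ($code)" >&2; exit 1; }

# Sample a handful of requests and fail on any 502/504
for i in $(seq 1 20); do
  code=$(curl -s -o /dev/null -w '%{http_code}' "$BASE_URL/")
  case "$code" in
    502|504) echo "bad gateway on request $i ($code)" >&2; exit 1 ;;
  esac
done
echo "smoke test passed"
```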
Conclusion
Zero‑downtime deployments become routine once you embed these steps into your CI/CD pipeline. Automate image builds, health checks, and Nginx reloads, and you’ll spend more time delivering value than firefighting releases. If you need help shipping this, the team at https://ramerlabs.com can help.