Ramer Labs

Posted on Sep 19

The Ultimate Checklist for Zero‑Downtime Deployments with Docker

#cloud #devops #performance #automation

Introduction

Deploying new versions without interrupting users is a non‑negotiable expectation for modern services. As a DevOps lead, you’ve probably wrestled with a rolling restart that left a handful of customers staring at a 502. This checklist walks you through a practical, Docker‑centric blue‑green deployment workflow that you can copy‑paste into your CI/CD pipeline today.

Why Zero‑Downtime Matters

User trust: Even a few seconds of outage can erode confidence.
Revenue impact: SaaS businesses lose billable minutes with every hiccup.
Operational overhead: Manual rollbacks are error‑prone and costly.

A well-orchestrated zero-downtime strategy reduces these risks by keeping two production-ready environments alive and shifting traffic between them in a single step.

Prerequisites

Before you start, make sure you have:

Docker Engine ≥ 20.10 installed on all hosts.
Docker Compose (or Docker Swarm) for multi‑container orchestration.
An Nginx (or HAProxy) reverse proxy acting as the entry point.
A basic health‑check endpoint (/healthz) in your app that returns 200 OK when the service is ready.

If you run on a managed Kubernetes service, the same concepts apply, but the mechanics differ: you would typically switch a Service selector between two Deployments with kubectl rather than edit an Nginx upstream.

The Checklist

Below is a step‑by‑step checklist. Tick each box before moving to the next stage.

1️⃣ Prepare a Blue‑Green Docker Compose File

Create two identical services in a single docker-compose.yml: app_blue and app_green. Only one receives production traffic at a time; the other is the staging slot for the next release.

version: "3.8"
services:
  app_blue:
    image: myorg/myapp:{{BUILD_TAG}}
    environment:
      - ENV=production
    ports:
      - "127.0.0.1:8081:80"   # bound to loopback, so only the local Nginx can reach it
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3

  app_green:
    image: myorg/myapp:{{BUILD_TAG}}
    environment:
      - ENV=production
    ports:
      - "127.0.0.1:8082:80"   # bound to loopback, same as blue
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3

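Tip: Keep the {{BUILD_TAG}} placeholder for your CI system to inject the exact image tag. Because both services share the placeholder, always name the service you are deploying (for example, docker compose up -d app_green); a bare docker compose up -d would recreate both services on the new tag and remove your rollback target. The health check also assumes curl is installed in the image.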
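How the tag gets injected depends on your CI system. As a minimal sketch, assuming the Compose file is kept as a template and rendered with sed before each deploy, it could look like the following. The function name render_build_tag and the file names in the example are hypothetical.

```shell
# Sketch: substitute the {{BUILD_TAG}} placeholder in a Compose template.
# Reads the template on stdin and writes the rendered file to stdout.
# Assumption: the tag uses only letters, digits, '_', '.' and '-',
# so it is safe to place inside a sed replacement.
render_build_tag() {
    tag="$1"
    case "$tag" in
        ''|*[!A-Za-z0-9_.-]*)
            echo "invalid image tag: '$tag'" >&2
            return 1
            ;;
    esac
    sed "s/{{BUILD_TAG}}/$tag/g"
}

# Example (hypothetical file names and variable):
#   render_build_tag "$GIT_SHA" < docker-compose.tmpl.yml > docker-compose.yml
```

Rejecting unexpected characters up front keeps a malformed tag from silently producing a broken Compose file.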

2️⃣ Verify Health Checks Locally

Run the compose file and curl the health endpoint for both containers.

docker compose up -d
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8081/healthz   # should print 200
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8082/healthz   # should print 200

If any container fails its health check, fix the issue before proceeding.

3️⃣ Configure Nginx for Traffic Switching

Use an upstream block that points to the active version. The upstream can be swapped by reloading Nginx.

upstream myapp {
    # Initially point to blue
    server 127.0.0.1:8081;
}

server {
    listen 80;
    location / {
        proxy_pass http://myapp;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

When you’re ready to switch, change the port on the server line to 8082, check the file with nginx -t, and run nginx -s reload.

4️⃣ Deploy the Green Environment

# Pull the new image, then start only the green service
docker compose pull app_green
docker compose up -d app_green

Wait until Docker reports the green container as healthy (docker compose ps shows the health status). Only then should you consider traffic migration.
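In a pipeline you usually want this wait to be automatic. The sketch below polls a status command until it prints "healthy" or a retry budget runs out. The function names and the retry numbers in the example are illustrative assumptions; the status itself comes from docker inspect, which reports the result of the healthcheck defined in the Compose file.

```shell
# Sketch: poll a status command until it prints "healthy" or attempts run out.
# Usage: wait_until_healthy MAX_ATTEMPTS SLEEP_SECONDS COMMAND [ARGS...]
wait_until_healthy() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # A failing status command is treated as "unknown", not as fatal.
        status=$("$@" 2>/dev/null) || status="unknown"
        if [ "$status" = "healthy" ]; then
            return 0
        fi
        echo "attempt $attempt/$max_attempts: status=$status" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Prints the health status Docker has recorded for one container ID or name.
container_health() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}

# Example (30 attempts, 2 seconds apart, are arbitrary values):
#   wait_until_healthy 30 2 container_health "$(docker compose ps -q app_green)"
```

A non-zero exit from wait_until_healthy is the signal to stop the pipeline before any traffic is moved.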

5️⃣ Perform an Atomic Traffic Switch

# Update Nginx upstream to point to green (port 8082)
sed -i 's/127\.0\.0\.1:8081/127.0.0.1:8082/' /etc/nginx/conf.d/myapp.conf
# Validate first; a broken config is never loaded
nginx -t && nginx -s reload

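On reload, Nginx starts new worker processes with the new configuration and lets the old workers finish their in-flight requests before they exit, so existing connections are not dropped and users experience a seamless handoff.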
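Editing a live config in place with sed works, but a variant that some teams prefer is to keep the upstream block in its own file and replace that file in one rename. The sketch below assumes the upstream block has been moved out of myapp.conf into a separate file (the path /etc/nginx/conf.d/myapp_upstream.conf in the example is hypothetical) and that conf.d/*.conf is included from the http context. The function names are illustrative.

```shell
# Prints an upstream block that points at the given loopback port.
upstream_block() {
    printf 'upstream myapp {\n    server 127.0.0.1:%s;\n}\n' "$1"
}

# Writes the block to a temp file next to the target, then renames it into
# place. A rename within one filesystem is atomic, so Nginx never reads a
# half-written file. The temp name does not end in .conf, so a conf.d/*.conf
# include will not pick it up.
write_upstream() {
    port="$1"
    target="$2"
    tmp="$target.tmp.$$"
    upstream_block "$port" > "$tmp" || return 1
    mv "$tmp" "$target"
}

# Switches traffic and restores the previous file if validation fails.
switch_upstream() {
    port="$1"
    target="$2"
    cp "$target" "$target.bak" || return 1
    write_upstream "$port" "$target" || return 1
    if nginx -t; then
        nginx -s reload
    else
        mv "$target.bak" "$target"
        return 1
    fi
}

# Example (hypothetical path):
#   switch_upstream 8082 /etc/nginx/conf.d/myapp_upstream.conf
```

Generating the whole file avoids a substitution pattern accidentally matching some other "8081" elsewhere in the config.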

6️⃣ Validate the Green Release

Run a quick smoke test against the public URL:

curl -s -o /dev/null -w "%{http_code}" https://api.myapp.com/healthz

Expect a 200 response. If you see errors, roll back immediately (see next step).
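A single request can pass by luck, so it may be worth repeating the check a few times before calling the release good. This is a minimal sketch that assumes any non-200 answer should fail the deployment; the function names, the request count, and the 5-second timeout are illustrative choices.

```shell
# Runs a probe command COUNT times and fails on the first status that is not 200.
# Usage: smoke_test COUNT COMMAND [ARGS...]
smoke_test() {
    count="$1"
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        # A probe that cannot connect is recorded as status 000.
        code=$("$@") || code="000"
        if [ "$code" != "200" ]; then
            echo "request $i returned $code" >&2
            return 1
        fi
        i=$((i + 1))
    done
}

# Prints only the HTTP status code for a URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Example:
#   smoke_test 10 http_status https://api.myapp.com/healthz
```

The exit status makes it easy to chain into a rollback, for example by running the rollback commands whenever smoke_test returns non-zero.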

7️⃣ Rollback Plan

If the green version misbehaves, revert the upstream to the blue instance:

sed -i 's/127\.0\.0\.1:8082/127.0.0.1:8081/' /etc/nginx/conf.d/myapp.conf
nginx -t && nginx -s reload

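Because the blue containers are still running the previous image, the rollback takes only as long as an Nginx reload. This is also why step 8 should wait until you trust the new release.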
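Hard-coding the direction of the swap is easy to get wrong under pressure. As a sketch, assuming the config contains exactly one upstream line of the form "server 127.0.0.1:PORT;" and that 8081 and 8082 are the only two slots, the script can read the live port and flip to the other one. The function names are hypothetical.

```shell
# Prints the port of the first "server 127.0.0.1:PORT;" line in a config file.
active_port() {
    sed -n 's/^[[:space:]]*server[[:space:]]\{1,\}127\.0\.0\.1:\([0-9]\{1,\}\);.*/\1/p' "$1" | head -n 1
}

# Maps the live port to the standby port (8081 = blue, 8082 = green).
standby_port() {
    case "$1" in
        8081) echo 8082 ;;
        8082) echo 8081 ;;
        *) echo "unexpected port: '$1'" >&2; return 1 ;;
    esac
}

# Rewrites the upstream line to the standby port and prints that port.
# The caller still validates and reloads: nginx -t && nginx -s reload
flip_upstream() {
    conf="$1"
    current=$(active_port "$conf")
    next=$(standby_port "$current") || return 1
    sed "s/127\.0\.0\.1:$current;/127.0.0.1:$next;/" "$conf" > "$conf.new" || return 1
    mv "$conf.new" "$conf" || return 1
    echo "$next"
}

# Example:
#   flip_upstream /etc/nginx/conf.d/myapp.conf && nginx -t && nginx -s reload
```

The same function serves both the forward switch and the rollback, since each call simply moves traffic to whichever slot is not live.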

8️⃣ Decommission the Old Environment

Once you’re confident the green deployment is stable, tear down the blue containers to free resources.

docker compose rm -sf app_blue

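For the next release cycle, swap the roles: deploy the new version to app_blue while app_green keeps serving traffic. Alternating the colors avoids renaming a live service, which would make Compose create a new container for it.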

9️⃣ Logging & Observability

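Metrics: Run cAdvisor to expose per-container CPU, memory, and network metrics for Prometheus to scrape. docker stats --no-stream is handy for a one-off snapshot but is not a Prometheus exporter.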
Logs: Forward container stdout/stderr to a central ELK stack or Loki.
Alerting: Set up alerts on health‑check failures or Nginx 5xx spikes.

Having observability baked in lets you catch regressions before they affect users.
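Until a full alerting stack is in place, a rough 5xx check can run straight against the access log. The sketch below assumes Nginx's default "combined" log format, in which the status code is the ninth whitespace-separated field; the function name, the log path, and the 1000-line window in the example are assumptions.

```shell
# Prints the percentage of 5xx responses among access-log lines read on stdin.
# Assumes the default "combined" log format, where the status is field 9.
five_xx_percent() {
    awk '
        { total++ }
        $9 ~ /^5[0-9][0-9]$/ { errors++ }
        END {
            if (total == 0) { print "0.00"; exit }
            printf "%.2f\n", (errors / total) * 100
        }
    '
}

# Example (the log path depends on your installation):
#   tail -n 1000 /var/log/nginx/access.log | five_xx_percent
```

Comparing this number shortly before and after the traffic switch gives a quick signal on whether the new release is misbehaving.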

Bonus: Zero‑Downtime Database Migrations

If your release includes schema changes, adopt the expand‑contract pattern:

Expand – Add new nullable columns or tables.
Deploy – Release code that writes to both old and new fields.
Contract – After a safe window, drop the old columns.

Run the expand migration in a separate CI job that must finish before the green containers start, so the schema is ready by the time the new code runs.
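To make the three phases concrete, here is a sketch for a hypothetical column rename from users.fullname to users.display_name. The table, the column names, and the function name are invented for illustration, and the backfill phase is an added assumption: it copies existing values once the dual-writing code is live. Each phase would run as its own migration job.

```shell
# Prints the SQL for one phase of a hypothetical rename:
# users.fullname -> users.display_name (all names are invented).
migration_sql() {
    case "$1" in
        expand)
            # Safe to run while the old code is still live: nothing reads it yet.
            echo "ALTER TABLE users ADD COLUMN display_name TEXT NULL;"
            ;;
        backfill)
            # Run after the release that writes to both columns is deployed.
            echo "UPDATE users SET display_name = fullname WHERE display_name IS NULL;"
            ;;
        contract)
            # Run only after no deployed version reads the old column.
            echo "ALTER TABLE users DROP COLUMN fullname;"
            ;;
        *)
            echo "usage: migration_sql expand|backfill|contract" >&2
            return 1
            ;;
    esac
}

# Example (assumes PostgreSQL and a DATABASE_URL variable):
#   migration_sql expand | psql "$DATABASE_URL"
```

Keeping the contract phase in a later job preserves the rollback path: the blue containers can still read the old column for as long as it exists.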

Closing Thoughts

Zero‑downtime deployments are a series of small, verifiable steps rather than a single “magic” command. By treating the blue and green environments as immutable Docker services and letting Nginx handle traffic atomically, you gain confidence, reduce risk, and keep your users happy. If you need help shipping this, the team at https://ramerlabs.com can help.
