Auto-Failover, Zero Downtime, and Manual Traffic Switching
One of the core responsibilities of a DevOps engineer is ensuring application availability in the presence of failure. Downtime is rarely caused by deployments themselves, but by how traffic is handled when something goes wrong.
In Stage 2 of my DevOps internship, I implemented a Blue/Green deployment architecture using Nginx upstreams and Docker Compose, focusing on:
- Zero failed client requests during outages
- Automatic failover within a single request
- Manual traffic switching without restarting containers
- No application code changes
- No image rebuilds
This article is a beginner-friendly but production-accurate walkthrough of the solution, explaining both the configuration and the runtime behavior in detail.
Problem Overview
We are provided with two identical Node.js services packaged as pre-built Docker images:
- Blue — primary (active)
- Green — backup
Each service exposes the following endpoints:
| Endpoint | Purpose |
|---|---|
| GET /version | Returns JSON + headers |
| GET /healthz | Liveness check |
| POST /chaos/start | Simulates failure |
| POST /chaos/stop | Restores service |
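Once the stack described below is running, the Blue instance is also reachable directly on its mapped host port (8081 in this setup, per the Compose file later in this article), which makes it easy to poke these endpoints by hand:

# Query the Blue instance directly on its host-mapped port
curl -s http://localhost:8081/version
curl -s http://localhost:8081/healthz

# Toggle the simulated failure on and off
curl -s -X POST "http://localhost:8081/chaos/start?mode=error"
curl -s -X POST http://localhost:8081/chaos/stop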
The task is to place Nginx in front of both services and guarantee:
- All traffic goes to Blue by default
- On Blue failure, Nginx automatically switches to Green
- No client request returns non-200 during failover
- Application response headers are forwarded unchanged
- Traffic can be manually toggled between Blue and Green
Architecture Overview
The final architecture is intentionally simple and production-aligned:
Client --> Nginx (8080) --> Blue App (8081) OR failover to Green App (8082)
Key characteristics:
- Nginx is the single public entrypoint
- Blue/Green run simultaneously
- Docker Compose orchestrates everything
- No Kubernetes, no service mesh, no rebuilds
Environment-Driven Configuration
All behavior is controlled via environment variables, making the setup CI-friendly and reproducible.
Key variables:
- BLUE_IMAGE, GREEN_IMAGE
- ACTIVE_POOL (blue or green)
- RELEASE_ID_BLUE, RELEASE_ID_GREEN
- PORT, BLUE_PORT, GREEN_PORT
- NGINX_PORT
This design ensures:
- No hardcoded values
- Safe traffic switching
- Easy automated verification
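For reference, a .env along these lines drives the whole setup. The image names and release IDs below are placeholders, the ports match the architecture diagram above, and the application port is an assumption:

# .env (illustrative values)
BLUE_IMAGE=registry.example.com/sample-app:blue     # placeholder image reference
GREEN_IMAGE=registry.example.com/sample-app:green   # placeholder image reference
ACTIVE_POOL=blue
RELEASE_ID_BLUE=release-blue-001                    # placeholder release ID
RELEASE_ID_GREEN=release-green-001                  # placeholder release ID
PORT=3000                                           # container port the Node.js app listens on (assumed)
BLUE_PORT=8081
GREEN_PORT=8082
NGINX_PORT=8080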
Docker Compose: Service Breakdown
Blue Application Service
app_blue:
  image: ${BLUE_IMAGE}
  container_name: app_blue
  restart: always
  environment:
    - PORT=${PORT}
    - RELEASE_ID=${RELEASE_ID_BLUE}
    - APP_POOL=blue
  expose:
    - "${PORT}"
  ports:
    - "${BLUE_PORT}:${PORT}"
  healthcheck:
    test: ["CMD-SHELL", "node -e \"process.exit(0)\""]
    interval: 5s
    timeout: 2s
    retries: 3
What this achieves:
- Runs the provided Blue image without modification
- Injects runtime metadata used in response headers
- Exposes the service internally to Nginx
- Maps a direct port (8081) for chaos testing
- Keeps the container healthy and restartable
The Green service is identical, differing only in image, release ID, and port.
This symmetry is critical for Blue/Green deployments.
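With both containers up, each pool can be checked directly through its host port before Nginx enters the picture; the headers confirm which pool and release answered:

# Hit each app on its mapped port and inspect only the response headers
curl -s -D - -o /dev/null http://localhost:8081/version   # expect X-App-Pool: blue
curl -s -D - -o /dev/null http://localhost:8082/version   # expect X-App-Pool: green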
Nginx Reverse Proxy
nginx:
  image: nginx:latest
  ports:
    - "${NGINX_PORT}:80"
  volumes:
    - ./nginx/nginx.tmpl:/etc/nginx/templates/default.conf.template:ro
    - ./nginx/entrypoint.sh:/docker-entrypoint.d/10-envsubst.sh:ro
    - ./nginx-logs:/var/log/nginx
  environment:
    - ACTIVE_POOL=${ACTIVE_POOL}
    - PORT=${PORT}
Key decisions:
- Nginx is the only public interface
- Configuration is templated, not static
- Logs are persisted for inspection and alerting
- No container restarts needed for traffic switching
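Two quick sanity checks follow from these decisions: the rendered configuration can be validated inside the running container, and the persisted logs can be followed from the host (the log file names assume Nginx's defaults):

# Validate the rendered Nginx configuration without touching the container lifecycle
docker compose exec nginx nginx -t

# Follow the persisted logs from the host
tail -f ./nginx-logs/access.log ./nginx-logs/error.log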
Nginx Upstreams: Blue/Green Routing
The heart of the solution lies in the Nginx upstream configuration.
Timeout and Retry Configuration
proxy_connect_timeout 1s;
proxy_read_timeout 5s;
proxy_send_timeout 3s;
proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
proxy_next_upstream_tries 2;
proxy_next_upstream_timeout 8s;
These values ensure:
- Failures are detected quickly (a dead connection is caught within 1 second, a hung response within 5)
- Retries happen automatically, but only once more and only inside the 8-second proxy_next_upstream_timeout window
- Total request time therefore stays bounded at roughly 10 seconds even in the worst case
- Clients never see partial or failed responses
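These bounds are easy to observe from the client side. A request through Nginx reports both its status code and total time, which should stay at 200 and within the retry budget even while Blue is failing:

# Status code and end-to-end time for one request through Nginx
curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" http://localhost:8080/version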
Primary / Backup Upstreams
upstream blue {
    server app_blue:${PORT} max_fails=1 fail_timeout=5s;
    server app_green:${PORT} backup;
}

upstream green {
    server app_green:${PORT} max_fails=1 fail_timeout=5s;
    server app_blue:${PORT} backup;
}
Why this works:
- max_fails=1 marks the primary unhealthy after a single failure
- fail_timeout=5s keeps a failed primary out of rotation for only 5 seconds, so traffic returns to it quickly once it recovers
- backup ensures Green is only used when Blue fails
- The same config supports both active pools
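The server block (rendered by envsubst, covered below) presumably points proxy_pass at whichever upstream name ACTIVE_POOL expands to. One way to confirm which pool is currently live is to grep the rendered config inside the container; the default.conf path assumes the official nginx image's template output directory:

# Show which upstream the rendered server block targets
# (path assumes templates are rendered into /etc/nginx/conf.d/)
docker compose exec nginx grep -n "proxy_pass" /etc/nginx/conf.d/default.conf
# Expected with ACTIVE_POOL=blue: proxy_pass http://blue;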
Deep Dive: Request Flow During Failure
This is the most important part of the system.
Normal Operation
- Client sends:
GET http://localhost:8080/version
- Nginx forwards to Blue
- Blue responds with 200
- Headers returned:
X-App-Pool: blue
X-Release-Id: <RELEASE_ID_BLUE>
Failure Scenario (Blue Down)
Chaos is induced directly on Blue:
POST http://localhost:8081/chaos/start?mode=error
Now let’s trace a single client request.
Step 1: Request Hits Nginx
The client is unaware of Blue or Green.
Step 2: Nginx Proxies to Blue
Blue is the primary upstream.
Step 3: Blue Fails
Blue returns a 5xx or times out.
Step 4: Nginx Intercepts the Failure
Because of:
proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
Nginx does not forward the failure to the client.
Step 5: Immediate Retry to Green
Within the same client request, Nginx retries the request to Green.
Step 6: Green Responds Successfully
Green returns:
HTTP 200
X-App-Pool: green
X-Release-Id: <RELEASE_ID_GREEN>
Result:
The client sees HTTP 200, even though Blue failed.
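The whole trace can be reproduced from a terminal: with chaos active on Blue, a single request through Nginx still comes back 200, now answered by Green.

# Break Blue directly on its mapped port
curl -s -X POST "http://localhost:8081/chaos/start?mode=error"

# One request through Nginx: the failed attempt and the retry happen inside this single call
curl -s -D - -o /dev/null http://localhost:8080/version
# Expect: HTTP/1.1 200 OK ... X-App-Pool: green

# Restore Blue afterwards
curl -s -X POST http://localhost:8081/chaos/stop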
Why Proxy Buffering Matters
proxy_buffering on;
This ensures Nginx does not stream partial responses.
If Blue fails mid-request, Nginx can safely retry Green without exposing errors to clients.
Header Preservation
Each application response includes:
- X-App-Pool
- X-Release-Id

Nginx forwards these headers unchanged:
proxy_pass_header X-App-Pool;
proxy_pass_header X-Release-Id;
This allows:
- CI validation
- Runtime verification
- Clear observability of which pool served the request
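A one-liner confirms the headers survive the proxy hop:

# Dump only the response headers for a request through Nginx and filter the two we care about
curl -s -D - -o /dev/null http://localhost:8080/version | grep -iE "x-app-pool|x-release-id"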
Manual Blue/Green Switching
Traffic switching is handled by configuration templating.
Entrypoint Script
envsubst '$ACTIVE_POOL $PORT $RELEASE_ID_BLUE $RELEASE_ID_GREEN' \
< default.conf.template > default.conf
This allows:
- Changing ACTIVE_POOL=green
- Regenerating the Nginx config
- Reloading Nginx without downtime

No containers are restarted.
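In practice, a switch looks roughly like the following; it assumes the mounted entrypoint script re-renders the template from the ACTIVE_POOL it sees in its environment, which is how the setup above is wired:

# Flip the active pool for this render only, re-run the template step inside the
# running container, then hot-reload Nginx. Nothing is recreated or restarted.
docker compose exec -e ACTIVE_POOL=green nginx sh /docker-entrypoint.d/10-envsubst.sh
docker compose exec nginx nginx -s reload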
Stability Under Sustained Failure
During a ~10 second request loop:
- Zero non-200 responses
- ≥95% responses from Green
- Blue remains isolated until healthy
This satisfies all grader stability requirements.
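A loop along these lines (close in spirit to the grader's check) makes those numbers concrete; it assumes bash and tallies status-code/pool combinations for about ten seconds:

# Fire requests through Nginx for ~10 seconds and count "status pool" combinations (bash)
end=$((SECONDS + 10))
while [ "$SECONDS" -lt "$end" ]; do
  curl -s -D - -o /dev/null http://localhost:8080/version \
    | tr -d '\r' \
    | awk 'NR==1 {code=$2} tolower($1)=="x-app-pool:" {pool=$2} END {print code, pool}'
done | sort | uniq -c
# Healthy output during Blue chaos: only "200 green" (plus at most a few "200 blue" from before the switch)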
Key Takeaways
This project demonstrates:
- Blue/Green deployment without Kubernetes
- Auto-failover within a single HTTP request
- Resilience implemented at the proxy layer
- Environment-driven infrastructure design
- Production-grade reliability using simple tools
Conclusion
High availability is not about avoiding failure—it’s about handling failure correctly.
By combining Nginx upstreams, Docker Compose, and strict timeout and retry controls, we achieve:
- Zero downtime
- Safe rollbacks
- Transparent failover
- CI-ready verification
This approach mirrors real production systems and is an excellent foundation for any DevOps engineer.
If you’re learning DevOps, mastering patterns like this matters far more than chasing tools. Reliability is a design choice.
Explore the code here