Running Docker containers in production means your app is portable, reproducible, and (hopefully) always up. But "always up" requires monitoring — and most single-probe tools create more noise than signal.
This post shows how to add multi-region uptime monitoring to Dockerized apps so you get paged when your container is actually down, not when a single probe has a bad day.
The false alert problem with Docker monitoring
Single-probe tools work like this: one server pings your endpoint every minute. If it times out, you get an alert. If that one server has a routing issue? You get an alert. If your ISP hiccups for 10 seconds? Alert. On-call at 3 AM for a problem that fixed itself? Alert.
Multi-region monitoring solves this with consensus: multiple probes in different geographic regions must agree that your container is unreachable before firing an alert. One region's blip is ignored. Actual downtime triggers immediately.
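To make the consensus idea concrete, here's a minimal sketch of the decision rule. It's illustrative only and not Vigilmon's actual implementation: the function name `should_alert`, the region names, and the default quorum of 2 are all assumptions made for the example.

```python
def should_alert(results: dict[str, bool], quorum: int = 2) -> bool:
    """Return True when at least `quorum` regions report the target as down.

    `results` maps a region name to True (probe succeeded) or False (probe failed).
    """
    failures = sum(1 for ok in results.values() if not ok)
    return failures >= quorum


# One region having a bad day does not page anyone.
print(should_alert({"us-east": True, "eu-west": False, "ap-south": True}))    # False
# Every region failing is treated as real downtime.
print(should_alert({"us-east": False, "eu-west": False, "ap-south": False}))  # True
```

A higher quorum trades a little detection speed for fewer false pages; a quorum of 1 is the single-probe behavior described above.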
Vigilmon uses this approach — free tier, no credit card.
Step 1: Add a health check to your Docker container
Docker has built-in health checks. Add one to your Dockerfile:
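FROM node:20-slim
WORKDIR /app
# curl is not included in the slim image, so install it for the health check
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
# Health check: curl the /health endpoint every 30s
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
CMD ["node", "server.js"]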
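The HEALTHCHECK instruction tells Docker to test your container's health. If the check fails 3 times in a row, Docker marks the container as unhealthy. What happens next depends on where the container runs: Docker Swarm replaces unhealthy tasks automatically, while plain docker run only records the status. Kubernetes ignores HEALTHCHECK and relies on its own liveness and readiness probes, and ECS expects the health check to be defined in the task definition rather than in the image.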
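If you want to see what Docker has recorded, `docker inspect <container>` prints a JSON array whose `State.Health` object holds the current status and failing streak. Here's a small sketch that pulls those fields out. The helper name `summarize_health` is hypothetical, and the sample JSON is a trimmed, made-up example rather than real output.

```python
import json


def summarize_health(inspect_output: str) -> dict:
    """Extract the health status from the JSON printed by `docker inspect <container>`."""
    container = json.loads(inspect_output)[0]  # docker inspect prints a JSON array
    health = container.get("State", {}).get("Health")
    if health is None:
        # No HEALTHCHECK is configured for this container.
        return {"status": "none", "failing_streak": 0}
    return {
        "status": health["Status"],  # "starting", "healthy", or "unhealthy"
        "failing_streak": health.get("FailingStreak", 0),
    }


# Trimmed, made-up sample of the inspect output for a failing container.
sample = '[{"State": {"Health": {"Status": "unhealthy", "FailingStreak": 3, "Log": []}}}]'
print(summarize_health(sample))
```

In a real script you'd feed it the output of something like `subprocess.run(["docker", "inspect", name], capture_output=True, text=True).stdout`.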
Step 2: Set up a health endpoint in your app
Your app needs a /health route that returns 200 when healthy:
// Node.js/Express example
app.get('/health', (req, res) => {
  res.json({ status: 'ok', timestamp: new Date().toISOString() });
});

# Python/Flask example
@app.route('/health')
def health():
    return {"status": "ok"}, 200

// Go example
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
})
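A route that always returns 200 only proves the process is listening. One common variation is to run a few dependency checks and return 503 when any of them fails, which also makes `curl -f` exit non-zero. Here's a framework-agnostic sketch of that idea; `build_health_response` and the check names are hypothetical, and the lambdas stand in for real database or cache pings.

```python
from typing import Callable


def build_health_response(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict]:
    """Run each named check and return (HTTP status, JSON-serializable body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as a failure rather than crashing the route.
            results[name] = False
    healthy = all(results.values())
    status = 200 if healthy else 503
    return status, {"status": "ok" if healthy else "degraded", "checks": results}


# Hypothetical checks: replace the lambdas with real database or cache pings.
print(build_health_response({"database": lambda: True, "cache": lambda: False}))
```

Keep the checks cheap and give them short timeouts, since the health route runs on every probe.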
Step 3: docker-compose health check example
For local dev and staging environments with docker-compose:
version: '3.8'
services:
  api:
    build: .
    ports:
      - "3000:3000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 15s
    restart: unless-stopped
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: secret
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
Adding depends_on with condition: service_healthy to the api service ensures your API only starts after Postgres passes its health check:
  api:
    depends_on:
      db:
        condition: service_healthy
Step 4: Add external multi-region monitoring with Vigilmon
Docker's built-in health check runs inside the container, so it only proves the app answers on localhost. It won't catch:
- Network routing issues between your users and the server
- DNS failures
- SSL certificate problems
- The entire host going down
That's where external monitoring comes in.
- Go to vigilmon.online and sign up free
- Click Add Monitor
- Enter your container's public URL and health endpoint
- Set check interval to 1 minute
- Vigilmon probes from multiple regions — if 2+ agree it's down, you get alerted
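For a sense of what a single external check involves, here's a rough sketch of one probe written with Python's standard library. It's an illustration under assumptions, not how Vigilmon is implemented: the `probe` function, its return shape, and the injectable `fetch` parameter are all made up for the example.

```python
import ssl
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0, fetch=urllib.request.urlopen) -> dict:
    """Fetch `url` once and classify the outcome the way an outside client would see it."""
    try:
        with fetch(url, timeout=timeout) as response:
            return {"up": 200 <= response.status < 300, "reason": f"HTTP {response.status}"}
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return {"up": False, "reason": f"HTTP {exc.code}"}
    except urllib.error.URLError as exc:
        # DNS failures, refused connections, and TLS errors are wrapped in URLError.
        kind = "TLS error" if isinstance(exc.reason, ssl.SSLError) else "network error"
        return {"up": False, "reason": f"{kind}: {exc.reason}"}
    except OSError as exc:
        # Timeouts raised while reading the response arrive as plain OSError subclasses.
        return {"up": False, "reason": f"network error: {exc}"}
```

Calling `probe("https://your-app.example/health")` from a machine outside your host exercises DNS, TLS, and routing in one request, which is exactly what the in-container check can't see. The URL here is a placeholder.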
Step 5: Why multi-region matters for containers
Containers get restarted, redeployed, and migrated. During a rolling update, your container might be briefly unreachable to one region's probe but fine everywhere else. A single-probe tool calls that downtime. Multi-region consensus calls it a deploy.
Real downtime looks different: all regions see the same failure, consistently. That's what Vigilmon flags.
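The "all regions, consistently" pattern can be written down as a check over the last few probe rounds. This is only a sketch of the idea and not Vigilmon's algorithm; `is_real_outage`, the `required_rounds` default of 2, and the region names are assumptions for illustration.

```python
def is_real_outage(rounds: list[dict[str, bool]], required_rounds: int = 2) -> bool:
    """Return True when every region failed in each of the last `required_rounds` rounds.

    Each round maps a region name to True (reachable) or False (unreachable).
    """
    if len(rounds) < required_rounds:
        return False
    recent = rounds[-required_rounds:]
    return all(len(r) > 0 and not any(r.values()) for r in recent)


# A rolling deploy: one region misses once, then everything recovers.
deploy = [{"us": True, "eu": False}, {"us": True, "eu": True}]
# Real downtime: every region fails, round after round.
outage = [{"us": False, "eu": False}, {"us": False, "eu": False}]

print(is_real_outage(deploy))  # False
print(is_real_outage(outage))  # True
```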
What the free tier includes
- Unlimited monitors
- 1-minute check intervals
- Multi-region probes
- Slack + email alerts
- 90-day history charts
No credit card. No 14-day trial. Just sign up at vigilmon.online.
Summary
Production-ready Docker monitoring has two layers:
- Internal: Docker HEALTHCHECK for container orchestration and auto-restart
- External: Vigilmon for multi-region uptime checks that catch what Docker can't
Add both and you'll stop chasing phantom alerts and start trusting your monitoring.