DEV Community

Vigilmon

Monitor Docker containers with multi-region uptime checks - no false alerts

Running Docker containers in production means your app is portable, reproducible, and (hopefully) always up. But "always up" requires monitoring — and most single-probe tools create more noise than signal.

This post shows how to add multi-region uptime monitoring to Dockerized apps so you get paged when your container is actually down, not when a single probe has a bad day.

The false alert problem with Docker monitoring

Single-probe tools work like this: one server requests your endpoint every minute, and if the request times out, you get an alert. The same thing happens when the fault lies with the probe rather than your app. A routing issue on that one server or a 10-second hiccup at its ISP still pages you, often at 3 AM, for a problem that fixed itself.

Multi-region monitoring solves this with consensus: multiple probes in different geographic regions must agree that your container is unreachable before an alert fires. A blip seen by a single region is ignored, while real downtime, which every region sees, still alerts as soon as the probes agree.
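
The consensus rule can be sketched in a few lines of Python. This is only an illustration of the idea, not Vigilmon's actual algorithm: the region names, the should_alert helper, and the default quorum of 2 are assumptions made for the example.

```python
def should_alert(region_results: dict[str, bool], quorum: int = 2) -> bool:
    """Return True when at least `quorum` regions report the target as down.

    region_results maps a region name to True (reachable) or False
    (unreachable). The quorum value is an assumption for illustration.
    """
    failures = sum(1 for reachable in region_results.values() if not reachable)
    return failures >= quorum


# One region has a bad route: no alert.
blip = {"us-east": True, "eu-west": False, "ap-south": True}
# Every region fails: alert.
outage = {"us-east": False, "eu-west": False, "ap-south": False}

print(should_alert(blip))    # False
print(should_alert(outage))  # True
```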

Vigilmon uses this approach — free tier, no credit card.

Step 1: Add a health check to your Docker container

Docker has built-in health checks. Add one to your Dockerfile:

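FROM node:20-slim
WORKDIR /app

# node:20-slim does not ship curl, so install it for the health check
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000

# Health check: curl the /health endpoint every 30s
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

CMD ["node", "server.js"]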

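The HEALTHCHECK instruction tells Docker to run this command inside the container at every interval. After 3 consecutive failures, Docker marks the container as unhealthy. What happens next depends on the platform: Docker Swarm replaces unhealthy tasks, while plain docker run only reports the status. Kubernetes ignores HEALTHCHECK entirely and relies on its own liveness and readiness probes, and on ECS the health check has to be defined in the task definition rather than in the image.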
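
The consecutive-failure rule can be sketched as a small state machine. The Python below is a simplified model for illustration, not Docker's implementation: the health_status helper is hypothetical, and the start period is ignored so that every failed probe counts.

```python
def health_status(probe_results: list[bool], retries: int = 3) -> str:
    """Replay probe results (True = exit code 0) and return the final status.

    Simplification: the start period is ignored, so every failure counts.
    """
    status = "starting"
    failing_streak = 0
    for ok in probe_results:
        if ok:
            # Any passing probe resets the streak and marks the container healthy.
            failing_streak = 0
            status = "healthy"
        else:
            failing_streak += 1
            if failing_streak >= retries:
                status = "unhealthy"
    return status


print(health_status([True, False, False]))         # healthy: only 2 failures in a row
print(health_status([True, False, False, False]))  # unhealthy: 3 failures in a row
print(health_status([False, False, True, False]))  # healthy: the success reset the streak
```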

Step 2: Set up a health endpoint in your app

Your app needs a /health route that returns 200 when healthy:

// Node.js/Express example
const express = require('express');
const app = express();

app.get('/health', (req, res) => {
  res.json({ status: 'ok', timestamp: new Date().toISOString() });
});

app.listen(3000);

# Python/Flask example
from flask import Flask

app = Flask(__name__)

@app.route('/health')
def health():
    return {"status": "ok"}, 200

// Go example (goes inside main; needs the "encoding/json" and "net/http" imports)
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
})
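
To see what a probe expects from this route, here is a minimal Python sketch of a check that behaves roughly like curl -f: it passes only on a 2xx response. The is_healthy helper and the localhost URL are assumptions for illustration, and only the standard library is used.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True only for a 2xx response, similar in spirit to `curl -f`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # HTTPError (4xx/5xx) is a URLError subclass; timeouts and refused
        # connections surface as URLError or OSError.
        return False


if __name__ == "__main__":
    # Hypothetical local address; adjust to wherever the app listens.
    print(is_healthy("http://localhost:3000/health"))
```

In an image that has Python but no curl, a script along these lines could back the health check by exiting non-zero whenever is_healthy returns False.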

Step 3: docker-compose health check example

For local dev and staging environments with docker-compose:

version: '3.8'
services:
  api:
    build: .
    ports:
      - "3000:3000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 15s
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: secret
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

To make the API wait for Postgres, add depends_on with condition: service_healthy to the api service. Compose then starts the API only after the db health check passes:

  api:
    depends_on:
      db:
        condition: service_healthy
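
The service_healthy condition keys off the health status Docker records for the container, which also appears in the JSON printed by docker inspect under State.Health.Status. The Python sketch below reads that field. The read_health helper, the "none" fallback label, and the sample JSON are illustrative assumptions: the sample is a trimmed, hand-written stand-in, not real output.

```python
import json


def read_health(inspect_output: str) -> str:
    """Extract State.Health.Status from `docker inspect <container>` JSON.

    Returns "none" (this sketch's own label) when the container defines
    no health check and the Health key is missing.
    """
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    state = containers[0].get("State", {})
    return state.get("Health", {}).get("Status", "none")


# Trimmed, hand-written sample of the relevant fields (not real output).
sample = (
    '[{"Name": "/db", "State": {"Status": "running", '
    '"Health": {"Status": "healthy", "FailingStreak": 0}}}]'
)
print(read_health(sample))  # healthy
```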

Step 4: Add external multi-region monitoring with Vigilmon

Docker's built-in health check runs inside the container, so it only proves that the app answers locally. It won't catch:

  • Network routing issues between your users and the server
  • DNS failures
  • SSL certificate problems
  • The entire host going down

That's where external monitoring comes in.
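
The failure types listed above look different to an external probe. As a rough sketch, the Python below maps the exceptions that urllib.request.urlopen can raise to coarse labels. The classify_failure helper and its label names are assumptions for illustration, not how any particular monitoring service reports errors.

```python
import socket
import ssl
import urllib.error


def classify_failure(exc: Exception) -> str:
    """Map an exception raised by urllib.request.urlopen to a coarse label."""
    # HTTPError is a URLError subclass, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http_{exc.code}"  # the server answered with 4xx/5xx
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, socket.gaierror):
        return "dns"  # hostname did not resolve
    if isinstance(reason, ssl.SSLError):
        return "tls"  # handshake or certificate problem
    if isinstance(reason, (TimeoutError, socket.timeout)):
        return "timeout"  # host or route not responding
    if isinstance(reason, ConnectionRefusedError):
        return "refused"  # host is up but nothing is listening
    return "network"


dns_error = urllib.error.URLError(socket.gaierror(-2, "Name or service not known"))
tls_error = urllib.error.URLError(ssl.SSLError("certificate verify failed"))
print(classify_failure(dns_error))  # dns
print(classify_failure(tls_error))  # tls
```

A check that runs inside the container never resolves the public hostname or negotiates TLS, so it cannot produce the dns or tls outcomes at all.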

  1. Go to vigilmon.online and sign up free
  2. Click Add Monitor
  3. Enter your container's public URL and health endpoint
  4. Set check interval to 1 minute
  5. Vigilmon probes from multiple regions — if 2+ agree it's down, you get alerted

Step 5: Why multi-region matters for containers

Containers get restarted, redeployed, and migrated. During a rolling update, your container might be briefly unreachable to one region's probe but fine everywhere else. A single-probe tool calls that downtime. Multi-region consensus calls it a deploy.

Real downtime looks different: all regions see the same failure, consistently. That's what Vigilmon flags.

What the free tier includes

  • Unlimited monitors
  • 1-minute check intervals
  • Multi-region probes
  • Slack + email alerts
  • 90-day history charts

No credit card. No 14-day trial. Just sign up at vigilmon.online.

Summary

Production-ready Docker monitoring has two layers:

  1. Internal: Docker HEALTHCHECK, which gives each container a health status that Compose and orchestrators can act on
  2. External: Vigilmon for multi-region uptime checks that catch what Docker can't

Add both and you'll stop chasing phantom alerts and start trusting your monitoring.
