Luc Allaire

Zero-Config DNS and Monitoring for Your Traefik Homelab

Every Traefik service you expose already has a Host() rule that declares its public hostname. That information exists exactly once — in a Docker label — and propagates nowhere useful.

So you end up maintaining three or four systems by hand: Cloudflare for public DNS, NetBird for internal VPN-only hostnames, and Uptime Kuma for monitoring, with groups, tags, and status pages configured per service. Add a container and you have to update each one manually. Remove it four months later and the stale records linger unless you remember to clean them up.

traefik-mesh-companion makes the container definition the single source of truth and syncs the rest automatically.

What It Does

A Go sidecar that watches the Docker socket and syncs your Traefik routing labels to:

  • NetBird — internal mesh VPN DNS records
  • Cloudflare — A records or CNAMEs to a CF Tunnel endpoint
  • Uptime Kuma — monitors, status page groups, tags, domain bindings
  • Gatus (via Gatus Bridge) — endpoints and groups

A single Docker Compose sidecar. No Kubernetes, no Helm, no operator.

How It Works

Split-Horizon DNS via Entrypoints

No new label namespace for DNS routing. Two env vars filter your existing entrypoint labels:

INTERNAL_FILTER=internal   # routers on this entrypoint → NetBird
EXTERNAL_FILTER=https      # routers on this entrypoint → Cloudflare

Your existing Traefik labels stay exactly as-is:

# Matches INTERNAL_FILTER → NetBird only
traefik.http.routers.dashboard.rule: "Host(`dashboard.internal.example.com`)"
traefik.http.routers.dashboard.entrypoints: internal

# Matches EXTERNAL_FILTER → Cloudflare only
traefik.http.routers.api.rule: "Host(`api.example.com`)"
traefik.http.routers.api.entrypoints: https

Force-override per container if needed:

mesh.dns.internal: "false"            # exclude from internal pipeline
mesh.routers.admin.managed: "false"   # exclude this router from everything
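
The classification logic can be pictured as a small function. This is a simplified sketch, not the companion's actual code; the `Pipeline` names and signature are illustrative:

```go
package main

import "fmt"

// Pipeline names are illustrative; the companion's internals may differ.
type Pipeline string

const (
	Internal Pipeline = "netbird"
	External Pipeline = "cloudflare"
	Skipped  Pipeline = "skipped"
)

// classify decides which DNS pipeline a router belongs to by comparing its
// entrypoints against the INTERNAL_FILTER / EXTERNAL_FILTER values.
func classify(entrypoints []string, internalFilter, externalFilter string) Pipeline {
	for _, ep := range entrypoints {
		switch ep {
		case internalFilter:
			return Internal
		case externalFilter:
			return External
		}
	}
	return Skipped
}

func main() {
	fmt.Println(classify([]string{"internal"}, "internal", "https")) // netbird
	fmt.Println(classify([]string{"https"}, "internal", "https"))    // cloudflare
	fmt.Println(classify([]string{"metrics"}, "internal", "https"))  // skipped
}
```

A router on an entrypoint that matches neither filter is simply left alone, which is why no new labels are needed for the common case.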

The Rule Parser

A pure-Go, regex-based parser that builds an AST of the rule. It handles compound rules:

(Host(`a.example.com`) || Host(`b.example.com`)) && PathPrefix(`/v2`)

Both hostnames are extracted for DNS. PathPrefix is captured separately for monitor URL construction. HostRegexp is intentionally skipped: you can't derive a static DNS record from a dynamic pattern.
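
The extraction idea can be shown with plain regexes. The companion's real parser builds an AST; this is only a flattened sketch of the matcher-extraction step:

```go
package main

import (
	"fmt"
	"regexp"
)

// hostRe matches Host(`...`) matchers. HostRegexp(`...`) is deliberately
// not matched: "Host" is never followed directly by "(" there, and a
// dynamic pattern cannot become a static DNS record anyway.
var hostRe = regexp.MustCompile("\\bHost\\(`([^`]+)`\\)")

// pathRe captures a PathPrefix for monitor URL construction.
var pathRe = regexp.MustCompile("PathPrefix\\(`([^`]+)`\\)")

// extractHosts pulls every static hostname out of a compound Traefik rule.
func extractHosts(rule string) []string {
	var hosts []string
	for _, m := range hostRe.FindAllStringSubmatch(rule, -1) {
		hosts = append(hosts, m[1])
	}
	return hosts
}

func main() {
	rule := "(Host(`a.example.com`) || Host(`b.example.com`)) && PathPrefix(`/v2`)"
	fmt.Println(extractHosts(rule)) // [a.example.com b.example.com]
	if m := pathRe.FindStringSubmatch(rule); m != nil {
		fmt.Println(m[1]) // /v2
	}
}
```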

Monitoring Label Hierarchy

The mesh.routers.* namespace sits outside Traefik's label schema, so Traefik's validator ignores it. Properties resolve through a fallback hierarchy:

mesh.routers.<router_name>.kuma.<property>   ← highest priority
mesh.routers.<router_name>.<property>
mesh.kuma.<property>
mesh.<property>                              ← lowest priority

Real example:

traefik.http.routers.api.rule: "Host(`api.example.com`)"
traefik.http.routers.api.entrypoints: https

mesh.routers.api.kuma.url: "/health"
mesh.routers.api.kuma.accepted_status_codes: "200, 204"
mesh.routers.api.kuma.interval: "30"
mesh.kuma.tags: "backend, prod:green"
mesh.kuma.pages: "public-status:APIs"
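
Resolution of that hierarchy amounts to a most-specific-key-wins lookup. A minimal sketch, assuming labels arrive as a flat map (the function name and signature are hypothetical):

```go
package main

import "fmt"

// resolve walks the fallback hierarchy for a monitoring property, from the
// most specific label key to the least specific, and returns the first hit.
func resolve(labels map[string]string, router, provider, prop string) (string, bool) {
	keys := []string{
		fmt.Sprintf("mesh.routers.%s.%s.%s", router, provider, prop), // highest priority
		fmt.Sprintf("mesh.routers.%s.%s", router, prop),
		fmt.Sprintf("mesh.%s.%s", provider, prop),
		fmt.Sprintf("mesh.%s", prop), // lowest priority
	}
	for _, k := range keys {
		if v, ok := labels[k]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	labels := map[string]string{
		"mesh.kuma.interval":             "60", // container-wide default
		"mesh.routers.api.kuma.interval": "30", // router-specific override
	}
	v, _ := resolve(labels, "api", "kuma", "interval")
	fmt.Println(v) // 30: the router-specific key wins
}
```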

Tags use deterministic djb2 hashing: the same tag name always maps to the same color across nodes and restarts. Override with a hex value, e.g. prod:#22c55e.
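
The hash itself is Bernstein's classic djb2. A sketch of the name-to-color mapping; the palette here is hypothetical, but the determinism argument is the point:

```go
package main

import "fmt"

// palette is an illustrative set of tag colors; the companion's actual
// palette may differ, but the mapping principle is the same.
var palette = []string{"#ef4444", "#f97316", "#22c55e", "#3b82f6", "#a855f7"}

// djb2 is Bernstein's string hash: h = h*33 + c, seeded with 5381.
// uint32 arithmetic wraps on overflow, which is exactly what we want.
func djb2(s string) uint32 {
	var h uint32 = 5381
	for _, c := range []byte(s) {
		h = h*33 + uint32(c)
	}
	return h
}

// colorFor maps a tag name to a palette entry. Because djb2 is pure and
// deterministic, every node computes the same color for the same tag,
// with no shared state and no coordination.
func colorFor(tag string) string {
	return palette[djb2(tag)%uint32(len(palette))]
}

func main() {
	fmt.Println(colorFor("backend") == colorFor("backend")) // true on every node
}
```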

Quick Start

services:
  mesh-companion:
    image: ghcr.io/wolf-infra/traefik-mesh-companion:stable
    container_name: traefik-mesh-companion
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - SYNC_INTERVAL=1m
      - LOG_LEVEL=info
      # Internal (NetBird)
      - INTERNAL_PROVIDER=netbird
      - INTERNAL_FILTER=internal
      - INTERNAL_CLEANUP=true
      - NETBIRD_API_TOKEN=your_netbird_token
      - NETBIRD_TARGET_IP=100.64.0.5
      # External (Cloudflare)
      - EXTERNAL_PROVIDER=cloudflare
      - EXTERNAL_FILTER=https
      - EXTERNAL_CLEANUP=true
      - CLOUDFLARE_API_TOKEN=your_cf_token
      - CLOUDFLARE_TARGET_DOMAIN=your-tunnel-uuid.cfargotunnel.com
      # Monitoring (Uptime Kuma)
      - MONITOR_PROVIDER=kuma
      - KUMA_URL=http://kuma.example.com
      - KUMA_USERNAME=admin
      - KUMA_PASSWORD=${KUMA_PASS}
      - KUMA_AUTO_ENABLE=true
      - KUMA_GLOBAL_STATUS_PAGE=home-lab

Use the stable tag; it tracks the latest release. latest tracks main and is explicitly experimental.

Advanced: Distributed Coordinator

Running multiple edge nodes that write to one Uptime Kuma instance? They face race conditions on status page writes: both read the current state, both modify it, and the last write stomps the other's changes.

The companion ships a built-in Distributed Coordinator. One node is the server. Clients provision monitors locally and forward status page attachment operations to the server for sequential processing.

# Primary node
- KUMA_COORDINATOR_MODE=server
- KUMA_COORDINATOR_PORT=8081

# Other nodes
- KUMA_COORDINATOR_MODE=client
- KUMA_COORDINATOR_URL=http://primary:8081

No external queue is required, and the coordinator is stateless: clients resend their full state on reconnect.
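
The core trick is that all writes funnel through one goroutine, so read-modify-write cycles can never interleave. A minimal sketch of that pattern, with hypothetical types standing in for the real coordinator:

```go
package main

import (
	"fmt"
	"sync"
)

// attachOp is a status-page attachment request forwarded by a client node.
// The types and fields here are illustrative, not the companion's real ones.
type attachOp struct {
	page, monitor string
}

// coordinator serializes all status-page mutations through one channel,
// consumed by exactly one goroutine, so concurrent clients cannot race.
type coordinator struct {
	ops  chan attachOp
	wg   sync.WaitGroup
	page []string // the "status page" state, mutated by a single goroutine
}

func newCoordinator() *coordinator {
	c := &coordinator{ops: make(chan attachOp)}
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		for op := range c.ops { // sequential processing: no lost updates
			c.page = append(c.page, op.monitor)
		}
	}()
	return c
}

func (c *coordinator) close() { close(c.ops); c.wg.Wait() }

func main() {
	c := newCoordinator()
	var clients sync.WaitGroup
	for i := 0; i < 3; i++ { // three "edge nodes" writing concurrently
		clients.Add(1)
		go func(i int) {
			defer clients.Done()
			c.ops <- attachOp{page: "home-lab", monitor: fmt.Sprintf("svc-%d", i)}
		}(i)
	}
	clients.Wait()
	c.close()
	fmt.Println(len(c.page)) // 3: every attachment applied, none lost
}
```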

Try It

GitHub: github.com/wolf-infra/traefik-mesh-companion

Full env var reference, label override docs, and Gatus Bridge config are in the README. Additional DNS backends are in development; the core.Processor interface is designed for exactly this, so adding new DNS or monitoring backends is straightforward. PRs welcome.
