DEV Community

Guatu

Posted on • Originally published at guatulabs.dev

AdGuard Home: Network-Wide DNS Filtering with Failover

DNS is the single point of failure that makes everyone in the house complain that "the internet is down" when, in reality, your DNS container just crashed. I've spent too much time as the sole admin of my network having to manually flip DNS settings on my router because a single AdGuard Home instance decided to stop responding. If you're running this in a homelab, you can't just set it and forget it. You need a failover strategy that doesn't require you to touch a CLI while your family is staring at you.

The mistake most people make is trusting the default upstream behavior. They add three upstream servers and assume AdGuard Home will magically route around a dead one instantly. In practice, depending on your version and config, you can still hit timeouts that feel like a total outage. I've moved my setup to a Kubernetes deployment using MetalLB to give it a static IP, but the real win is the explicit failover logic in AdGuardHome.yaml.
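The MetalLB piece is just a LoadBalancer Service pinned to a fixed address. Here's a rough sketch — the IP, namespace, and selector labels are assumptions you'd match to your own cluster, and serving both UDP and TCP 53 on one LoadBalancer Service requires a reasonably recent Kubernetes:

```yaml
# adguard-service.yaml -- sketch of a MetalLB-backed Service for AdGuard Home.
# The address and labels below are placeholders; match them to your
# IPAddressPool and chart's pod labels.
apiVersion: v1
kind: Service
metadata:
  name: adguard-home-dns
  namespace: network
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.1.53
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: adguard-home
  ports:
    - name: dns-udp
      port: 53
      protocol: UDP
      targetPort: 53
    - name: dns-tcp
      port: 53
      protocol: TCP
      targetPort: 53
```

With a pinned address like this, the router's DHCP scope can hand out 192.168.1.53 as the DNS server and never needs touching again, even if the pod reschedules to another node.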

I prefer using a combination of Cloudflare and Quad9 for the primary upstreams, with a dedicated fallback. This ensures that if my primary DNS providers have a routing issue, the system pivots to the fallback without dropping the request.

# AdGuardHome.yaml snippet (dns section)
dns:
  upstream_dns:
    - "1.1.1.1"
    - "1.0.0.1"
    - "9.9.9.9"
  # Query all upstreams in parallel and take the fastest answer
  upstream_mode: parallel
  # Give up on an unresponsive upstream after 10 seconds
  upstream_timeout: 10s
  # Only queried when every regular upstream fails
  fallback_dns:
    - "8.8.8.8"

For those running this on K8s, don't skimp on memory. I initially set my memory limit too low and watched the OOM killer terminate the pod every time it loaded a large blocklist. I now pin my resources to ensure stability, especially when integrating cert-manager for automated TLS to secure the dashboard.

helm install adguard-home k8s-at-home/adguard-home \
  --namespace network \
  --create-namespace \
  --set image.tag=latest \
  --set resources.limits.memory=1Gi \
  --set resources.requests.memory=256Mi
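The cert-manager side is a standard Certificate resource. A minimal sketch, assuming a ClusterIssuer named letsencrypt-prod and a hostname of your choosing — both are placeholders here:

```yaml
# adguard-cert.yaml -- sketch of a cert-manager Certificate for the dashboard.
# Issuer name and dnsNames are assumptions; use your own issuer and domain.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: adguard-dashboard
  namespace: network
spec:
  secretName: adguard-dashboard-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - adguard.example.com
```

cert-manager writes the signed cert into the adguard-dashboard-tls Secret, which your Ingress (or AdGuard Home's own TLS settings, if you mount it) can then consume.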

The biggest lesson here is that "high availability" for DNS isn't just about having two pods. It's about how the system handles the gap between a server being "up" and a server actually returning a valid record. If you're building out larger infrastructure, I've found that combining this with a strict manifest validation pipeline prevents the kind of YAML typos that can take your entire network offline.
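A validation gate like that can be as small as one CI job. A sketch assuming GitHub Actions, a manifests/ directory, and the yamllint and kubeconform tools — adjust paths and versions to your repo:

```yaml
# .github/workflows/validate.yaml -- sketch of a manifest-validation gate.
# Paths and tool choices are assumptions; swap in your own layout and linters.
name: validate-manifests
on: [pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint YAML syntax
        run: |
          pip install yamllint
          yamllint manifests/
      - name: Validate against Kubernetes schemas
        run: |
          go install github.com/yannh/kubeconform/cmd/kubeconform@latest
          kubeconform -strict -summary manifests/
```

Catching a bad indent or an unknown field at PR time is a lot cheaper than discovering it when the DNS pod fails to start.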

Keep your upstreams diverse and your memory limits realistic.
