Here's something that took longer to debug than it should have.
The setup
Running Caddy as a reverse proxy on a systemd-based Linux machine. Cert renewal via ACME. Nothing in the logs looks alarming at a glance. Then one day the cert is expired and nobody noticed for two days.
The cause
systemd-resolved sometimes returns SERVFAIL for specific DNS queries, depending on the state of its upstream resolvers. It's not consistent. Some zones resolve fine. Some silently fail. Caddy's ACME client needs DNS to reach the CA's endpoint (and, for the DNS challenge, to check its own records). When one of those lookups comes back as a failure from systemd-resolved, the renewal just... doesn't happen.
What makes this annoying is that resolvectl status (systemd-resolve --status on older releases) shows nothing wrong. dig might work fine against 8.8.8.8. The stub resolver at 127.0.0.53 is the one lying to your application, and by default it doesn't log it anywhere useful.
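One way to catch the stub in the act is to send the same query to the stub and to a public resolver and compare the response codes. The sketch below does that with a hand-built DNS packet over UDP, using only the standard library. The function names are made up for this post, and the hostname in the usage note is an assumption (Let's Encrypt's ACME endpoint); swap in whatever your CA uses.

```python
import socket
import struct

# Response codes most likely to matter here; others fall back to "RCODE<n>".
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query: one question, recursion desired, class IN."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


def parse_rcode(response: bytes) -> str:
    """Read the response code from the low 4 bits of the DNS flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def probe(name: str, server: str, timeout: float = 3.0) -> str:
    """Ask one server for an A record and return the rcode (or the error)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(512)
        except OSError as exc:  # covers timeouts and unreachable servers
            return f"ERROR({exc})"
    return parse_rcode(data)


def compare(name: str, servers=("127.0.0.53", "1.1.1.1")) -> dict:
    """Probe the same name against each server; a mismatch is the red flag."""
    return {server: probe(name, server) for server in servers}
```

Calling compare("acme-v02.api.letsencrypt.org") on the affected machine returns one rcode per server. If the stub says SERVFAIL while the public resolver says NOERROR, you've found it. Since the failure is intermittent, this is more useful run from a timer than run once.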
The fix
Three ways to deal with it:
1. Bypass the stub resolver
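Point Caddy (or Go's net stack generally) at a public resolver directly. As far as I can tell, Caddy has no single option that redirects all of its lookups. The closest thing is the resolvers subdirective of tls, which only covers the lookups made for the DNS challenge. In your Caddyfile (example.com is a placeholder):
example.com {
    tls {
        resolvers 1.1.1.1
    }
}
Or set GODEBUG=netdns=go to force Go's built-in resolver instead of the libc one. Note that the Go resolver still reads /etc/resolv.conf, so this only gets you around the stub if that file lists real upstream servers rather than 127.0.0.53.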
2. Restart systemd-resolved
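systemctl restart systemd-resolved clears out whatever broken state it accumulated. This is a temporary fix. You'll hit it again.
More permanently, check where /etc/resolv.conf points. On most systemd distributions it is a symlink to /run/systemd/resolve/stub-resolv.conf, which lists only the stub at 127.0.0.53. Linking it to /run/systemd/resolve/resolv.conf instead exposes the real upstream servers to applications, so they stop relying on the stub resolver for everything.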
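If you want to check this from a script rather than by eye, here is a rough sketch that reads the nameserver lines and reports whether the stub is the only thing listed. The helper names are hypothetical, and it assumes the stub listens on its default address, 127.0.0.53.

```python
from pathlib import Path

# Default listen address of the systemd-resolved stub (assumed unchanged).
STUB_ADDRESS = "127.0.0.53"


def nameservers(resolv_conf_text: str) -> list[str]:
    """Collect the addresses from 'nameserver' lines, skipping everything else."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def relies_on_stub(resolv_conf_text: str) -> bool:
    """True when every listed nameserver is the local stub."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and all(s == STUB_ADDRESS for s in servers)


def describe(path: str = "/etc/resolv.conf") -> str:
    """Summarize where the file really lives and whether it is stub-only."""
    resolv = Path(path)
    verdict = "stub only" if relies_on_stub(resolv.read_text()) else "has upstreams"
    return f"{path} -> {resolv.resolve()} ({verdict})"
```

Calling describe() on the affected machine shows both the symlink target and the verdict in one line, which is usually enough to tell which of the two resolved-generated files you are actually using.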
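3. Use DNS-over-TLS
systemd-resolved doesn't speak DNS-over-HTTPS, but it does support DNS-over-TLS. If you want to stay with resolved but make it less fragile, set DNSOverTLS= in /etc/systemd/resolved.conf and point DNS= at an upstream that supports it, instead of using plain UDP. Won't solve the SERVFAIL case but avoids a class of MITM issues.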
The symptom worth knowing
The specific symptom: Caddy logs say renewal failed but give no obvious reason. The certificate being served is close to expiry (openssl s_client against the site shows the notAfter date). Everything else keeps working, because the old cert stays valid until it actually expires. So nobody complains until it does, and then it becomes your problem on a Monday morning.
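Since the whole problem is that nobody notices, an external expiry check is cheap insurance. Below is a minimal sketch using Python's ssl module. The function names and the 20-day threshold are my own assumptions, not anything Caddy defines. Pick a threshold that sits inside the window where a healthy renewal should already have happened.

```python
import socket
import ssl
import time


def days_left(not_after: str, now: float) -> float:
    """Days between 'now' (epoch seconds) and a cert's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def needs_attention(not_after: str, now: float, threshold_days: float = 20.0) -> bool:
    """True when the cert has fewer than threshold_days remaining."""
    return days_left(not_after, now) < threshold_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate the server presents.

    Uses normal verification, so an already-expired cert raises
    ssl.SSLCertVerificationError instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check(host: str) -> str:
    """One-line status suitable for a cron job or timer unit."""
    not_after = fetch_not_after(host)
    remaining = days_left(not_after, time.time())
    status = "WARN" if needs_attention(not_after, time.time()) else "ok"
    return f"{status}: {host} expires {not_after} ({remaining:.1f} days left)"
```

Run check("your.domain") from a different machine than the one running Caddy, so the check doesn't share the same broken resolver.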
Bottom line
If you're running Caddy on systemd-resolved and your certs are expiring unexpectedly, check the stub resolver before checking anything else. It's the kind of failure that hides in plain sight because "DNS is working."
Not a sponsor. Just something that wasted an afternoon.