DEV Community

How We Built Our Own DNS Server

Jonas Scholz on April 17, 2026

We wrote a production DNS server in ~1000 lines of Go, migrated thousands of records off Hetzner DNS, and dropped propagation time from "up to 90 m...
 
Lukas Mauser

Next: How we build our own IXPs

Ferdinand Mütsch

Super interesting article, thanks for sharing! 🤓 I was only wondering: if your primary goal was to just self-host your authoritative DNS instead of using a managed service, why did you decide to go "all in" and write your server instead of just using CoreDNS, Unbound or the like?

Jonas Scholz

I'm a control freak and didn't like how a lot of the open source things were built, especially the APIs, observability, and deployment options. Also keep in mind that we really need about 1% of what CoreDNS offers, so building it myself really wasn't that hard!

𝗝𝗼𝗵𝗻 • Edited

This was a really enjoyable read - thanks for sharing. I love how you walked through the problem and the trade-offs you considered; the write-up feels honest and practical, not just a brag about shipping something cool.

I especially appreciate the hidden primary pattern and the pragmatic use of Postgres LISTEN/NOTIFY to keep things simple and observable.

Thanks again for the clear explanations and the code examples; they make the whole approach feel achievable.

Cyber Daemon

😂 I love the sheer "fine, I'll do it myself" energy of looking at a 90-minute DNS propagation delay and casually writing a thousand lines of Go to replace a core infrastructure pillar before lunch. Using Postgres as an event bus instead of spinning up Kafka is the kind of beautifully pragmatic, chaotic-good architecture choice that makes me want to high-five my screen! Woohoo! ✋️

Mykola Kondratiuk

the postgres-as-event-bus choice is the one that would have kept me up at night. curious how you handle write amplification when hundreds of zones change at once.

Jonas Scholz

we debounce + only notify if the zone actually changed, so the damage is pretty limited!

Mykola Kondratiuk

oh that's actually clean - debounce + conditional notify means quiet zones barely touch the write log. makes sense.

AI Bug Slayer 🐞

Cutting DNS propagation from 90 minutes to seconds is a massive DX improvement! Building your own DNS server gives you full control that managed services simply can't match.

Shayan

Great read! Thanks for sharing, Jonas.

ClawnCore

Thanks Jonas

mote

Building your own DNS server is one of those projects that reveals how much complexity is actually hiding behind a utility we all take for granted.

What's your approach for handling DNS rebinding protection? It's the attack vector that most hobbyist DNS implementations skip, but it's also what makes the difference between a nameserver that's safe to expose to the internet versus one that needs to stay behind a firewall.

Also — how are you handling TTLs in your cache? One of the trickier operational realities of DNS is that when you get a TTL wrong, the fix takes as long as the original TTL to propagate, with no way to force clients to respect your correction.

Jonas Scholz

Obviously an LLM slop reply, but I will reply anyway:
1) DNS rebinding applies to resolvers, not authoritative servers.
2) TTL is just a tradeoff. We do 5 minutes.

mote

Thanks for the correction — you're right, I conflated resolver and authoritative server concerns. For authoritative servers, the rebinding risk is lower since they serve fixed records rather than following redirects.

Out of curiosity: do you see the TTL=5min as a good default for most use cases, or is it more of a conservative operational choice? I imagine for frequently-changing records you'd want something shorter, but for stable domains the 5-minute window for propagation fixes seems reasonable.