It is always DNS... Sometimes
It might seem as if the DNS, or the "Domain Name System", is highly unreliable. A major AWS outage was traced to an incorrectly set-up DNS entry, and the old hands (and new hands) in the industry all smiled knowingly and said "it's always DNS" - even though it was really the DynamoDB global table names at fault, the DNS itself was working perfectly, and nobody else would have been using that DNS entry directly.
The DNS isn't really the cause of many outages and problems - but it is involved in lots of them, and has privacy leaks and performance issues. And this is because it is usually so reliable we use it for literally everything, and use it with an intensity that results in us pouring information into it as well as relying totally on the information we get back.
In this post, I'll explain why we have the DNS, how it works, how to make it faster, and how to make it more private. And how it goes wrong.
And also I'll prove, beyond a shadow of a doubt, that my servers are faster than Google's.
A Gentleman's Primer in the DNS
So let's remind ourselves about the Domain Name System. The Internet operates on addresses - originally NCP, then IPv4, and then more recently, often IPv6. These are magic numbers - we tend to think of IPv4 addresses as four 8-bit numbers, but really it's a single 32-bit number, and IPv6 addresses are a giant 128-bit number. But thanks to Vint Cerf's epiphany on the back of a napkin, they're split at multiple points to form subnet addresses, which makes routing - the decision of where to send the packets - really easy (and fast!).
Many of us probably remember half a dozen IPv4 addresses without thinking. We're in the habit of typing 127.0.0.1 instead of localhost, and these days having to train ourselves the other way lest we miss out on ::1, the IPv6 version. If you've a static IPv4 address from your ISP, you might memorize that.
But you probably don't memorise all the possible IP addresses for Google, for example - instead you want to type www.google.com and get back the address.
This was originally solved on the Internet by the simple method of having a text file containing the names and their addresses, and sharing this around via email or whatever. Convenient, but it turns out this didn't scale well - though it still exists as /etc/hosts on UNIX systems as a fallback.
And, moreover, you might want to know more than just the address - you might want to know how to send email there, or where the XMPP server is, or what settings to use for TLS, and so on - making a single text file get fairly complicated.
What we needed was a distributed database.
A Lady's Primer in the DNS
First things first: because it's "The Domain Name System", real experts call it "The DNS". Top tip to make yourself look like you know what you're talking about, that.
Domain names are a sequence of labels, each followed by a dot. Each label is - normally - lower case alphanumeric characters, plus '-'. Note that at a low level, the DNS never handles Unicode - and, from a purely technical standpoint, a label can contain anything at all (including, wildly, dots and spaces).
The DNS has the concept of a "zone" - something you might normally call a "domain", though it's slightly different. Zones contain records - the only mandatory records are the "Start Of Authority" (SOA) record, which lists settings for the zone, and the "Name Server" (NS) records, which list the nameservers. Those are given as fully qualified domain names themselves, which does give us an interesting chicken and egg problem. These records will have a key which will be the same domain name as the name of the zone.
NS records with a different domain name essentially state that the listed nameservers will answer queries for this other domain (as another zone).
A and AAAA records list the IPv4 and IPv6 addresses for the label. There's other record types too - I'm going to skip over them, but they're often quite important.
To query the DNS from first principles, you'll need a list of the Root Servers. We can get one just by using dig, like this:
dig . NS
This is asking for the nameservers for the domain name ".". Remember I said it was labels followed by a dot? Google's domain is really "google.com.", but we skip the last dot usually. Putting it there - on some programs at least - tells the program not to look on our local domains.
The response will be something like:
; <<>> DiG 9.18.39-0ubuntu0.24.04.2-Ubuntu <<>> . NS
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54773
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 27
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;. IN NS
;; ANSWER SECTION:
. 309030 IN NS e.root-servers.net.
. 309030 IN NS h.root-servers.net.
. 309030 IN NS a.root-servers.net.
. 309030 IN NS j.root-servers.net.
. 309030 IN NS b.root-servers.net.
. 309030 IN NS l.root-servers.net.
. 309030 IN NS c.root-servers.net.
. 309030 IN NS g.root-servers.net.
. 309030 IN NS d.root-servers.net.
. 309030 IN NS f.root-servers.net.
. 309030 IN NS k.root-servers.net.
. 309030 IN NS i.root-servers.net.
. 309030 IN NS m.root-servers.net.
;; ADDITIONAL SECTION:
a.root-servers.net. 309030 IN A 198.41.0.4
b.root-servers.net. 309030 IN A 170.247.170.2
c.root-servers.net. 309030 IN A 192.33.4.12
d.root-servers.net. 309030 IN A 199.7.91.13
e.root-servers.net. 309030 IN A 192.203.230.10
f.root-servers.net. 309030 IN A 192.5.5.241
g.root-servers.net. 309030 IN A 192.112.36.4
h.root-servers.net. 309030 IN A 198.97.190.53
i.root-servers.net. 309030 IN A 192.36.148.17
j.root-servers.net. 309030 IN A 192.58.128.30
k.root-servers.net. 309030 IN A 193.0.14.129
l.root-servers.net. 309030 IN A 199.7.83.42
m.root-servers.net. 309030 IN A 202.12.27.33
a.root-servers.net. 309030 IN AAAA 2001:503:ba3e::2:30
b.root-servers.net. 309030 IN AAAA 2801:1b8:10::b
c.root-servers.net. 309030 IN AAAA 2001:500:2::c
d.root-servers.net. 309030 IN AAAA 2001:500:2d::d
e.root-servers.net. 309030 IN AAAA 2001:500:a8::e
f.root-servers.net. 309030 IN AAAA 2001:500:2f::f
g.root-servers.net. 309030 IN AAAA 2001:500:12::d0d
h.root-servers.net. 309030 IN AAAA 2001:500:1::53
i.root-servers.net. 309030 IN AAAA 2001:7fe::53
j.root-servers.net. 309030 IN AAAA 2001:503:c27::2:30
k.root-servers.net. 309030 IN AAAA 2001:7fd::1
l.root-servers.net. 309030 IN AAAA 2001:500:9f::42
m.root-servers.net. 309030 IN AAAA 2001:dc3::35
;; Query time: 5 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Sat Nov 08 17:14:21 GMT 2025
;; MSG SIZE rcvd: 811
The actual answer to what we asked is in the "Answer Section" - which is fair enough, really. It tells us that there's a set of Root Servers numbered - okay, lettered - from a to m. You'll notice that in the answer, they're jumbled up - they're all equal, and if a were always listed first, it'd risk getting all the traffic.
However, since this just gives us domain names, and we need to use the DNS to look those up, this would be fairly useless - so instead of just that, it gives us "Additional" records which it thinks we will find useful. This includes all the address records, in this case, which is just as well. These are called "Glue Records", and are usually very helpful. And that, right there, was dramatic foreshadowing, that was.
Glue Records are needed because the only way to look up x.root-servers.net. A would be by knowing where root-servers.net. NS was, and we don't know that without knowing the addresses for the root servers to ask where to find net. NS.
You can't just ask the root servers to look everything up for you, though, because there are two kinds of DNS server.
Authoritative Servers actually hold the data for a particular zone - the Root Servers are the Authoritative Servers for ".". They will typically only provide answers for the zone they're authoritative for. (There might be a "primary" and several "secondaries" for a zone, but that's a detail that doesn't matter to us - they're all authoritative and you can't tell where the data is actually sourced from).
Recursive Servers (also known as Resolvers) don't hold authoritative data at all, they're just there to make our lives easier and faster. These will have that list above statically stored - remember that before the DNS, we had to share text files with all the servers' names in? Now we just share the root server list around. It's not as big.
However, Authoritative Servers do answer queries for subdomains (or "delegated zones"), at least by handing over the NS records of the delegated zone.
So when we type:
dig www.google.com. A +aaonly +norecurse @<ip address for root server>
... we get back the servers not for Google, but for "com.". We can then pick one of those, and ask again, and we'll get Google's nameservers (via a Glue Record in "com.", by the way), and finally, by asking them, we'll get our address.
(Asides: Why did I elide the IP address there, despite listing them all above? Because nobody wants every person reading this to hit the same root server. No really. Why did I include the + flags? To stop the root server operating in recursive mode - which it won't, but it might cause additional logging for those folks running them. Why am I querying for an IPv4 record over IPv6? Well, no reason, really.)
dig www.google.com +aaonly +norecurse A @2001:4860:4802:34::a
... will finally give us:
;; QUESTION SECTION:
;www.google.com. IN A
;; ANSWER SECTION:
www.google.com. 300 IN A 142.250.140.105
www.google.com. 300 IN A 142.250.140.147
www.google.com. 300 IN A 142.250.140.106
www.google.com. 300 IN A 142.250.140.99
www.google.com. 300 IN A 142.250.140.104
www.google.com. 300 IN A 142.250.140.103
;; Query time: 37 msec
And that's all great - except, welp, it takes 37ms for each query, and how many queries did we just make?
This is too slow.
Cache and Carry
Recursive nameservers do two things for us - they'll handle the multiple queries so we only need to make one, and they'll also cache the answers for us too.
The numbers (300 in the last response) tell us the "Time To Live", or "TTL", on the record. Google is telling me that I can cache this one for 300 seconds (5 minutes). Since all the records have this, I can drop the number of queries down to, well, none.
Our client machine can also cache, but typically a recursive resolver runs at your ISP, and will cache for all its customers.
This means the DNS can be really much faster. My ISP will return the result in (typically) 18-20ms, because it'll be fully cached. If it isn't, it'll take closer to 27-30ms.
These are so useful that they're part of the connection data you'll get as you connect. Your phone or laptop uses DHCP; your router will get the same via PPP options. Finding your local resolver means getting a faster DNS service - and thus a faster connection.
Cache is King
To make things even faster, you can (remarkably easily) run your own recursive nameserver at home, on a Raspberry Pi or similar. I run two, in fact, on "real" Linux machines, because I am perfectly normal and in no way stupidly nerdy. These will return me an answer (cached, of course) in about 1-2ms.
If only we could measure this over months and have graphs. (I do. Of course I do).
Here's my ISP's DNS server (or one of them), for the past 10 days:
And here's mine:
And for comparison here's Google's 8.8.8.8 public DNS resolvers:
I include this as definitive proof that my servers are faster than Google's, incidentally. Honestly, sometimes I'm such a genius I even impress myself.
Speed, Reliability, and Privacy
Almost every connection to any service will involve, as a first step, a DNS lookup. In fact, the only exception I can think of is the DNS root server queries themselves.
That means - quite obviously - speed is crucial. That 20 (or 2) millisecond DNS tax is paid for every single connection.
It means reliability is crucial. If the DNS fails, then of course the connection cannot continue.
And DNS can fail - of course it can - in numerous ways.
How can I fail thee? Let me count the ways
Bad Answers
It goes without saying - but I'll say it anyway of course - that if the data in the authoritative DNS servers is wrong, you'll get wrong answers, and things won't work.
If a lookup for www.google.com. gave me the IPv4 address of 127.0.0.1, it's going to be game over. This is the kind of failure that brought down AWS, incidentally - everything worked perfectly, but the wrong DNS records were put in, and thus the wrong ones came out.
Obviously "It's always DNS" applies here, but really, it was Terraform or CloudFormation or something.
A more common case is where the service is intentionally moving, and the caching of DNS records causes resolvers to provide the old address. (Fix: lower TTLs the day before a move!)
Bad Glue
Oh, wait, I foreshadowed this! I hope you were paying attention.
Glue Records are vital for locating nameservers when the nameservers' names are in the zone they're serving. But - and you'll like this pun, I promise - when glue records are missing or incorrect, things can come unstuck.
A classic case is where a nameserver moves address - this happens so rarely that the precise steps needed are sometimes forgotten, like updating not only the authoritative records but the glue as well.
Suddenly the entire zone drops offline - but only for those resolvers using the wrong glue address.
It's maddeningly difficult to diagnose such cases because any resolver with the NS records cached might never see the problem, only "new" traffic will fail, and even then perhaps only half of it.
DNS Server Offline
Obviously if there's no DNS server listening at the right IP address, you won't get a response. DNS operates (at least usually) over UDP, so packets get lost, and we have to retry. If nothing's there we'll retry a lot.
Even if the server is present, but the link is lossy, this can add a substantial amount of time to the lookup - and that will add up surprisingly quickly into a delay that an actual human will notice as being "a slow site".
"Lame" Delegation
Maybe there is a server that responds - but it tells us it's not authoritative for the zone the parent said it was. These are known as "lame" delegations, and might be down to bad answers in the parent, or bad glue.
And Privacy?
Privacy is a bigger problem than just the DNS, but DNS servers are a key part of the problem.
If every site you visit gets looked up in DNS, then the DNS server you're using gets every site you visit.
When that's your ISP, that's most likely fine. You have a contract with your ISP, and in most sane jurisdictions they'd have to tell you, at least, if they were gathering the data (and generally they'd have to ask your permission).
But if it's a rando WiFi in a café? Yeah, they should still ask, but honestly, who knows?
DoT and DoH
DoT, or DNS over TLS, allows you to directly use a particular DNS provider over a TLS encrypted connection. This is good for privacy, but comes with a cost - the provider isn't local to you, and will be slower.
DoH is much the same, but uses HTTPS rather than simply running the DNS protocol directly over TLS - it's useful for cases where other protocols may be blocked at the firewall, or where you want to minimize the likelihood that someone can tell it's DNS traffic rather than just another website (but they can tell anyway).
There are privacy-focused providers, such as Quad9, and there's also Google. As ever, encryption doesn't solve problems, it just moves them, and a DoT/DoH provider like Quad9 or Google could choose to log all your queries just like a nefarious café owner.
At home, you're likely paying a "tax" of around 5ms for using these by my measurements - so I'd recommend not to bother unless you genuinely cannot trust your ISP. But when out of the house, you probably want to, so they're worth considering for mobile devices and laptops you use on arbitrary WiFis.
I do not understand why ISPs don't offer DoT, DoH, and indeed full-blown VPN access for their customers. If you're an ISP reading this, this is a free product idea and I'd love you to do it.
DNSSEC
A parallel security issue with the DNS (in general, but particularly problematic in rando WiFi situations) is that the operator of a DNS service - whether or not it's using TLS - can simply fake the answers, redirecting you to a site under their control.
Assuming you're using TLS (usually via HTTPS) this isn't as big a problem as you'd think, surprisingly - but faked answers can generate a lot of confusion. And if a Certificate Authority gets duped into handing out a certificate, it becomes a nasty issue.
DNSSEC allows each individual record to be signed, which means that even when the records are cached, they're still trustworthy.
HTTPS
HTTPS is - of course - a key part of network security and privacy, but I'm not going into it in this article at all. This, right here, is the only mention of SNI, and ECH, and I'm not even expanding them.
Sorry. I'll write something else on these later!
It really is always the DNS, sort of.
The DNS is a crucial, critical, and central part of the Internet. While it's not quite a single point of failure, DNS problems have an outsized effect on the rest of the system.
I remember back in around 1997 or so - 17th July 1997, I looked it up - every root server except k dropped offline, causing mass disruption for several hours. Just 10 machines offline and the Internet - as a whole - "went down". These days there are a couple more root servers, and besides, each one of those IP addresses goes to several - perhaps hundreds - of actual servers. I don't think any root server has been offline since.
And even today, relatively isolated DNS problems - or even just problems with the data in the DNS - can result in much of AWS dropping offline for hours.
But ultimately, most of the "it's always DNS" problems - including that AWS outage - turn out to have root causes outside of DNS. The average DNS server looks at 99.999% uptime and considers it amateurish. Really, it's always ARP, or Terraform, or a digger. That mass root server outage in 1997? A US exchange point, MAE-East, had a bad configuration update to a router and went offline. The servers themselves were actually fine.
It's just that DNS will often be the first thing you'll notice go as a result of any outage at all - and the knock-on effects will be massive.


