NewJhez01

Posted on Jun 12

Regaining Privacy by Parsing DNS Requests Bit by Bit

#networking #privacy #showdev #softwareengineering

In today's world, telemetry and ads are everywhere. Personally, I'm sick and tired of it. I wanted my network to actually feel like my own. Then it struck me: I'm a software engineer with systems and network experience. Why not put that to use and build a DNS resolver from scratch? The protocol is well-documented for anyone willing to read RFC 1035. So I got started. Here's what I learned. You can find the project here.

For context, or if you've been snooping around my GitHub, you might have noticed I built my own HTTP server with no additional dependencies. So the natural conclusion would be: ah, this is just a copy-paste project. I thought so too, until I read the RFC and realised DNS works with bits and octets, not JSON and routes. That's when I knew this would take real effort. And that's exactly why I was excited. I once again chose Go for this since its concurrency is just excellent when handling async network traffic.

Getting Started

I got started, as I do with most projects nowadays, by setting up the Docker environment and a GitHub Actions pipeline so I could continuously produce quality tests and ensure my program would always build. I also decided to structure my project by feature instead of by layer. This is standard in Go but quite new for me. Instead of having layers such as HTTP, domain, data, and infrastructure, I organised by packages: dns for the resolver, cache for caching, and so on. This way I could move faster without breaking things, and my packages were far more modular.
After this was done, I first set up a TCP listener to stream packets, as I did for my HTTP server. This was when I learnt that DNS queries normally travel over UDP, with TCP reserved for cases such as responses too large for a single datagram.

A quick aside for anyone who hasn't worked with network protocols: TCP is a stream protocol. It guarantees ordered, reliable delivery, reassembling bytes in the correct sequence. UDP, on the other hand, sends independent datagrams with no ordering or delivery guarantees, which makes it much cheaper for small request/response exchanges like DNS. It was a different approach, but it actually made setting up the server less complex. No connection handshake, no stream management, just open a socket and handle each packet as it arrives:

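addr, _ := net.ResolveUDPAddr("udp", ":5555") // errors ignored here for brevity
conn, _ := net.ListenUDP("udp", addr)
buff := make([]byte, 512) // 512 bytes: the classic DNS-over-UDP message limit
n, client, _ := conn.ReadFromUDP(buff) // n bytes of one datagram plus the sender's address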

This had the effect that I needed fewer goroutines, since I only needed to handle the packets concurrently and not the stream as well.
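
A minimal sketch of that per-packet model, assuming one goroutine per datagram. The names serveUDP and echoOnce are hypothetical helpers for illustration, not the project's API, and the echo handler stands in for real DNS handling. The datagram is copied before it is handed off because the read buffer is reused by the next read:

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

// serveUDP reads datagrams until the socket is closed and answers each one
// from its own goroutine. Hypothetical helper, not the project's actual code.
func serveUDP(conn *net.UDPConn, handle func(req []byte) []byte) {
	buf := make([]byte, 512) // classic DNS-over-UDP message limit (RFC 1035)
	for {
		n, client, err := conn.ReadFromUDP(buf)
		if err != nil {
			return // socket closed (or unrecoverable read error)
		}
		// Copy the datagram: buf is overwritten by the next read.
		pkt := make([]byte, n)
		copy(pkt, buf[:n])
		go func() {
			if _, err := conn.WriteToUDP(handle(pkt), client); err != nil {
				log.Printf("write failed: %v", err)
			}
		}()
	}
}

// echoOnce starts serveUDP on a random loopback port with an echo handler,
// sends msg to it and returns the reply.
func echoOnce(msg string) (string, error) {
	conn, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
	if err != nil {
		return "", err
	}
	defer conn.Close()
	go serveUDP(conn, func(req []byte) []byte { return req })

	client, err := net.DialUDP("udp", nil, conn.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return "", err
	}
	defer client.Close()
	if err := client.SetDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return "", err
	}
	if _, err := client.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, 512)
	n, err := client.Read(reply)
	if err != nil {
		return "", err
	}
	return string(reply[:n]), nil
}

func main() {
	reply, err := echoOnce("ping")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(reply)
}
```

There is no accept loop and no per-connection state: the only shared resource is the socket itself, and UDPConn is safe to write to from multiple goroutines.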

Parsing the Protocol

Once the UDP server was set up I could start throwing mock DNS requests at it and parse them according to RFC 1035. I then realised I couldn't just treat the message as a sequence of whole bytes. Some fields, like the ID, are two octets (16 bits), which map cleanly onto two bytes. Where it got tricky was the flags field, a single 16-bit word packed with individual bits and small bitfields. QR is one bit to distinguish query from response. AA, TC, RD, and RA are single-bit booleans. OPCODE spans four bits, Z takes three, and RCODE another four. Extracting these meant shifting by the right amount and masking with hexadecimal values:

bits := binary.BigEndian.Uint16(b)
h.Qr = QR((bits >> 15) & 0x1) // shift right by 15, keep 1 bit
h.OpCode = OPCODE((bits >> 11) & 0xF) // shift right by 11, keep 4 bits
h.Aa = (bits>>10)&0x1 == 1

I kept mixing up whether to shift left or right, and whether the mask came before or after the shift. The mental model only clicked once I stopped thinking of the bytes as numbers and started seeing them as bit positions in a 16-bit window. This also meant I finally understood why 0x7 masks 3 bits but << 4 shifts them to position 4. The mask decides how many bits survive (0x7 is 111 in binary, so three), while the shift amount decides where they sit:

// The mask decides how many bits are kept, written here in hex and binary:
// 0x1 = 0001 (keeps 1 bit)
// 0x3 = 0011 (keeps 2 bits)
// 0x7 = 0111 (keeps 3 bits)  ← Z field
// 0xF = 1111 (keeps 4 bits)  ← OPCODE, RCODE
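
Putting the shifts and masks together, a sketch of decoding the whole flags word could look like the following. The Flags struct and decodeFlags are hypothetical names for this example. The bit positions come from RFC 1035, section 4.1.1:

```go
package main

import "fmt"

// Flags is a hypothetical struct mirroring the 16-bit flags word of the
// DNS header (RFC 1035, section 4.1.1).
type Flags struct {
	QR             uint8 // 0 = query, 1 = response
	Opcode         uint8 // 4 bits
	AA, TC, RD, RA bool
	Z              uint8 // 3 bits
	Rcode          uint8 // 4 bits
}

// decodeFlags moves each field down to bit 0 and masks off its width.
func decodeFlags(bits uint16) Flags {
	return Flags{
		QR:     uint8((bits >> 15) & 0x1),
		Opcode: uint8((bits >> 11) & 0xF),
		AA:     (bits>>10)&0x1 == 1,
		TC:     (bits>>9)&0x1 == 1,
		RD:     (bits>>8)&0x1 == 1,
		RA:     (bits>>7)&0x1 == 1,
		Z:      uint8((bits >> 4) & 0x7),
		Rcode:  uint8(bits & 0xF),
	}
}

func main() {
	// 0x8183 = 1000 0001 1000 0011: QR=1, RD=1, RA=1, RCODE=3 (Name Error).
	fmt.Printf("%+v\n", decodeFlags(0x8183))
}
```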

Then came the counters (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT), each a 16-bit value. They seemed straightforward until I realised my [2]byte array was being initialised as {1, 0} when I wrote SixteenBit{1} (my custom type, a byte array of length 2), because Go fills the first index and zeroes the rest. Read as big-endian, that gave me 256 instead of 1. I had to be explicit:

d.Header.QdCount = SixteenBit{0x00, 0x01}
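
The pitfall is easy to reproduce in isolation. This sketch assumes SixteenBit is a plain [2]byte, as described above. The Uint16 method and the FromUint16 constructor are hypothetical additions for the example:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// SixteenBit is assumed to be a two-byte array holding a big-endian value,
// as described in the text.
type SixteenBit [2]byte

// Uint16 reads the two bytes as a big-endian number.
func (s SixteenBit) Uint16() uint16 { return binary.BigEndian.Uint16(s[:]) }

// FromUint16 is a hypothetical constructor that removes the guesswork
// about which index holds the high byte.
func FromUint16(v uint16) SixteenBit {
	var s SixteenBit
	binary.BigEndian.PutUint16(s[:], v)
	return s
}

func main() {
	implicit := SixteenBit{1}          // fills index 0 only: {0x01, 0x00}
	explicit := SixteenBit{0x00, 0x01} // high byte first: the value 1
	fmt.Println(implicit.Uint16(), explicit.Uint16(), FromUint16(1) == explicit)
}
```

A constructor like this keeps the byte order in one place instead of in every literal.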

And QNAME was its own beast. On the wire it is not a string but a sequence of length-prefixed labels: example.com becomes 0x07 example 0x03 com 0x00. Parsing it meant reading one byte for the length, slicing that many bytes for the label, and repeating until the null terminator, resulting in a state machine with a loop and switch case:

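// One QNAME step; the surrounding loop keeps calling it while the state stays QNAME.
if b[0] == 0 {
    return 1, QTYPE // null terminator: name complete, move to the next state
}
if q.Qname != "" {
    q.Qname += "." // separate labels so the name can be split on dots later
}
q.Qname += string(b[1 : 1+int(b[0])])
return 1 + int(b[0]), QNAME // stay in QNAME for the next label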
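
For comparison, the same label walk can be written as one self-contained loop. This is a sketch rather than the project's state machine: decodeQName is a hypothetical name, it handles uncompressed names only, and the bounds checks are additions for the example:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// decodeQName reads uncompressed length-prefixed labels starting at b[0].
// It returns the dotted name and the number of bytes consumed, including
// the terminating zero byte.
func decodeQName(b []byte) (string, int, error) {
	var labels []string
	pos := 0
	for {
		if pos >= len(b) {
			return "", 0, errors.New("qname: truncated before terminator")
		}
		length := int(b[pos])
		pos++
		if length == 0 {
			return strings.Join(labels, "."), pos, nil
		}
		if length > 63 {
			// Labels are at most 63 bytes; larger values are pointers or invalid.
			return "", 0, errors.New("qname: not a plain label")
		}
		if pos+length > len(b) {
			return "", 0, errors.New("qname: label runs past end of buffer")
		}
		labels = append(labels, string(b[pos:pos+length]))
		pos += length
	}
}

func main() {
	// QNAME for example.com followed by QTYPE=A and QCLASS=IN.
	question := []byte("\x07example\x03com\x00\x00\x01\x00\x01")
	name, consumed, err := decodeQName(question)
	fmt.Println(name, consumed, err)
}
```

The consumed count is what tells the caller where QTYPE begins.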

Building Responses

For blocked domains I needed to return NXDOMAIN. This meant building my parser in reverse while flipping QR from query to response, setting RCODE to 3 (Name Error), and echoing the original question section back. The flags had to be packed bit by bit into a single uint16. Each field occupies a specific position: QR at bit 15, OPCODE at 11-14, AA at 10, and so on. I used bitwise OR to assemble them, shifting each value to its correct position:

var misc uint16
misc |= (uint16(d.Header.Qr) & 0x1) << 15
misc |= (uint16(d.Header.OpCode) & 0xF) << 11
if d.Header.Aa { misc |= 1 << 10 }
if d.Header.Tc { misc |= 1 << 9 }
if d.Header.Rd { misc |= 1 << 8 }
if d.Header.Ra { misc |= 1 << 7 }
misc |= (uint16(d.Header.Z) & 0x7) << 4
misc |= uint16(d.Header.Rcode) & 0xF

Then write the full header in big-endian:

b := make([]byte, 12)
copy(b[0:2], d.Header.Id[:])
binary.BigEndian.PutUint16(b[2:4], misc)
copy(b[4:6], d.Header.QdCount[:])
// ... etc
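
As a compact illustration of the whole header, here is a sketch that derives an NXDOMAIN header directly from the raw query bytes instead of from a parsed struct. The function name nxdomainHeader is hypothetical, and keeping OPCODE and RD while clearing every other flag is a choice made for this example:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const rcodeNameError = 3 // RCODE 3, "Name Error" (NXDOMAIN)

// nxdomainHeader builds a 12-byte response header from a query's header.
func nxdomainHeader(query []byte) ([]byte, error) {
	if len(query) < 12 {
		return nil, errors.New("dns: message shorter than the 12-byte header")
	}
	in := binary.BigEndian.Uint16(query[2:4])

	var flags uint16
	flags |= 1 << 15          // QR = 1: this is a response
	flags |= in & (0xF << 11) // keep the query's OPCODE
	flags |= in & (1 << 8)    // keep the query's RD bit
	flags |= rcodeNameError   // RCODE lives in the low 4 bits

	h := make([]byte, 12)
	copy(h[0:2], query[0:2]) // same ID so the client can match the reply
	binary.BigEndian.PutUint16(h[2:4], flags)
	copy(h[4:6], query[4:6]) // QDCOUNT: the question is echoed back
	// ANCOUNT, NSCOUNT and ARCOUNT stay zero: no records follow.
	return h, nil
}

func main() {
	// ID 0xBEEF, flags 0x0100 (RD set), QDCOUNT 1.
	query := []byte{0xBE, 0xEF, 0x01, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0}
	h, err := nxdomainHeader(query)
	fmt.Printf("% x %v\n", h, err)
}
```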

But the header alone isn't enough: a valid response must echo the original question section. QNAME, as previously noted, is length-prefixed labels. Since I stored the parsed domain as a simple string, reconstructing it meant splitting on dots, prefixing each label with its length, appending the null terminator, then tacking on QTYPE and QCLASS:

parts := strings.Split(d.Question.Qname, ".")
for _, v := range parts {
    b = append(b, byte(len(v)))
    b = append(b, v...)
}
b = append(b, 0x00)  // root terminator
b = append(b, d.Question.Qtype[0], d.Question.Qtype[1])
b = append(b, d.Question.Qclass[0], d.Question.Qclass[1])
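
A slightly more defensive sketch of the name encoding, assuming the input may carry a trailing dot or an empty label. The encodeQName helper is hypothetical, and the 63-byte label limit is the one RFC 1035 specifies:

```go
package main

import (
	"fmt"
	"strings"
)

// encodeQName turns a dotted name into length-prefixed labels ending in
// the zero-length root label.
func encodeQName(name string) ([]byte, error) {
	name = strings.TrimSuffix(name, ".") // accept "example.com."
	var out []byte
	if name != "" {
		for _, label := range strings.Split(name, ".") {
			if len(label) == 0 || len(label) > 63 {
				return nil, fmt.Errorf("invalid label %q in %q", label, name)
			}
			out = append(out, byte(len(label)))
			out = append(out, label...)
		}
	}
	return append(out, 0x00), nil // root terminator
}

func main() {
	b, err := encodeQName("example.com")
	fmt.Printf("%q %v\n", b, err)
}
```

Rejecting empty labels matters because a zero length byte in the middle of a name would be read as the terminator.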

I got the root terminator wrong twice. Without it, the response is malformed and the client rejects it silently. Seeing dig return SERVFAIL with no explanation taught me to trust the hex dump over the error message.

Parsing the Answer

The really tricky part, however, was reverse-parsing the answer that came back from Cloudflare. After forwarding my query to 1.1.1.1:53, I had to extract the IP from the response to cache it.
Since I knew the question length myself, I could calculate exactly where the answer section began:

answerOffset := 12 + questionLen  // header + question

That was the only easy part. The answer NAME uses the same length-prefixed labels as QNAME, but with a twist: compression pointers. A pointer is a 2-byte value starting with 11 in the top bits that says "the rest of this name lives at offset X." This avoids repeating domain names across the message. So 0xC0 0x0C means "go read the name from offset 12" — right where the question's QNAME sits.
I had to solve this with two separate walks over the bytes. First, a linear count to know how many bytes the answer NAME occupied so I could find where TYPE starts:

func CountNameLen(b []byte, consumed int) int {
    if b[0]&0xC0 == 0xC0 {
        return consumed + 2  // pointer, done counting
    }
    label := int(b[0])
    if label == 0 {
        return consumed + 1  // null terminator
    }
    return CountNameLen(b[label+1:], consumed+1+label)
}

Second, a recursive parse to actually build the domain string, following pointers wherever they lead:

func (a *Answer) ParseName(b []byte, cursor int) {
    if b[cursor] == 0 {
        return
    }
    if b[cursor]&0xC0 == 0xC0 {
        restBits := binary.BigEndian.Uint16(b[cursor:])
        a.ParseName(b, int(restBits&0x3FFF))  // follow pointer
        return
    }
    length := int(b[cursor])
    a.Name = append(a.Name, string(b[cursor+1:cursor+1+length]))
    a.ParseName(b, cursor+1+length)
}
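
One caveat with following pointers recursively is that a malformed or hostile message can contain a pointer that leads back to itself. An iterative sketch with a jump limit guards against that. The decodeName function and the limit of 16 jumps are assumptions for this example, not part of the project or the RFC:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

const maxPointerJumps = 16 // arbitrary cap chosen for this sketch

// decodeName reads a possibly compressed name starting at offset in msg.
func decodeName(msg []byte, offset int) (string, error) {
	var labels []string
	jumps := 0
	for {
		if offset >= len(msg) {
			return "", errors.New("name: offset past end of message")
		}
		c := msg[offset]
		switch {
		case c == 0: // root label: the name is complete
			return strings.Join(labels, "."), nil
		case c&0xC0 == 0xC0: // compression pointer: jump to a 14-bit offset
			if offset+1 >= len(msg) {
				return "", errors.New("name: truncated pointer")
			}
			jumps++
			if jumps > maxPointerJumps {
				return "", errors.New("name: too many compression pointers")
			}
			offset = int(binary.BigEndian.Uint16(msg[offset:]) & 0x3FFF)
		case c&0xC0 != 0: // the 01 and 10 prefixes are not plain labels
			return "", errors.New("name: unsupported label type")
		default: // ordinary label of c bytes
			end := offset + 1 + int(c)
			if end > len(msg) {
				return "", errors.New("name: label runs past end of message")
			}
			labels = append(labels, string(msg[offset+1:end]))
			offset = end
		}
	}
}

func main() {
	// 12 zero bytes stand in for the header; example.com sits at offset 12,
	// and "www" plus a pointer back to offset 12 starts at offset 25.
	msg := append(make([]byte, 12), "\x07example\x03com\x00\x03www\xC0\x0C"...)
	name, err := decodeName(msg, 25)
	fmt.Println(name, err)
}
```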

The recursive call was the breakthrough. I sat back and laughed once I realised the protocol wasn't arbitrary; it was recursive by design. Three cases, one function, descending until you hit zero.
After the name, the rest was predictable. TYPE and CLASS are 2 bytes each, TTL is 4 bytes, RDLENGTH is 2 bytes, and RDATA is whatever length RDLENGTH specifies:

a.Type = SixteenBit{b[0], b[1]}
a.Class = SixteenBit{b[2], b[3]}
a.Ttl = binary.BigEndian.Uint32(b[4:8])
rdlength := int(binary.BigEndian.Uint16(b[8:10])) // int avoids uint16 overflow in 10+rdlength
a.Rdata = b[10 : 10+rdlength] // for an A record, RDATA is the 4-byte IPv4 address
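
The same fixed-field layout can be sketched with bounds checks and a type check before treating RDATA as an address, since an answer may also contain other record types such as CNAME. Record, parseRecordBody and IPv4 are hypothetical names for this example:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// Record holds the fields that follow NAME in a resource record.
type Record struct {
	Type, Class uint16
	TTL         uint32
	RData       []byte
}

// parseRecordBody expects b to start at TYPE and returns the record plus
// the number of bytes it occupies.
func parseRecordBody(b []byte) (Record, int, error) {
	const fixed = 10 // TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2)
	if len(b) < fixed {
		return Record{}, 0, errors.New("record: truncated fixed fields")
	}
	rdlen := int(binary.BigEndian.Uint16(b[8:10]))
	if len(b) < fixed+rdlen {
		return Record{}, 0, errors.New("record: RDATA runs past end of message")
	}
	return Record{
		Type:  binary.BigEndian.Uint16(b[0:2]),
		Class: binary.BigEndian.Uint16(b[2:4]),
		TTL:   binary.BigEndian.Uint32(b[4:8]),
		RData: b[fixed : fixed+rdlen],
	}, fixed + rdlen, nil
}

// IPv4 returns the address only for an A record (TYPE 1) in class IN (1).
func (r Record) IPv4() (net.IP, bool) {
	if r.Type != 1 || r.Class != 1 || len(r.RData) != 4 {
		return nil, false
	}
	return net.IP(r.RData), true
}

func main() {
	// TYPE=A, CLASS=IN, TTL=300, RDLENGTH=4, RDATA=93.184.216.34
	body := []byte{0, 1, 0, 1, 0, 0, 0x01, 0x2C, 0, 4, 93, 184, 216, 34}
	rec, n, err := parseRecordBody(body)
	ip, ok := rec.IPv4()
	fmt.Println(rec.TTL, n, ip, ok, err)
}
```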

Once I had the IP and TTL, I cached them and built the response the same way as NXDOMAIN, just with RCODE = 0 (NoError) and the answer section tacked on. The next query for the same domain hit the cache and skipped Cloudflare entirely. Thank god for unit and integration tests, or I would still be puzzling over why my bit count was off when parsing the answer.

Architecture: Interfaces and Dependencies

With parsing and responses working, I set up Redis for caching. I cached valid answers from the upstream resolver with a TTL so repeated requests wouldn't hit the database or the upstream resolver again. I also cached block decisions: if a domain was on the blocklist, I'd store that too for quick rejection.
The cache value needed to hold both the answer and whether the domain was blocked, so I used JSON to serialize a small struct:

type Value struct {
    Ip        string
    IsBlocked bool
}
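
A sketch of the round trip through JSON, assuming the struct is marshalled without field tags so the keys match the Go field names. The encodeValue and decodeValue helpers are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Value matches the struct shown above.
type Value struct {
	Ip        string
	IsBlocked bool
}

// encodeValue serialises v into the string stored in the cache.
func encodeValue(v Value) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeValue parses a cached string back into a Value.
func decodeValue(s string) (Value, error) {
	var v Value
	err := json.Unmarshal([]byte(s), &v)
	return v, err
}

func main() {
	s, err := encodeValue(Value{Ip: "93.184.216.34"})
	fmt.Println(s, err)
	v, err := decodeValue(s)
	fmt.Printf("%+v %v\n", v, err)
}
```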

I hid the actual Redis implementation behind an interface so the rest of my program had no dependency on Redis, applying the dependency inversion principle and keeping everything modular and easy to unit test. The same was done for SQLite:

type Cache interface {
    GetDomainNameFromCache(ctx context.Context, key string) (Value, error)
    SetDomainName(ctx context.Context, key string, v Value, ttl time.Duration) error
}
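
To illustrate what the interface buys in tests, here is a sketch of an in-memory implementation that satisfies it without any Redis server. The memoryCache type, the ErrCacheMiss sentinel, and the store and lookup helpers are all hypothetical names for this example:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

type Value struct {
	Ip        string
	IsBlocked bool
}

type Cache interface {
	GetDomainNameFromCache(ctx context.Context, key string) (Value, error)
	SetDomainName(ctx context.Context, key string, v Value, ttl time.Duration) error
}

// ErrCacheMiss is a hypothetical sentinel for absent or expired keys.
var ErrCacheMiss = errors.New("cache: miss")

type entry struct {
	value   Value
	expires time.Time
}

// memoryCache is a map guarded by a mutex, with per-entry expiry.
type memoryCache struct {
	mu      sync.Mutex
	entries map[string]entry
}

func newMemoryCache() *memoryCache {
	return &memoryCache{entries: make(map[string]entry)}
}

func (c *memoryCache) GetDomainNameFromCache(_ context.Context, key string) (Value, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok {
		return Value{}, ErrCacheMiss
	}
	if !time.Now().Before(e.expires) { // expired: drop it and report a miss
		delete(c.entries, key)
		return Value{}, ErrCacheMiss
	}
	return e.value, nil
}

func (c *memoryCache) SetDomainName(_ context.Context, key string, v Value, ttl time.Duration) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = entry{value: v, expires: time.Now().Add(ttl)}
	return nil
}

// Compile-time check that memoryCache satisfies Cache.
var _ Cache = (*memoryCache)(nil)

// store and lookup are consumers that depend only on the interface.
func store(c Cache, domain string, v Value, seconds int) error {
	return c.SetDomainName(context.Background(), domain, v, time.Duration(seconds)*time.Second)
}

func lookup(c Cache, domain string) (Value, bool) {
	v, err := c.GetDomainNameFromCache(context.Background(), domain)
	return v, err == nil
}

func main() {
	c := newMemoryCache()
	if err := store(c, "doubleclick.net", Value{IsBlocked: true}, 60); err != nil {
		fmt.Println("store failed:", err)
		return
	}
	fmt.Println(lookup(c, "doubleclick.net"))
	fmt.Println(lookup(c, "example.com"))
}
```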

The concrete Redis implementation lives in internal/cache/redis/, and main.go wires it all together. Swap Redis for Memcached tomorrow and only main.go changes.
For upstream resolution, I forward allowed queries to Cloudflare at 1.1.1.1:53 with a timeout:

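func forwardToUpstream(query []byte) ([]byte, error) {
    conn, err := net.Dial("udp", "1.1.1.1:53")
    if err != nil {
        return nil, err
    }
    defer conn.Close()
    conn.SetDeadline(time.Now().Add(3 * time.Second)) // one deadline for write and read
    if _, err := conn.Write(query); err != nil {
        return nil, err
    }
    resp := make([]byte, 512)
    n, err := conn.Read(resp)
    if err != nil {
        return nil, err // includes the timeout when upstream never answers
    }
    return resp[:n], nil
}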

The last piece was saving the blocked domains in the SQLite database. I decided to run a migration on startup and used WITHOUT ROWID: since the domain name is the natural primary key, there was no need for a separate rowid.

func (r *SqliteClient) Migrate(ctx context.Context) error {
    _, err := r.db.ExecContext(ctx, `
        CREATE TABLE IF NOT EXISTS blocked_domains (
            domain TEXT PRIMARY KEY
        ) WITHOUT ROWID
    `)
    return err
}

Testing

Writing unit tests for the parsers gave me some confidence, but I wanted more. I wanted to know the whole system worked end-to-end — UDP server, SQLite blocklist, Redis cache, and Cloudflare upstream all talking to each other. So I wrote an integration test.

That decision forced me to refactor my server. The original Serve function just spun up a goroutine and blocked forever on a signal channel. There was no way to stop it cleanly from a test. I refactored it to use Start and Stop methods with context.Context and sync.WaitGroup for graceful shutdown:

type Server struct {
    conn    *net.UDPConn
    rClient cache.RedisClient
    s       blocklist.SqliteClient
    wg      sync.WaitGroup
    ctx     context.Context
    cancel  context.CancelFunc
}

func New(redisClient cache.RedisClient, b blocklist.SqliteClient) *Server {
    ctx, cancel := context.WithCancel(context.Background())
    return &Server{
        rClient: redisClient,
        s:       b,
        ctx:     ctx,
        cancel:  cancel,
    }
}

func (srv *Server) Start() error {
    addr, err := net.ResolveUDPAddr("udp", os.Getenv("UDP_PORT"))
    if err != nil {
        return err
    }
    conn, err := net.ListenUDP("udp", addr)
    if err != nil {
        return err
    }
    srv.conn = conn
    srv.wg.Add(1)
    go srv.listen()
    return nil
}

func (srv *Server) Stop() {
    srv.cancel()
    if srv.conn != nil {
        srv.conn.Close()
    }
    srv.wg.Wait()
}

The WaitGroup ensures Stop blocks until the goroutine actually exits. cancel() signals intent, conn.Close() breaks the blocking ReadFromUDP, and the listener checks ctx.Err() to distinguish intentional shutdown from real errors:

func (srv *Server) listen() {
    defer srv.wg.Done()
    buff := make([]byte, 512)
    for {
        select {
        case <-srv.ctx.Done():
            return
        default:
        }
        n, addr, err := srv.conn.ReadFromUDP(buff)
        if err != nil {
            if srv.ctx.Err() != nil {
                return // graceful shutdown
            }
            log.Printf("failed to read from udp: %s", err)
            continue
        }
        resp, err := dns.Resolve(buff[:n], &srv.rClient, &srv.s)
        if err != nil {
            log.Printf("failed to resolve: %s", err)
            continue
        }
        if _, err := srv.conn.WriteToUDP(resp, addr); err != nil {
            log.Printf("failed to write response: %s", err)
        }
    }
}

With that in place, the integration test spins up the full stack (in-memory SQLite, Redis, and the UDP server on a random port), queries it with net.Resolver, and asserts both blocked and non-blocked behaviour:

func TestResolver(t *testing.T) {
    // SQLite in memory for isolation
    db, err := blocklist.CreateNewDbConn(":memory:")
    if err != nil {
        t.Fatalf("failed to create db: %v", err)
    }
    ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
    defer cancel()
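    if err := db.Migrate(ctx); err != nil {
        t.Fatalf("failed to migrate: %v", err)
    }
    if _, err := db.Db.Exec("INSERT INTO blocked_domains (domain) VALUES ('doubleclick.net')"); err != nil {
        t.Fatalf("failed to seed blocklist: %v", err)
    }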

    // Redis
    redisClient := redis.NewClient(&redis.Options{
        Addr: os.Getenv("REDIS_URL"),
    })

    // Server on random port
    os.Setenv("UDP_PORT", "127.0.0.1:0")
    srv := server.New(*cache.CreateNewRedisClient(redisClient), *db)
    if err := srv.Start(); err != nil {
        t.Fatalf("failed to start server: %v", err)
    }
    defer srv.Stop()

    port := srv.Conn.LocalAddr().(*net.UDPAddr).Port

    r := &net.Resolver{
        PreferGo: true,
        Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
            return net.Dial("udp", fmt.Sprintf("127.0.0.1:%d", port))
        },
    }

    ctx, cancel = context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    // Non-blocked domain resolves to real IP
    ips, err := r.LookupHost(ctx, "google.com")
    if err != nil {
        t.Fatalf("lookup google.com failed: %v", err)
    }
    if len(ips) == 0 {
        t.Fatal("expected IPs for google.com, got none")
    }
    t.Logf("google.com -> %v", ips)

    // Blocked domain returns NXDOMAIN
    _, err = r.LookupHost(ctx, "doubleclick.net")
    if err == nil {
        t.Fatal("expected error for blocked domain, got nil")
    }
    var dnsErr *net.DNSError
    if !errors.As(err, &dnsErr) || !dnsErr.IsNotFound {
        t.Fatalf("expected NXDOMAIN for doubleclick.net, got: %v", err)
    }
    t.Logf("doubleclick.net -> blocked (NXDOMAIN)")
}

The unit tests caught parsing bugs — off-by-one errors, wrong bit shifts, missing null terminators. The integration test caught the real stuff: port conflicts, Redis connection failures, the server not shutting down between tests, and my Z-field validation rejecting legitimate EDNS0 responses from Cloudflare. Thank god for both or I would still be puzzling over why dig returned SERVFAIL with no explanation.

Closing Notes

Writing unit tests for all the parsers gave me some confidence, but I was still pleasantly surprised when everything worked flawlessly on the Pi. I am really proud of building the custom parsers straight from the spec, learning about bitwise operators, and getting it all to run fast enough that I don't notice any hit on my home network. I took back my privacy and increased my network knowledge.

The code is on GitHub if you want to set it up for yourself or leave any feedback. I would love to hear from you. If you would like to follow me on my journey deepening my understanding of systems and networks, stay tuned for my next project, which will likely be something in the direction of WebSockets or distributed KV stores.

Top comments (2)

VoltageGPU • Jun 16

Interesting take on DNS privacy — I've been looking into similar issues with side-channel leaks in GPU-based workloads too. At VoltageGPU, we're focused on making sure computation stays isolated at the hardware level, which has some parallels with what you're doing at the network layer.

NewJhez01 • Jun 23

Awesome that sounds super interesting would love to hear more if you want to share :)