
Network Part 4 - Where to Split, Why to Read?

Published: April 29, 2026

October 4, 2021. Facebook, Instagram, and WhatsApp went completely dark for roughly six hours — all at once. The servers were fine. No bad deploy. A single command run during routine maintenance withdrew every one of Facebook's BGP routes. The internet forgot how to reach Facebook's data centers. Traffic had nowhere to go. Facebook ceased to exist on the internet.

The servers were running. The load balancers were healthy. Everything was fine. Requests just couldn't get in. That's what happens when traffic distribution breaks at the routing layer. No matter how well-built the system behind the load balancer is — if requests can't reach it, none of it matters.

 

Not all load balancers work the same way. Some look only at the outside of a packet and route it fast. Others open the packet, read what's inside, and decide based on the contents. In Part 1, the trade-off was clear: L4 is fast because it stays ignorant, L7 is precise because it pays to know. Load balancers face the same choice. Which layer do you split traffic at?

 

DNS Round Robin — Blind by Design

The most primitive form of load balancing starts at DNS. Register multiple server IPs under one domain, and hand out a different IP in rotation for each incoming request. That's DNS round-robin.

ref. Cloudflare Learning: What is round-robin DNS?
ref. Cloudflare Learning: What is DNS load balancing?
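Mechanically, the whole thing fits in a few lines. A minimal sketch of the server-side rotation, with hypothetical IPs matching the diagram below:

```python
import itertools

# Hypothetical A records registered under one domain
SERVER_IPS = ["192.168.0.1", "192.168.0.2", "192.168.0.3"]
rotation = itertools.cycle(SERVER_IPS)

def answer_query(domain: str) -> str:
    """Hand out the next IP in rotation. No server state is consulted."""
    return next(rotation)

print([answer_query("example.com") for _ in range(4)])
# ['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.1']
```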

                ┌───────────────┐
                │     Client    │
                └───────────────┘
                        ↓
              "What's example.com?"
                        ↓
    ┌──────────────────────────────────────┐
    │               DNS Server             │
    │  (Returns a different IP each time)  │
    └──────────────────────────────────────┘
        ┌───────────────┼───────────────┐
  [1st request]   [2nd request]    [3rd request]
       ↙                ↓                ↘
 ┌────────────┐   ┌────────────┐   ┌────────────┐
 │  Server A  │   │  Server B  │   │  Server C  │
 │192.168.0.1 │   │192.168.0.2 │   │192.168.0.3 │
 └────────────┘   └────────────┘   └────────────┘

[Structural limits]

✗ Blind to server state
  → DNS keeps returning Server A even when it's overloaded
✗ Can't detect failures
  → DNS keeps responding with Server B's IP even after it goes down
✗ TTL caching
  → once a client receives an IP, it keeps hitting that server until TTL expires

DNS round-robin looks balanced in theory.

Imagine a theme park with three parking lots — A, B, and C. The navigation app at the entrance sends cars in rotation: first to A, next to B, then C. Arithmetically balanced.

But the navigation app doesn't ask again every time. It trusts the answer it got for a fixed window. "This information is valid for 10 minutes" — timer starts, and nothing gets re-checked. That's TTL (Time-To-Live): an expiration date on information.

This is where the breakdown happens. Picture a convoy of tourist buses arriving back-to-back. The first bus gets "go to Lot A." Every bus behind it copies that answer without checking — their devices already have it cached. The server is ready to send the next convoy to Lots B and C. But the buses aren't asking anymore. Lot A is jammed. Lots B and C sit empty.
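The convoy takes only a few lines to simulate. This is a sketch, not a real resolver: the TTL value is arbitrary, and a single cache dict stands in for every device in the convoy.

```python
import itertools

SERVER_IPS = ["192.168.0.1", "192.168.0.2", "192.168.0.3"]
rotation = itertools.cycle(SERVER_IPS)

TTL = 600      # the answer's expiration date, in seconds
cache = {}     # client-side cache: domain -> (ip, expires_at)

def resolve(domain: str, now: float) -> str:
    ip, expires_at = cache.get(domain, (None, 0.0))
    if now < expires_at:
        return ip               # cached: the DNS server never hears the question
    ip = next(rotation)         # cache miss: the rotation finally advances
    cache[domain] = (ip, now + TTL)
    return ip

# 100 requests arrive within one TTL window: every one lands on Server A
hits = [resolve("example.com", now=t) for t in range(100)]
print(set(hits))                # {'192.168.0.1'}
```

Arithmetically the rotation is still fair. The cache just never lets it run.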

Economist George Akerlof described this structure in his 1970 paper "The Market for Lemons" as Information Asymmetry. In the used car market, sellers know the defects; buyers don't. That gap alone distorts the entire market.

ref. Information Asymmetry

DNS round-robin works the same way. The DNS server knows Server A is overloaded. The client won't find out until TTL expires. The distribution gets skewed — not because caching is broken, but because of a structural disconnect between the party that has the information and the party that needs it.

DNS round-robin looks like load balancing. In practice, it's blind rotation.

 

L4 Load Balancer — Fast by Choice

The L4 load balancer follows the same philosophy introduced in Part 1. It doesn't open the packet. It reads only the IP addresses and port numbers on the envelope and decides where to send the connection from there.

[Transport Layer]

            ┌───────────────────┐
            │   Client Request  │
            └───────────────────┘      
                      ↓
      ┌─────────────────────────────────┐
      │         L4 Load Balancer        │
      │                                 │
      │         ✓ IP address            │
      │         ✓ Port number           │
      │        ✗ Packet content         │
      └─────────────────────────────────┘    
        ↙             ↓             ↘
  ┌───────────┐ ┌───────────┐ ┌───────────┐
  │ Server A  │ │ Server B  │ │ Server C  │
  └───────────┘ └───────────┘ └───────────┘      
  ( Based on IP hash or least connections )        

No content inspection means fast decisions. Millions of concurrent connections, handled. It fits environments where large numbers of clients open simple TCP connections simultaneously — game servers, for example.
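Both strategies named under the diagram reduce to a few lines. A sketch with a hypothetical backend pool (real L4 balancers make this decision in the kernel or in hardware, but the logic has the same shape):

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical pool

def pick_by_ip_hash(client_ip: str, client_port: int) -> str:
    """Hash the connection metadata. Payload bytes are never read."""
    key = f"{client_ip}:{client_port}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return BACKENDS[digest % len(BACKENDS)]

def pick_by_least_connections(active_conns: dict[str, int]) -> str:
    """Send the new connection to whoever holds the fewest open ones."""
    return min(active_conns, key=active_conns.get)

print(pick_by_ip_hash("203.0.113.7", 51623))
print(pick_by_least_connections({"10.0.0.1": 42, "10.0.0.2": 7, "10.0.0.3": 19}))
```

Neither function touches a single byte of the payload. That's the entire speed budget.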

L4 is a strategy that accepts the information gap. It makes routing decisions without knowing what's inside the packet. Where DNS round-robin failed because it lacked information, L4 turns that same ignorance into a deliberate choice. DNS misdistributes because it doesn't know. L4 trades knowing for speed.

The limits follow from that choice. No content visibility means no URL-based routing. You can't send /api/payments to the payments cluster and /api/products to the product cluster. You can't read cookies, so cookie-based session persistence is off the table.

 

L7 Load Balancer — Informed by Design

The L7 load balancer reads the packet. HTTP headers, URL paths, cookies, request body. It opens the envelope, reads the letter, and routes it to whoever handles that specific content.

[Application Layer]

              ┌───────────────────┐
              │   Client Request  │
              └───────────────────┘
                        ↓
            ┌────────────────────────┐
            │    L7 Load Balancer    │
            │                        │
            │  ✓ IP address / port   │
            │  ✓ HTTP method / URL   │
            │  ✓ Host header         │
            │  ✓ Cookies / body      │
            └────────────────────────┘
          ↙             ↓             ↘
    ┌───────────┐  ┌──────────┐   ┌──────────┐
    │  Payment  │  │  Product │   │   User   │
    │  Server   │  │  Server  │   │  Server  │
    └───────────┘  └──────────┘   └──────────┘
               Routing based on URL path

Reading the URL means /api/payments goes to the payments server and /api/products goes to the product server. Reading cookies enables session persistence — if a user's cart data lives only on Server A, L7 reads the user ID from the cookie and keeps sending that user back to Server A.
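Both decisions fit in one function. A sketch — the route table and cookie name are made up for illustration, and a real L7 proxy expresses the same rules in its config:

```python
# Hypothetical route table: URL prefix -> server cluster
ROUTES = [
    ("/api/payments", "payment-cluster"),
    ("/api/products", "product-cluster"),
]

def pick_backend(path: str, cookies: dict[str, str]) -> str:
    # Session persistence: the cookie pins the user to the server
    # that already holds their cart
    if "server_id" in cookies:
        return cookies["server_id"]
    # Content-based routing: the URL path picks the cluster
    for prefix, cluster in ROUTES:
        if path.startswith(prefix):
            return cluster
    return "default-cluster"

print(pick_backend("/api/payments/123", {}))                 # payment-cluster
print(pick_backend("/api/cart", {"server_id": "server-a"}))  # server-a
```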

L7 is a strategy that pays to close the information gap. Transaction Cost Theory from Part 2 applies here too. Acquiring information has a cost. Parsing headers, inspecting URLs, reading cookies — all of it is the price of knowing. In exchange for paying that cost, L7 can make decisions L4 simply cannot.

The trade-off is structural. Every request gets parsed and interpreted. That overhead is categorically higher than L4. As traffic scales, the cost compounds.

 

L4 vs L7 — Where the Bottleneck Is

|  | L4 Load Balancer | L7 Load Balancer |
| --- | --- | --- |
| Sees | IP address, port number | HTTP headers, URL, cookies, request body |
| Speed | Fast | Slower |
| Routes by | Connection count, IP hash | URL path, cookies, headers |
| Can do | Simple TCP distribution | Content-based routing, A/B testing, session persistence |
| Common use | Game servers, high-volume streaming | API gateway, microservices |

Goldratt's Theory of Constraints from Network Part 1 applies directly here. The constraint is never fixed — it's wherever the system is closest to 100% saturation. The question of which OSI layer is the bottleneck becomes the question of which load balancer to use.

Concurrent connections approaching the limit: L4. Requests that need to be routed based on their content: L7. In practice, many production systems layer both — L4 receives traffic first and distributes it across server groups, then L7 handles fine-grained routing within each group.

ref. HAProxy Blog: Layer 4 and Layer 7 Proxy Mode
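Stitching the two sketches above together shows the layered shape. Everything here is illustrative (the group layout, the hash, the route tables), not any specific product's behavior:

```python
import hashlib

# Each group is an L7 route table: URL prefix -> server
GROUPS = [
    {"/api/payments": "pay-0", "/api/products": "prod-0"},
    {"/api/payments": "pay-1", "/api/products": "prod-1"},
]

def route(client_ip: str, client_port: int, path: str) -> str:
    # Stage 1 (L4): hash connection metadata to pick a server group
    key = f"{client_ip}:{client_port}".encode()
    group = GROUPS[int(hashlib.sha256(key).hexdigest(), 16) % len(GROUPS)]
    # Stage 2 (L7): inside the group, route by content
    for prefix, server in group.items():
        if path.startswith(prefix):
            return server
    return "default"

print(route("203.0.113.7", 51623, "/api/payments/123"))
```

The L4 stage stays cheap because it reads only connection metadata at the front door; the L7 stage pays to parse content, but only within an already-narrowed group.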

 

Own the Information, or Let It Go

Three systems. Same problem. Three different answers.

DNS round-robin failed because it couldn't see server state. It rotated blind. The information gap distorted the distribution.

L4 chose to give up information. It makes decisions without knowing the contents — and converts that ignorance directly into speed. The gap becomes an asset.

L7 chose to buy information. It pays in parsing time and gets precision in return. The gap gets closed at a cost.

What Akerlof showed wasn't that information gaps are inherently bad — it's that how you handle the gap is what determines the outcome. Used car markets that ignored the gap collapsed. Markets that bridged it with warranties survived.

Load balancers work the same way. Ignore the gap and you get DNS. Accept it and you get L4. Close it and you get L7.

The question isn't whether the information gap exists. It's what you do with it.

This is the same structure that's run through every part of this series. Goldratt asked where the constraint is. Coase and Williamson explained the conditions under which paying transaction costs makes sense. Akerlof showed how information gaps split behavior. Different names across four parts — but the same question underneath.

Where is the bottleneck right now, and what are you willing to give up to clear it?

 

The Bottom Line

DNS round-robin assigns slots without knowing server state. Information asymmetry distorts the distribution. L4 gives up information and gets speed. L7 acquires information and gets precision. Each approach makes a different call on where to absorb the cost.

Which layer you split traffic at isn't a technical preference — it's a trade-off decision. The same question this series has been asking from Part 1. Where is the constraint, and what do you give up to resolve it?

Know where the bottleneck is, and you'll know where to split. Know how to handle the information gap, and you'll know how to split.

Next up: everything covered so far — OSI layers, TCP handshake costs, HTTP evolution, load balancing — comes together in real systems. Three scenarios: an e-commerce platform, a live chat service, and a payment system. Where does the bottleneck form, and which choices resolve it?
