DEV Community

Rajkiran

System Design - 11. CDNs Explained: Why Your Netflix Video Loads Faster Than Your Email

Covers: How CDNs Work, Pull vs Push CDN, Edge Caching, Dynamic Content, Multi-CDN, Real Incidents


The Fastly Outage That Broke the Internet for an Hour

On June 8, 2021, at 9:47 AM UTC, a significant portion of the internet went down.

Reddit, The Guardian, Twitch, GitHub, Stack Overflow, CNN, the UK Government website — all simultaneously returned errors or blank pages.

The cause? A single Fastly customer made a valid configuration change that triggered a latent bug in Fastly's software. Within seconds, 85% of Fastly's network returned errors globally.

One CDN. One bug. One hour. A massive slice of the internet — gone.

This incident is the clearest possible demonstration of how deeply CDNs are woven into modern web infrastructure. And once you understand why they're that central, you'll understand one of the most powerful architectural patterns in distributed systems.


What Is a CDN, Really?

A Content Delivery Network (CDN) is a geographically distributed network of servers (called edge servers or Points of Presence / PoPs) that cache and serve content from locations close to end users.

The core problem CDNs solve: physics.

Light travels through fiber optic cables at about 200,000 km/second. The great-circle distance from Mumbai to a server in Virginia (US-East) is roughly 13,000 km, so a round trip covers about 26,000 km and takes at minimum 130ms just for the signal to travel, before any processing. Real cable routes are longer than the great-circle path, so measured round-trip times are higher.

Without CDN:
User in Mumbai → request → Server in Virginia (130ms RTT minimum)
                         → response back to Mumbai
Total: 130ms+ just for network travel

With CDN:
User in Mumbai → request → CDN edge in Mumbai (1-2ms RTT)
                         → response from local edge
Total: 3-5ms for cached content

A CDN brings the content to your users instead of bringing your users to your server. At global scale, this isn't a nice-to-have — it's the difference between a working product and an unusable one.
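The latency floor above is plain arithmetic, and a small sketch makes it easy to try other distances. The distances used here (about 13,000 km from Mumbai to Virginia, about 50 km to a nearby edge) are rough assumptions for illustration, and the function name is hypothetical.

```javascript
// Approximate signal speed in optical fiber (about two-thirds of the speed of light).
const FIBER_SPEED_KM_PER_S = 200_000;

// Lower bound on round-trip time, in milliseconds, for a one-way path of distanceKm.
// Ignores routing detours, queuing, and server processing time.
function minRttMs(distanceKm) {
  const roundTripKm = 2 * distanceKm;
  return (roundTripKm * 1000) / FIBER_SPEED_KM_PER_S;
}

// Assumed distances: ~13,000 km to the origin, ~50 km to a nearby edge PoP.
const originRttMs = minRttMs(13_000);
const edgeRttMs = minRttMs(50);
console.log(`origin: ${originRttMs} ms, edge: ${edgeRttMs} ms`);
```

With these assumptions the origin floor is 130 ms and the edge floor is 0.5 ms; real measurements will be higher in both cases.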


What CDNs Actually Cache

CDNs were originally designed for static content — files that don't change per user:

  • Images, videos, audio files
  • CSS stylesheets and JavaScript bundles
  • HTML pages (for static sites)
  • Font files
  • Software downloads

But modern CDNs do much more than static caching.


Pull CDN vs Push CDN

The fundamental architectural decision in CDN configuration: how does content get onto the edge server?

Pull CDN (Lazy Caching)

The CDN pulls content from your origin server on demand — when a user first requests it.

First user request (cache MISS):
User → CDN Edge (Mumbai) → "I don't have this" → Origin Server (Virginia)
Origin → CDN Edge → CDN Edge caches it → User
(Slow: full round trip to origin)

All subsequent requests (cache HIT):
User → CDN Edge (Mumbai) → "Got it!" → User
(Fast: served from local edge)

TTL (Time to Live): Cached content expires after a configured duration. After expiry, the next request triggers a fresh pull from origin.

Advantages:

  • No upfront work — you don't need to manually push content anywhere
  • Only popular content gets cached (unpopular content that nobody requests never wastes edge storage)
  • Simple to set up — point your CDN at your origin and you're done

Disadvantages:

  • First request after cache miss (or expiry) is slow — full round trip to origin
  • If your origin is slow or goes down, cold cache requests fail
  • TTL tuning is tricky — too short and you hammer the origin, too long and users see stale content

Best for: Websites, web apps, APIs with unpredictable access patterns. Most CDNs (Cloudflare, AWS CloudFront, Akamai) default to Pull CDN behavior.
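The miss, hit, and expiry cycle can be sketched as a tiny in-memory cache for a single edge PoP. This is a minimal illustration, not any CDN's real API: the class name, the injected `fetchFromOrigin` function, and the explicit `nowMs` clock argument are all assumptions made to keep the example deterministic.

```javascript
// Minimal pull-through cache for one edge PoP (illustrative only).
class PullEdgeCache {
  // fetchFromOrigin: hypothetical function (key) => body
  constructor(fetchFromOrigin, ttlMs) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { body, expiresAt }
  }

  // nowMs is passed in so the behavior is easy to reason about and test.
  get(key, nowMs) {
    const entry = this.entries.get(key);
    if (entry && nowMs < entry.expiresAt) {
      return { body: entry.body, status: "HIT" };
    }
    // Miss or expired: pull from origin and cache the result for ttlMs.
    const body = this.fetchFromOrigin(key);
    this.entries.set(key, { body, expiresAt: nowMs + this.ttlMs });
    return { body, status: "MISS" };
  }
}

let originCalls = 0;
const edge = new PullEdgeCache((key) => {
  originCalls += 1;
  return `content of ${key}`;
}, 300_000); // 5 minute TTL

console.log(edge.get("/logo.png", 0).status);       // first request: MISS
console.log(edge.get("/logo.png", 1_000).status);   // within TTL: HIT
console.log(edge.get("/logo.png", 300_000).status); // TTL elapsed: MISS
```

Only the first and third requests reach the origin, which is the trade-off the TTL controls.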


Push CDN

You proactively push content to CDN edge servers before any user requests it.

You deploy new content:
Your Server → pushes to → CDN Edge (Mumbai)
                        → CDN Edge (London)
                        → CDN Edge (Singapore)
                        → CDN Edge (São Paulo)
                        → ... (all PoPs worldwide)

User request:
User → CDN Edge → already there → User
(Always fast, even on first request)

Advantages:

  • No cold-start latency — content is already at the edge before the first user
  • Origin server is never hit by user traffic (fully shielded)
  • Content is available even if origin is completely down

Disadvantages:

  • You must manage what's pushed and when — more operational complexity
  • Storage costs more (content exists at all edges even if nobody in São Paulo ever requests it)
  • Updating content requires re-pushing everywhere

Best for: Large files that are known in advance — software releases, game patches, movie releases on a streaming service. Netflix uses Push CDN for pre-positioning popular movies to edge servers before prime time viewing hours. When you press play at 8pm, the video is already at an edge node near you.
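For contrast with the pull model, here is a minimal sketch of the push flow, assuming each PoP's storage can be modeled as a plain `Map`. The function names and PoP names are hypothetical; a real push CDN would upload over its own storage API.

```javascript
// Push model sketch: the publisher writes to every edge store up front.
// edges is a hypothetical Map of PoP name -> Map acting as that PoP's storage.
function pushToAllEdges(edges, key, body) {
  for (const store of edges.values()) {
    store.set(key, body);
  }
  return edges.size; // number of PoPs updated
}

// A user request is a plain local lookup. There is no origin fallback here,
// so anything that was never pushed is simply missing.
function serveFromEdge(edges, pop, key) {
  const store = edges.get(pop);
  return store && store.has(key) ? store.get(key) : null;
}

const edges = new Map(
  ["mumbai", "london", "singapore", "sao-paulo"].map((pop) => [pop, new Map()])
);
const updated = pushToAllEdges(edges, "/releases/v2.bin", "v2 bytes");
console.log(updated, serveFromEdge(edges, "london", "/releases/v2.bin"));
```

The storage cost noted above is visible here: every PoP holds a copy whether or not anyone nearby requests it.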


Edge Caching Deep Dive

Cache Keys

The CDN uses a cache key to determine if a cached response can be served. By default, the cache key is the full URL:

https://api.example.com/products/123?color=red  → unique cache key
https://api.example.com/products/123?color=blue → different cache key

Custom cache key design matters enormously:

  • Including unnecessary query parameters creates cache fragmentation (lots of unique keys, low hit rate)
  • Excluding important parameters serves wrong content (a user gets another user's personalized response)

Best practice:

Include in cache key: path, essential query params (page, id, category)
Exclude from cache key: tracking params (utm_source, fbclid, etc.)
Include: Vary headers for content negotiation (Accept-Encoding, Accept-Language)
Exclude: User-Agent (too many variations, shatters cache)
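The include/exclude rules above can be sketched as a key-normalization function. This is a simplified illustration: the list of ignored parameters is an example rather than an exhaustive set, the function name is hypothetical, and `Vary` dimensions such as `Accept-Encoding` are left out to keep it short.

```javascript
// Tracking parameters that should not fragment the cache (example list only).
const IGNORED_PARAMS = new Set(["utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"]);

// Build a normalized cache key: host + path + remaining query params sorted by name.
function cacheKey(rawUrl) {
  const url = new URL(rawUrl);
  const params = [...url.searchParams.entries()]
    .filter(([name]) => !IGNORED_PARAMS.has(name))
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join("&");
  const base = `${url.host}${url.pathname}`;
  return params ? `${base}?${params}` : base;
}

// Both URLs map to the same key, so they share one cached object.
console.log(cacheKey("https://api.example.com/products/123?utm_source=news&color=red&page=2"));
console.log(cacheKey("https://api.example.com/products/123?page=2&color=red&fbclid=abc"));
```

Sorting the parameters matters as much as dropping the tracking ones, since `?a=1&b=2` and `?b=2&a=1` would otherwise count as different objects.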

Origin Shielding

Without origin shielding, all 200+ CDN edge PoPs worldwide can make individual requests to your origin on cache misses:

Without origin shielding:
200 edge PoPs × cache miss at same time = 200 simultaneous origin requests

Origin shielding adds a "shield" PoP that all edge PoPs funnel through:

With origin shielding:
200 edge PoPs → all cache misses go to Shield PoP (e.g., London)
Shield PoP → only 1 request to origin
Origin only gets 1 request instead of 200

This dramatically reduces origin load and provides an additional layer of caching. AWS CloudFront calls this "Origin Shield." Cloudflare calls it "Tiered Cache."
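The 200-to-1 reduction can be reproduced with a small simulation. This is a sketch under simplifying assumptions: misses are processed one after another, so the shield's cached copy stands in for the request collapsing a real CDN performs on truly concurrent misses. The function name is hypothetical.

```javascript
// Count origin requests when edgeCount cold edges all miss on the same object.
function originRequestsForMisses(edgeCount, useShield) {
  let originRequests = 0;
  const fetchOrigin = () => {
    originRequests += 1;
    return "object bytes";
  };

  // The shield keeps the object after its first origin fetch.
  let shieldCopy = null;
  const fetchViaShield = () => {
    if (shieldCopy === null) {
      shieldCopy = fetchOrigin();
    }
    return shieldCopy;
  };

  for (let i = 0; i < edgeCount; i++) {
    // Every edge has a cold cache, so each one goes upstream exactly once.
    if (useShield) {
      fetchViaShield();
    } else {
      fetchOrigin();
    }
  }
  return originRequests;
}

console.log(originRequestsForMisses(200, false)); // 200
console.log(originRequestsForMisses(200, true));  // 1
```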


CDNs for Dynamic Content: Edge Computing

This is where CDNs have evolved most in recent years.

Traditional CDNs cache static files. But edge computing lets you run code at edge locations — making decisions and generating responses without ever hitting your origin.

Cloudflare Workers:
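JavaScript/WebAssembly code running at Cloudflare's 300+ edge locations. Workers run in lightweight V8 isolates rather than containers, so cold starts are negligible compared with container-based functions such as AWS Lambda, and handlers typically execute in milliseconds at the edge.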

// Cloudflare Worker: personalize the response at the edge
// Placeholder markup; in practice this might come from a KV store or a bundled asset
const germanHomepage = '<!doctype html><html lang="de"><body>Willkommen</body></html>'

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // request.cf is populated by Cloudflare; it may be missing in some preview tools
  const country = request.cf?.country  // User's country from CDN metadata

  // Return country-specific content without hitting origin
  if (country === 'DE') {
    return new Response(germanHomepage, { headers: { 'Content-Type': 'text/html; charset=utf-8' } })
  }
  return fetch(request)  // Default: pass through to origin
}

Use cases for edge computing:

  • A/B testing (decide which variant to show at the edge)
  • Authentication (validate JWT tokens at the edge, reject bad requests before they reach origin)
  • Geo-blocking (block requests from certain countries at the CDN level)
  • Request/response transformation (modify headers, rewrite URLs)
  • Personalization based on user location or device type

The architectural win: Every computation moved to the edge is a computation that doesn't burden your origin servers. It can also save the user a full round trip to the origin, often tens of milliseconds or more depending on how far away the origin is.
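Several of these use cases reduce to a decision made before the request is forwarded. The sketch below models that decision as a pure function so it can run anywhere. The blocked-country list, the `/api/` prefix rule, and the shape of the returned object are all assumptions for illustration; a real Worker would map the result onto a `Response` or a `fetch` to the origin, and would validate the token rather than only checking that it is present.

```javascript
// Hypothetical policy: "XX" is a placeholder country code, not a real policy.
const BLOCKED_COUNTRIES = new Set(["XX"]);

// Decide at the edge what to do with a request, without contacting the origin.
function edgeDecision({ country, path, hasAuthHeader }) {
  if (BLOCKED_COUNTRIES.has(country)) {
    return { action: "block", status: 403 }; // geo-blocking
  }
  if (path.startsWith("/api/") && !hasAuthHeader) {
    return { action: "reject", status: 401 }; // missing credentials never reach origin
  }
  return { action: "forward", status: null }; // pass through to origin or cache
}

console.log(edgeDecision({ country: "DE", path: "/api/orders", hasAuthHeader: false }));
console.log(edgeDecision({ country: "DE", path: "/index.html", hasAuthHeader: false }));
```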


Multi-CDN Strategy: Not Putting All Eggs in One Basket

The Fastly outage made this obvious: a single CDN is a single point of failure.

Multi-CDN means routing traffic across multiple CDN providers simultaneously:

DNS → [Traffic Manager (Route 53, etc.)]
         ├── 50% → Cloudflare CDN
         ├── 30% → Akamai CDN
         └── 20% → Fastly CDN

When one CDN has an outage, the traffic manager detects it via health checks and reroutes:

Fastly incident detected:
Traffic Manager → 0% to Fastly (unhealthy)
               → 60% to Cloudflare
               → 40% to Akamai

Additional advantages:

  • Use the fastest CDN per region (Akamai is stronger in certain regions, Cloudflare in others)
  • Negotiate better pricing by splitting volume across providers
  • Test new CDN providers with small traffic percentages

Operational complexity: You now manage relationships, configurations, and purge operations across multiple CDNs. Most teams only go multi-CDN once they've experienced an outage at sufficient scale that the complexity is worth it.
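The failover step can be sketched as a weight-rebalancing function. This is one possible policy, assumed here for illustration: unhealthy providers are dropped and the remaining weights are scaled proportionally. With the 50/30/20 split above and Fastly unhealthy, that gives 62.5/37.5 rather than the rounded 60/40 in the illustration; an operator could just as well configure fixed fallback weights. The function name is hypothetical.

```javascript
// Redistribute traffic across healthy CDNs, keeping their relative proportions.
// weights: provider name -> percentage; healthy: Set of providers passing health checks.
function rebalance(weights, healthy) {
  const live = Object.entries(weights).filter(([name]) => healthy.has(name));
  const total = live.reduce((sum, [, weight]) => sum + weight, 0);
  if (total === 0) {
    throw new Error("no healthy CDN available");
  }
  return Object.fromEntries(live.map(([name, weight]) => [name, (weight * 100) / total]));
}

const weights = { cloudflare: 50, akamai: 30, fastly: 20 };
console.log(rebalance(weights, new Set(["cloudflare", "akamai", "fastly"])));
console.log(rebalance(weights, new Set(["cloudflare", "akamai"])));
```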


Cache Invalidation at CDN Scale

When your content changes, how do you remove stale copies from hundreds of edge servers worldwide?

Option 1: TTL Expiry

Set a short TTL. Stale content is served only until TTL expires, then edges pull fresh content.

Cache-Control: max-age=300  (content expires after 5 minutes)

Problem: Between publish and TTL expiry, stale content is served. For a routine article update, 5 minutes of stale content may be acceptable. For a product price change, maybe not.

Option 2: Purge / Cache Invalidation API

CDNs expose an API to explicitly remove cached content:

# Cloudflare: purge specific URLs
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {api_token}" \
  -H "Content-Type: application/json" \
  --data '{"files":["https://example.com/products/123.jpg"]}'

# AWS CloudFront: create invalidation
aws cloudfront create-invalidation \
  --distribution-id EDFDVBD6EXAMPLE \
  --paths "/products/123.jpg" "/products/123-thumb.jpg"

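Propagation time: Purges are not instantaneous, and the delay varies by provider. Some CDNs propagate a purge to all edge PoPs within seconds, while a CloudFront invalidation can take a minute or more to complete. Until it finishes, some edges may still serve stale content.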

Option 3: Cache-Busting (Fingerprinted URLs)

Instead of invalidating, change the URL when content changes:

Old: /static/app.js          (cached everywhere)
New: /static/app.a3f4b2c.js  (hash of file content in filename)

Since the URL is different, the CDN has no cached version and fetches fresh content automatically. The old URL remains cached but is no longer referenced by your HTML.

This is the standard strategy for CSS/JS assets, and modern build tools (Webpack, Vite, etc.) support it out of the box. The hash changes whenever the file content changes, so users who load the new HTML get the latest version. The HTML that references these assets still needs a short TTL or a purge, because its own URL does not change.
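The renaming step can be sketched in a few lines. This is an illustration only: it uses a 32-bit FNV-1a hash over UTF-16 code units as a stand-in for whatever content hash a real build tool computes, and the function names are hypothetical.

```javascript
// FNV-1a 32-bit hash, used here only as a small stand-in for a real content hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Insert the content hash before the file extension: app.js -> app.<hash>.js
function fingerprint(path, content) {
  const hash = fnv1a(content).toString(16).padStart(8, "0");
  const slash = path.lastIndexOf("/");
  const dot = path.lastIndexOf(".");
  if (dot <= slash) {
    return `${path}.${hash}`; // no extension in the file name
  }
  return `${path.slice(0, dot)}.${hash}${path.slice(dot)}`;
}

const v1 = fingerprint("/static/app.js", "console.log('v1')");
const v2 = fingerprint("/static/app.js", "console.log('v2')");
console.log(v1, v2, v1 !== v2);
```

Unchanged files keep the same name across builds, so they stay cached; only files whose content changed get a new URL.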


Real Systems

YouTube / Google CDN:
YouTube reports over 1 billion hours of video watched per day, and Google runs its own CDN infrastructure globally, including cache nodes hosted inside ISP networks. Hot videos are kept at edge locations close to viewers, so a popular video usually starts within seconds because it is likely already cached near you.

Cloudflare:
Fronts millions of internet properties from 300+ PoPs globally and handles tens of millions of HTTP requests per second. Their network is large enough that they function as upstream internet infrastructure themselves: they peer directly with thousands of networks and ISPs, which reduces their reliance on paid transit.

Akamai:
Serves 15-30% of all internet traffic (by some estimates). Specializes in enterprise, media streaming, and security. Their network has servers inside ISPs — literally inside the ISP's data center, as close to end users as physically possible.

Netflix Open Connect:
Netflix's proprietary CDN. Rather than using commercial CDNs, Netflix deploys their own hardware appliances inside ISP networks. When ISPs partner with Netflix, they get free CDN hardware that Netflix manages. In return, Netflix content never travels over expensive transit links. Estimated to save Netflix hundreds of millions per year in bandwidth costs.


Interview Scenario: Pull vs Push CDN Decision

"You're designing YouTube. Would you use a Pull or Push CDN?"

The answer that impresses:

"It depends on the content type, and a platform like YouTube benefits from both.

For long-tail content (the vast majority of videos, which get few views), Pull CDN makes sense. It's wasteful to pre-push a video to 200+ edge locations when it may only ever be watched in one region. Pull CDN caches it locally on first request.

For popular content, such as a new album release, viral videos, or major sporting events, I would pre-position content using Push CDN before peak traffic hits. If a new music video from a major artist is expected to draw millions of views in its first hour, pushing it to all edges in advance means the first wave of concurrent viewers doesn't all miss cache and hammer origin.

The switching logic would be based on predicted popularity, estimated for example by ML models from early engagement signals. Content that trends into the top fraction by view count (say, the top 0.1%) gets promoted to Push behavior automatically."


Key Takeaways

  • CDNs solve the physics problem: bringing content geographically close to users.
  • Pull CDN: Lazy caching on first request. Simple, efficient for unpredictable traffic. First request is slow.
  • Push CDN: Pre-positioned content before users request it. No cold start, best for known popular content.
  • Origin shielding reduces origin load by funneling all edge cache misses through a single shield PoP.
  • Edge computing (Cloudflare Workers) lets you run code at CDN edge — authentication, A/B testing, personalization — without ever hitting origin.
  • Multi-CDN protects against CDN outages (Fastly 2021). It adds operational complexity, so it pays off mainly at scale.
  • Cache invalidation: TTL for tolerance, purge API for urgency, URL fingerprinting for JS/CSS assets.
  • CDNs are so deeply embedded in internet infrastructure that a single CDN outage can take down a significant slice of the web.

What's Next

Topic 12 closes Day 4 with API Design and API Gateways — REST vs GraphQL vs gRPC, the Backend for Frontend pattern, and how API Gateways handle authentication, rate limiting, and routing at the edge of your microservices architecture.



Tags: system-design cdn networking backend performance distributed-systems web-development
