DEV Community

Sreya Satheesh

SD #001 - Designing a Content Delivery Network (CDN)

Designing a Content Delivery Network (CDN) is a very popular system design interview question. It tests your understanding of distributed systems, caching, networking, scalability, and trade-offs.

Let’s break it down slowly and clearly.

Why Do We Even Need a CDN?

Imagine this. Your backend server is deployed in one data center in the US.

Now your users are:

  • In India
  • In Europe
  • In Australia
  • In South America

Every time someone loads your website:

  1. The request travels thousands of kilometers.
  2. The server processes it.
  3. The response travels all the way back.

That distance causes:

  • High latency
  • Slow page loads
  • Buffering videos
  • Poor user experience

Now multiply that by millions of users.

Clearly, this won’t scale well.

So the idea is simple:

Instead of moving users closer to the server, move the content closer to the users.

That’s what a CDN does.

What Is a CDN?

A Content Delivery Network (CDN) is a distributed network of servers deployed across multiple geographic locations.

These servers:

  • Cache content
  • Serve users from nearby locations
  • Reduce load on the main server (origin)

Think of it like:
Instead of one big supermarket in one city, we open smaller stores in every city. People buy from the nearest store.

Step 1: Clarify Requirements

Before drawing architecture, always clarify.

Functional Requirements

  1. Serve static content (images, CSS, JS, videos).
  2. Reduce latency for global users.
  3. Cache content at multiple geographic locations.
  4. Fetch content from origin on cache miss.
  5. Support cache invalidation.

Optional:

  • Video streaming support
  • Analytics (hit/miss ratio)
  • DDoS protection
  • TLS termination

Non-Functional Requirements

  1. Very low latency (milliseconds)
  2. High availability
  3. Massive scalability (billions of requests)
  4. Fault tolerance
  5. Cost efficiency

Step 2: High-Level Architecture

Here’s the overall idea:

User → DNS → Nearest Edge Server
         ↓
       (Cache Hit?)
         ↓
      Yes → Return content
      No → Fetch from Origin → Cache → Return

Now let’s understand each component properly.

Step 3: Core Components Explained

DNS Routing

When a user enters www.example.com, DNS does something smart.

Instead of returning the IP of the origin server, it returns the IP of the closest edge server.

How does it decide?

  • Geo-based routing (based on user location)
  • Latency-based routing (based on network speed)
  • Anycast routing (same IP announced globally)
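Geo-based routing can be sketched as picking the PoP with the smallest great-circle distance to the user. The PoP names and coordinates below are made up for illustration:

```python
import math

# Hypothetical PoP locations: (latitude, longitude) in degrees.
POPS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-south": (19.1, 72.9),
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_pop(user_location):
    """Return the PoP closest to the user's (lat, lon)."""
    return min(POPS, key=lambda pop: haversine_km(user_location, POPS[pop]))

# A user in Mumbai gets routed to the ap-south PoP.
print(nearest_pop((18.9, 72.8)))  # → ap-south
```

Real geo-DNS also factors in PoP load and measured latency, not just distance, but the core idea is the same.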

The goal:

Send the user to the nearest server.

Edge Servers (PoPs)

PoP = Point of Presence.

These are servers deployed in:

  • Different cities
  • Different countries
  • Different continents

Each PoP:

  • Stores cached content
  • Serves users directly
  • Reduces origin traffic

Now two cases happen.

Case 1: Cache Hit

The requested content is already stored in that edge server.

So:

Edge → Immediately returns content

This is fast.

Latency becomes extremely low.

Case 2: Cache Miss

The content is not available at the edge.

So:

  • Edge requests content from origin.
  • Origin sends content.
  • Edge stores it locally.
  • Edge returns it to the user.

The next user in that region gets it instantly.
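The whole hit/miss flow can be sketched in a few lines. `fetch_from_origin` here is a hypothetical stand-in for the real HTTP call to the origin:

```python
# Minimal sketch of an edge server's request path (in-memory cache).

cache = {}

def fetch_from_origin(path):
    # In a real CDN this would be an HTTP request to the origin server.
    return f"content-of-{path}"

def handle_request(path):
    if path in cache:                      # Case 1: cache hit
        return cache[path], "HIT"
    content = fetch_from_origin(path)      # Case 2: cache miss
    cache[path] = content                  # store locally for the next user
    return content, "MISS"

print(handle_request("/logo.png"))  # → ('content-of-/logo.png', 'MISS')
print(handle_request("/logo.png"))  # → ('content-of-/logo.png', 'HIT')
```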

Step 4: How Caching Works

Caching is the heart of a CDN. Without caching, a CDN has no purpose.

TTL (Time To Live)

Each cached object has a TTL.

Example:

  • TTL = 1 hour
  • For 1 hour, edge serves cached version.
  • After expiry, edge fetches fresh version.

Trade-off:

  • Long TTL → Better performance, but risk stale content
  • Short TTL → Fresh content, but more load on origin
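A minimal sketch of TTL-based caching, assuming an in-memory dict and a `fetch_origin` callback standing in for the real origin request:

```python
import time

TTL_SECONDS = 3600  # 1 hour, as in the example above

cache = {}  # path -> (content, expires_at)

def get(path, fetch_origin):
    entry = cache.get(path)
    if entry and time.time() < entry[1]:
        return entry[0]                    # still fresh: serve the cached copy
    content = fetch_origin(path)           # expired or absent: refetch
    cache[path] = (content, time.time() + TTL_SECONDS)
    return content
```

Tuning `TTL_SECONDS` is exactly the trade-off above: raise it and the origin is hit less often but stale content lives longer; lower it and content stays fresh at the cost of more origin traffic.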

Cache Eviction Policies

Edge servers have limited memory.

When cache becomes full, we remove old content.

Common policies:

  • LRU (Least Recently Used)
  • LFU (Least Frequently Used)

Most systems use LRU because it’s simple and effective.
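LRU is easy to sketch with Python's `OrderedDict`: every read moves the key to the "most recent" end, and eviction pops from the other end:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a.css", 1)
cache.put("b.js", 2)
cache.get("a.css")        # touch a.css so it becomes most recent
cache.put("c.png", 3)     # evicts b.js, the least recently used
print(cache.get("b.js"))  # → None
```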

Step 5: Scaling the CDN

Let’s assume:

  • 1 billion requests per day
  • 95% cache hit ratio

That means:

Only 5% of traffic, about 50 million requests per day (roughly 580 requests per second), reaches the origin. That's a massive load reduction.

To scale:

  • Add more PoPs globally
  • Horizontal scaling inside PoPs
  • Use load balancers within edge clusters
  • Use consistent hashing to distribute traffic

CDNs scale horizontally. Never vertically.
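Consistent hashing can be sketched as a hash ring with virtual nodes, so adding or removing an edge server only remaps a small slice of keys. The server names below are hypothetical:

```python
import hashlib
from bisect import bisect

class ConsistentHashRing:
    """Maps content keys to servers; adding or removing a server
    only remaps a small fraction of keys."""

    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` virtual nodes for an even spread.
        self.ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["edge-1", "edge-2", "edge-3"])
# The same path always maps to the same edge server.
assert ring.server_for("/video/1.mp4") == ring.server_for("/video/1.mp4")
```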

Step 6: Handling Failures

What if:

An edge server crashes? A whole region goes down?

Solutions:

  • Health checks
  • Automatic failover
  • Traffic rerouting
  • Multi-region redundancy

Users should not even notice failures.
High availability is critical.
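Failover can be sketched as a health map plus an ordered preference list. The PoP names are made up, and in practice the health map would be fed by periodic probe requests:

```python
# Health-check-based failover: route around unhealthy PoPs.

health = {"edge-mumbai": True, "edge-delhi": True, "edge-singapore": True}
PREFERENCE = ["edge-mumbai", "edge-delhi", "edge-singapore"]  # nearest first

def pick_edge():
    """Return the nearest PoP that is currently passing health checks."""
    for pop in PREFERENCE:
        if health[pop]:
            return pop
    raise RuntimeError("all PoPs down")

health["edge-mumbai"] = False   # Mumbai PoP fails a health check
print(pick_edge())              # → edge-delhi (traffic rerouted automatically)
```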

Step 7: Cache Invalidation

If content changes at origin, how do we update it everywhere?

Two approaches:

  1. TTL-based expiration
  2. Active purge (invalidate via API)

Cache invalidation is hard because:

  • Data is distributed globally
  • Consistency becomes challenging
  • You must avoid serving stale content

There is no perfect solution. It’s always a trade-off.
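An active purge can be sketched as a fan-out to every PoP cache. Real systems push invalidations over a pub/sub bus; here the caches are plain dicts:

```python
# Active purge: an invalidation fans out to every PoP cache,
# so the next request for that path refetches from origin.

pops = {
    "us-east": {"/style.css": "v1"},
    "eu-west": {"/style.css": "v1"},
}

def purge(path):
    """Remove a path from every PoP cache."""
    for cache in pops.values():
        cache.pop(path, None)

purge("/style.css")
print(any("/style.css" in cache for cache in pops.values()))  # → False
```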

Step 8: Trade-Offs
