Designing a Content Delivery Network (CDN) is a very popular system design interview question. It tests your understanding of distributed systems, caching, networking, scalability, and trade-offs.
Let’s break it down slowly and clearly.
Why Do We Even Need a CDN?
Imagine this. Your backend server is deployed in one data center in the US.
Now your users are:
- In India
- In Europe
- In Australia
- In South America
Every time someone loads your website:
- The request travels thousands of kilometers.
- The server processes it.
- The response travels all the way back.
That distance causes:
- High latency
- Slow page loads
- Buffering videos
- Poor user experience
Now multiply that by millions of users.
Clearly, this won’t scale well.
So the idea is simple:
Instead of moving users closer to the server, move the content closer to the users.
That’s what a CDN does.
What Is a CDN?
A Content Delivery Network (CDN) is a distributed network of servers deployed across multiple geographic locations.
These servers:
- Cache content
- Serve users from nearby locations
- Reduce load on the main server (origin)
Think of it like:
Instead of one big supermarket in one city, we open smaller stores in every city. People buy from the nearest store.
Step 1: Clarify Requirements
Before drawing the architecture, always clarify the requirements.
Functional Requirements
- Serve static content (images, CSS, JS, videos).
- Reduce latency for global users.
- Cache content at multiple geographic locations.
- Fetch content from origin on cache miss.
- Support cache invalidation.
Optional:
- Video streaming support
- Analytics (hit/miss ratio)
- DDoS protection
- TLS termination
Non-Functional Requirements
- Very low latency (milliseconds)
- High availability
- Massive scalability (billions of requests)
- Fault tolerance
- Cost efficiency
Step 2: High-Level Architecture
Here’s the overall idea:
User → DNS → Nearest Edge Server
Edge checks: cache hit?
- Yes → Return content
- No → Fetch from Origin → Cache → Return
Now let’s understand each component properly.
Step 3: Core Components Explained
DNS Routing
When a user enters www.example.com, DNS does something smart.
Instead of returning the IP of the origin server, it returns the IP of the closest edge server.
How does it decide?
- Geo-based routing (based on user location)
- Latency-based routing (based on network speed)
- Anycast routing (same IP announced globally)
The goal:
Send the user to the nearest server
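Geo-based routing can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical table of PoP coordinates; real DNS resolvers use GeoIP databases and live latency measurements rather than raw distance.

```python
import math

# Hypothetical PoP locations (name -> (lat, lon)); purely illustrative.
POPS = {
    "us-east": (38.9, -77.0),
    "eu-west": (53.3, -6.3),
    "ap-south": (19.1, 72.9),
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_pop(user_location):
    """Return the PoP closest to the user's (lat, lon)."""
    return min(POPS, key=lambda name: haversine_km(user_location, POPS[name]))

# A user in Mumbai gets routed to the ap-south PoP.
print(nearest_pop((18.97, 72.83)))  # → ap-south
```

Latency-based routing replaces the distance function with measured round-trip times; anycast skips this logic entirely and lets BGP pick the nearest network path.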
Edge Servers (PoPs)
PoP = Point of Presence.
These are servers deployed in:
- Different cities
- Different countries
- Different continents
Each PoP:
- Stores cached content
- Serves users directly
- Reduces origin traffic
Now two cases happen.
Case 1: Cache Hit
The requested content is already stored in that edge server.
So:
Edge → Immediately returns content
This is fast.
Latency becomes extremely low.
Case 2: Cache Miss
The content is not available at the edge.
So:
- Edge requests content from origin.
- Origin sends content.
- Edge stores it locally.
- Edge returns it to the user.
The next user in that region gets it instantly.
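The hit/miss flow above can be sketched as a tiny edge server. This is a minimal sketch: `fetch_from_origin` stands in for a real HTTP call to the origin, and the cache is just a dict.

```python
class EdgeServer:
    """Sketch of an edge server's request path (hit vs. miss)."""
    def __init__(self, fetch_from_origin):
        self.cache = {}
        self.fetch_from_origin = fetch_from_origin

    def get(self, url):
        if url in self.cache:                  # Case 1: cache hit
            return self.cache[url], "HIT"
        content = self.fetch_from_origin(url)  # Case 2: cache miss
        self.cache[url] = content              # store locally for the next user
        return content, "MISS"

edge = EdgeServer(lambda url: f"<content of {url}>")
print(edge.get("/logo.png"))  # ('<content of /logo.png>', 'MISS')
print(edge.get("/logo.png"))  # ('<content of /logo.png>', 'HIT')
```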
Step 4: How Caching Works
Caching is the heart of a CDN. Without caching, a CDN has no meaning.
TTL (Time To Live)
Each cached object has a TTL.
Example:
- TTL = 1 hour
- For 1 hour, edge serves cached version.
- After expiry, edge fetches fresh version.
Trade-off:
- Long TTL → Better performance, but risk stale content
- Short TTL → Fresh content, but more load on origin
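TTL-based expiry boils down to one timestamp comparison. A minimal sketch, with an explicit `now` parameter so the behavior is easy to test; real edges read TTLs from `Cache-Control`/`Expires` headers rather than a fixed constructor argument.

```python
import time

class TTLCache:
    """Entries are valid for `ttl` seconds after being stored."""
    def __init__(self, ttl=3600):
        self.ttl = ttl
        self.store = {}  # key -> (value, stored_at)

    def put(self, key, value, now=None):
        self.store[key] = (value, time.time() if now is None else now)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if now - stored_at > self.ttl:  # expired: force a fresh fetch
            del self.store[key]
            return None
        return value

cache = TTLCache(ttl=3600)
cache.put("/style.css", "body{}", now=0)
print(cache.get("/style.css", now=1800))  # within TTL → body{}
print(cache.get("/style.css", now=7200))  # past TTL → None
```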
Cache Eviction Policies
Edge servers have limited storage.
When the cache fills up, old content must be evicted.
Common policies:
- LRU (Least Recently Used)
- LFU (Least Frequently Used)
Most systems use LRU because it’s simple and effective.
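LRU is simple enough to sketch with an ordered dict: every access moves an entry to the "recent" end, and eviction drops the other end. A minimal sketch, not a production cache (no locking, no size-in-bytes accounting).

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: when full, drop the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now most recently used
cache.put("c", 3)      # cache is full → evicts "b"
print(cache.get("b"))  # → None
```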
Step 5: Scaling the CDN
Let’s assume:
- 1 billion requests per day
- 95% cache hit ratio
That means:
Only 5% of traffic goes to the origin. That's a massive load reduction.
To scale:
- Add more PoPs globally
- Horizontal scaling inside PoPs
- Use load balancers within edge clusters
- Use consistent hashing to distribute traffic
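Consistent hashing can be sketched as a hash ring: each server is hashed to many points on a ring, and a request key maps to the next server clockwise. The server names and vnode count here are illustrative; production CDNs tune both.

```python
import hashlib
from bisect import bisect

class HashRing:
    """Consistent-hashing sketch: map each key to a server on a ring.
    Virtual nodes (vnodes) smooth out the distribution."""
    def __init__(self, servers, vnodes=100):
        self.ring = sorted(
            (self._hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, key):
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect(self.keys, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["edge-1", "edge-2", "edge-3"])
print(ring.server_for("/video/123.mp4"))  # same key always maps to the same server
```

The payoff: when a server is added or removed, only the keys adjacent to its ring positions move, instead of nearly everything (as with naive `hash(key) % n`).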
CDNs scale horizontally. Never vertically.
Step 6: Handling Failures
What if:
An edge server crashes? A whole region goes down?
Solutions:
- Health checks
- Automatic failover
- Traffic rerouting
- Multi-region redundancy
Users should not even notice failures.
High availability is critical.
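Failover logic can be as simple as walking a preference list of edges and skipping unhealthy ones. A minimal sketch: `is_healthy` stands in for a real health-check probe, and the edge names are hypothetical.

```python
def pick_edge(edges_by_preference, is_healthy):
    """Return the first healthy edge, falling back down the list."""
    for edge in edges_by_preference:
        if is_healthy(edge):
            return edge
    raise RuntimeError("no healthy edge available")

down = {"edge-mumbai"}  # suppose the nearest PoP just failed its health check
choice = pick_edge(
    ["edge-mumbai", "edge-singapore", "edge-frankfurt"],
    is_healthy=lambda e: e not in down,
)
print(choice)  # → edge-singapore (nearest healthy fallback)
```

In practice this decision is made by DNS or anycast routing, so the user never sees the failed PoP at all.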
Step 7: Cache Invalidation
If content changes at origin, how do we update it everywhere?
Two approaches:
- TTL-based expiration
- Active purge (invalidate via API)
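An active purge amounts to telling every edge to drop a path so the next request is a miss. A toy sketch, with edge caches modeled as dicts; a real purge API must reach hundreds of PoPs reliably and handle ones that are temporarily unreachable.

```python
def purge(edges, path):
    """Remove `path` from every edge cache; the next request becomes a miss."""
    for cache in edges.values():
        cache.pop(path, None)  # no-op if the edge never cached it

edges = {
    "us-east": {"/logo.png": "v1"},
    "eu-west": {"/logo.png": "v1"},
}
purge(edges, "/logo.png")
print(all("/logo.png" not in c for c in edges.values()))  # → True
```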
This is hard because:
- Data is distributed globally
- Consistency becomes challenging
- You must avoid serving stale content
There is no perfect solution. It’s always a trade-off.

