🌐 Caching & CDN Algorithms: Making the Web Instant
Part 6: From Database Queries to Edge Computing
"The fastest request is the one you never make. The second fastest is the one served from memory."
After mastering time-space trade-offs, algorithm design, graphs, production systems, and database internals, you're ready for the layer that makes the modern web feel instant: caching and content delivery.
🌐 The Caching Reality
The Problem:
Your website without caching:
├── User clicks → Request to origin server (500ms)
├── Query database → B-tree lookup (50ms)
├── Process data → Business logic (100ms)
├── Return response → Network latency (200ms)
└── Total time: 850ms per request 🔴
Your website WITH caching:
├── User clicks → Check cache (5ms)
├── Cache hit! → Return immediately
└── Total time: 5ms per request ⚡
Speedup: 170x faster!
The Stakes:
Without caching:              With intelligent caching:
├── Database: 10k queries/sec ├── Database: 100 queries/sec
├── 95% load on DB            ├── 5% load on DB (95% cache hit)
├── Servers needed: 100       ├── Servers needed: 5
├── Monthly cost: $50,000     ├── Monthly cost: $3,000
├── Response time: 500ms      ├── Response time: 10ms
└── User experience: Slow     └── User experience: Instant
Result: Good caching = $47k/month savings + 50x faster
🎯 The Core Challenge: Cache Eviction
Understanding the Problem
Your cache has limited space (like RAM):
├── Can store: 1000 items
├── Total data: 1,000,000 items
└── Problem: Which 1000 to keep?
Wrong choice:
├── Keep rarely used items
├── Evict popular items
└── Low hit rate → Cache is useless
Right choice:
├── Keep frequently accessed items
├── Evict items nobody needs
└── High hit rate → Cache is valuable
The algorithm that chooses WHAT to evict
determines whether your cache succeeds or fails!
The Hidden Cost
As your Part 5 intuition suggests, every caching algorithm has memory overhead:
Simple array cache:
├── Data: O(n) space
├── Lookup: O(n) time (linear search)
└── Eviction: Need to track access somehow!
Hash table cache:
├── Data: O(n) space
├── Lookup: O(1) time ✅
├── Eviction metadata: O(n) extra space
├── Update overhead: Every access needs bookkeeping
└── Total: 2×O(n) space (data + metadata)
The recursion stack lesson from Part 5:
"Space complexity has visible AND hidden costs"
🎯 Algorithm 1: LRU (Least Recently Used)
The Idea
Eviction policy: Remove the item accessed longest ago
Example:
Cache capacity: 3 items
Access pattern: A → B → C → A → D
Step 1: Access A
Cache: [A]
Recent: A
Step 2: Access B
Cache: [A, B]
Recent: B → A
Step 3: Access C
Cache: [A, B, C] (Full!)
Recent: C → B → A
Step 4: Access A (already in cache)
Cache: [A, B, C]
Recent: A → C → B (A moved to front)
Step 5: Access D (cache full, evict LRU)
Evict: B (least recently used)
Cache: [A, C, D]
Recent: D → A → C
Intuition: If you haven't used it recently,
you probably won't use it soon.
The Challenge: Efficient Implementation
Naive LRU:
├── Store access timestamps
├── On eviction: O(n) scan to find oldest
└── Total: O(1) access, O(n) eviction ❌
Clever LRU:
├── Use doubly-linked list + hash map
├── List maintains recency order
├── Hash map gives O(1) lookup
└── Total: O(1) access, O(1) eviction ✅
Implementation
#include <iostream>
#include <unordered_map>
#include <list>
#include <string>
#include <vector> // needed for the request list in main()
using namespace std;
template<typename K, typename V>
class LRUCache {
private:
int capacity;
// Doubly-linked list: stores (key, value) pairs
// Front = most recently used, Back = least recently used
list<pair<K, V>> cacheList;
// Hash map: key -> iterator to list node
unordered_map<K, typename list<pair<K, V>>::iterator> cacheMap;
// Statistics
int hits = 0;
int misses = 0;
int evictions = 0;
public:
LRUCache(int cap) : capacity(cap) {}
// Get value from cache
pair<bool, V> get(const K& key) {
auto it = cacheMap.find(key);
if (it == cacheMap.end()) {
// Cache miss
misses++;
return {false, V()};
}
// Cache hit - move to front (most recent)
hits++;
auto listIt = it->second;
// Move to front of list
cacheList.splice(cacheList.begin(), cacheList, listIt);
return {true, listIt->second};
}
// Put value into cache
void put(const K& key, const V& value) {
auto it = cacheMap.find(key);
if (it != cacheMap.end()) {
// Key exists - update value and move to front
auto listIt = it->second;
listIt->second = value;
cacheList.splice(cacheList.begin(), cacheList, listIt);
return;
}
// New key
if (cacheList.size() >= capacity) {
// Cache full - evict LRU (back of list)
auto lru = cacheList.back();
cacheMap.erase(lru.first);
cacheList.pop_back();
evictions++;
}
// Add to front
cacheList.push_front({key, value});
cacheMap[key] = cacheList.begin();
}
double getHitRate() const {
int total = hits + misses;
return total > 0 ? (double)hits / total : 0.0;
}
void displayStats() {
cout << "\n📊 CACHE STATISTICS\n";
cout << "───────────────────────────────────────\n";
cout << "Capacity: " << capacity << "\n";
cout << "Current size: " << cacheList.size() << "\n";
cout << "Hits: " << hits << "\n";
cout << "Misses: " << misses << "\n";
cout << "Evictions: " << evictions << "\n";
cout << "Hit rate: " << (getHitRate() * 100) << "%\n\n";
cout << "Cache contents (most → least recent):\n";
int i = 1;
for (const auto& item : cacheList) {
cout << " " << i++ << ". " << item.first << " → " << item.second << "\n";
}
}
void displayAnalysis() {
cout << "\n📊 COMPLEXITY ANALYSIS\n";
cout << "───────────────────────────────────────\n";
cout << "get() operation:\n";
cout << " Time: O(1) - hash lookup + list splice\n";
cout << " Space: O(1) - no extra allocation\n\n";
cout << "put() operation:\n";
cout << " Time: O(1) - hash insert + list operations\n";
cout << " Space: O(1) per item\n\n";
cout << "Total space complexity:\n";
cout << " Data: O(n) - n items in cache\n";
cout << " Metadata: O(n) - hash map pointers\n";
cout << " List overhead: O(n) - prev/next pointers\n";
cout << " Total: ~3×O(n) space\n\n";
cout << "💡 Hidden cost lesson from Part 5:\n";
cout << "LRU promises O(1) operations but needs 3x memory!\n";
cout << "├── Data itself (values)\n";
cout << "├── Hash map (for O(1) lookup)\n";
cout << "└── Doubly-linked list (for O(1) reordering)\n";
}
};
int main() {
cout << "\n🗄️ LRU CACHE DEMONSTRATION\n";
cout << "───────────────────────────────────────────────────────────\n\n";
LRUCache<string, string> cache(3); // Capacity: 3 items
cout << "Cache capacity: 3 items\n\n";
// Simulate website requests
struct Request {
string key;
string value;
string description;
};
vector<Request> requests = {
{"user:123", "Alice", "First access"},
{"user:456", "Bob", "Second access"},
{"user:789", "Carol", "Third access (cache full)"},
{"user:123", "Alice", "Access Alice again (cache hit!)"},
{"user:999", "Dave", "Fourth access (evict LRU: Bob)"},
{"user:456", "Bob", "Access Bob (evicted, cache miss)"},
};
for (const auto& req : requests) {
cout << "📥 " << req.description << "\n";
cout << " Request: " << req.key << "\n";
// Try to get from cache first
auto [found, value] = cache.get(req.key);
if (found) {
cout << " ✅ Cache HIT - Returned: " << value << "\n";
} else {
cout << " ❌ Cache MISS - Fetching from database...\n";
cout << " 💾 Storing in cache\n";
cache.put(req.key, req.value);
}
cout << "\n";
}
cache.displayStats();
cache.displayAnalysis();
// Demonstrate why LRU works for typical access patterns
cout << "\n🎯 WHY LRU WORKS\n";
cout << "───────────────────────────────────────────────────────────\n\n";
cout << "Temporal locality principle:\n";
cout << "├── Recently accessed data likely to be accessed again soon\n";
cout << "├── Old data unlikely to be needed\n";
cout << "└── LRU exploits this pattern\n\n";
cout << "Real-world examples:\n";
cout << "├── Web sessions: Active users access repeatedly\n";
cout << "├── API responses: Popular endpoints hit frequently\n";
cout << "├── Database queries: Hot data accessed often\n";
cout << "└── CDN: Popular content served repeatedly\n\n";
cout << "When LRU fails:\n";
cout << "├── Scan pattern: Access each item once, never again\n";
cout << "├── Large working set: Need more than cache capacity\n";
cout << "└── Periodic access: Item accessed every N requests\n";
return 0;
}
Output:
🗄️ LRU CACHE DEMONSTRATION
───────────────────────────────────────────────────────────
Cache capacity: 3 items
📥 First access
Request: user:123
❌ Cache MISS - Fetching from database...
💾 Storing in cache
📥 Second access
Request: user:456
❌ Cache MISS - Fetching from database...
💾 Storing in cache
📥 Third access (cache full)
Request: user:789
❌ Cache MISS - Fetching from database...
💾 Storing in cache
📥 Access Alice again (cache hit!)
Request: user:123
✅ Cache HIT - Returned: Alice
📥 Fourth access (evict LRU: Bob)
Request: user:999
❌ Cache MISS - Fetching from database...
💾 Storing in cache
📥 Access Bob (evicted, cache miss)
Request: user:456
❌ Cache MISS - Fetching from database...
💾 Storing in cache
📊 CACHE STATISTICS
───────────────────────────────────────
Capacity: 3
Current size: 3
Hits: 1
Misses: 5
Evictions: 2
Hit rate: 16.6667%
Cache contents (most → least recent):
1. user:456 → Bob
2. user:999 → Dave
3. user:123 → Alice
📊 COMPLEXITY ANALYSIS
───────────────────────────────────────
get() operation:
Time: O(1) - hash lookup + list splice
Space: O(1) - no extra allocation
put() operation:
Time: O(1) - hash insert + list operations
Space: O(1) per item
Total space complexity:
Data: O(n) - n items in cache
Metadata: O(n) - hash map pointers
List overhead: O(n) - prev/next pointers
Total: ~3×O(n) space
💡 Hidden cost lesson from Part 5:
LRU promises O(1) operations but needs 3x memory!
├── Data itself (values)
├── Hash map (for O(1) lookup)
└── Doubly-linked list (for O(1) reordering)
🎯 WHY LRU WORKS
───────────────────────────────────────────────────────────
Temporal locality principle:
├── Recently accessed data likely to be accessed again soon
├── Old data unlikely to be needed
└── LRU exploits this pattern
Real-world examples:
├── Web sessions: Active users access repeatedly
├── API responses: Popular endpoints hit frequently
├── Database queries: Hot data accessed often
└── CDN: Popular content served repeatedly
When LRU fails:
├── Scan pattern: Access each item once, never again
├── Large working set: Need more than cache capacity
└── Periodic access: Item accessed every N requests
🎯 Algorithm 2: LFU (Least Frequently Used)
When LRU Isn't Enough
Problem with LRU:
Access pattern:
├── Item A: accessed 1000 times yesterday
├── Item B: accessed 1 time just now
└── Cache full, need to evict
LRU says: Evict A (accessed longer ago)
But: A is clearly more valuable!
LFU says: Evict B (accessed less frequently)
This makes more sense!
LFU Implementation Concept
template<typename K, typename V>
class LFUCache {
private:
int capacity;
int minFreq; // Track minimum frequency for O(1) eviction
struct Node {
K key;
V value;
int freq; // Access frequency
};
// key -> node
unordered_map<K, Node> keyToNode;
// frequency -> list of keys with that frequency
unordered_map<int, list<K>> freqToKeys;
// key -> iterator in frequency list
unordered_map<K, typename list<K>::iterator> keyToIter;
public:
LFUCache(int cap) : capacity(cap), minFreq(0) {}
pair<bool, V> get(const K& key) {
if (keyToNode.find(key) == keyToNode.end()) {
return {false, V()};
}
// Increment frequency
Node& node = keyToNode[key];
int oldFreq = node.freq;
// Remove from old frequency list
freqToKeys[oldFreq].erase(keyToIter[key]);
if (freqToKeys[oldFreq].empty() && oldFreq == minFreq) {
minFreq++;
}
// Add to new frequency list
node.freq++;
freqToKeys[node.freq].push_front(key);
keyToIter[key] = freqToKeys[node.freq].begin();
return {true, node.value};
}
void put(const K& key, const V& value) {
if (capacity == 0) return;
if (keyToNode.find(key) != keyToNode.end()) {
// Update existing
keyToNode[key].value = value;
get(key); // Update frequency
return;
}
if (keyToNode.size() >= capacity) {
// Evict LFU
K evictKey = freqToKeys[minFreq].back();
freqToKeys[minFreq].pop_back();
keyToNode.erase(evictKey);
keyToIter.erase(evictKey);
}
// Insert new
Node node = {key, value, 1};
keyToNode[key] = node;
freqToKeys[1].push_front(key);
keyToIter[key] = freqToKeys[1].begin();
minFreq = 1;
}
};
/*
Complexity:
├── Time: O(1) for get and put (amazing!)
├── Space: 3×O(n) - three hash maps!
└── Hidden cost: Complex bookkeeping overhead
The recursion stack lesson from Part 5 applies again:
"O(1) time doesn't mean free - LFU needs MORE space than LRU!"
*/
🌐 Algorithm 3: CDN Routing with Consistent Hashing
The Global CDN Problem
Your content is on one server in Virginia.
Users are accessing from:
├── Tokyo (300ms latency)
├── London (100ms latency)
├── Sydney (400ms latency)
└── São Paulo (250ms latency)
Solution: Content Delivery Network (CDN)
├── Copy content to servers worldwide
├── Route users to nearest server
└── Problem: How to distribute content across 1000+ servers?
Naive Approach (Hash Mod N)
hash(content_id) % N → server_id
Example with 3 servers:
hash("video.mp4") % 3 = 2 → Server 2
hash("image.jpg") % 3 = 1 → Server 1
Problem: What if a server fails or we add a server?
Before: 3 servers
hash("video.mp4") % 3 = 2 → Server 2
After: 4 servers
hash("video.mp4") % 4 = 1 → Server 1 ❌
All mappings change! Cache invalidation disaster!
Consistent Hashing (The Solution)
Idea: Map both servers AND content to a ring
Hash ring (0 to 2³²-1):
        hash("Server1") = 100
              ↓
0 ─────────────────────────────────── 2³²
      ↑            ↑             ↑
Server3=250  hash("video.mp4")  Server2=1000
                 =500
For content, find the next server clockwise:
video.mp4 (500) → Server2 (1000)
Add a new server at 600:
S1=100   S3=250   NewS=600   S2=1000
0 ─────────────────────────────────── 2³²
               ↑
         video.mp4 (500) now goes to NewS
Only items between the previous server (250) and NewS (600) are remapped -
not everything, as with hash mod!
Implementation
#include <iostream>
#include <map>
#include <string>
#include <functional>
#include <vector>
using namespace std;
class ConsistentHashRing {
private:
map<size_t, string> ring; // hash β server name
int virtualNodes; // Replicas per server for better distribution
hash<string> hasher;
size_t hashKey(const string& key) {
return hasher(key);
}
public:
ConsistentHashRing(int vnodes = 150) : virtualNodes(vnodes) {}
// Add server to ring
void addServer(const string& server) {
for (int i = 0; i < virtualNodes; i++) {
string vnode = server + "#" + to_string(i);
size_t hash = hashKey(vnode);
ring[hash] = server;
}
cout << "✅ Added server: " << server
<< " (" << virtualNodes << " virtual nodes)\n";
}
// Remove server from ring
void removeServer(const string& server) {
for (int i = 0; i < virtualNodes; i++) {
string vnode = server + "#" + to_string(i);
size_t hash = hashKey(vnode);
ring.erase(hash);
}
cout << "❌ Removed server: " << server << "\n";
}
// Get server for a key
string getServer(const string& key) {
if (ring.empty()) {
return "";
}
size_t hash = hashKey(key);
// Find first server >= hash (clockwise on ring)
auto it = ring.lower_bound(hash);
// Wrap around if needed
if (it == ring.end()) {
it = ring.begin();
}
return it->second;
}
// Analyze distribution
void analyzeDistribution(const vector<string>& keys) {
map<string, int> serverCounts;
for (const auto& key : keys) {
string server = getServer(key);
serverCounts[server]++;
}
cout << "\n📊 DISTRIBUTION ANALYSIS\n";
cout << "───────────────────────────────────────\n";
cout << "Total keys: " << keys.size() << "\n";
cout << "Total servers: " << (ring.size() / virtualNodes) << "\n\n";
cout << "Keys per server:\n";
for (const auto& [server, count] : serverCounts) {
double percentage = (count * 100.0) / keys.size();
cout << " " << server << ": " << count
<< " (" << (int)percentage << "%)\n";
}
}
};
int main() {
cout << "\n🌐 CONSISTENT HASHING FOR CDN\n";
cout << "───────────────────────────────────────────────────────────\n\n";
ConsistentHashRing cdn(150); // 150 virtual nodes per server
// Initial servers
cdn.addServer("US-East");
cdn.addServer("EU-West");
cdn.addServer("Asia-Pacific");
// Simulate content requests
vector<string> content = {
"video1.mp4", "video2.mp4", "video3.mp4",
"image1.jpg", "image2.jpg", "image3.jpg",
"doc1.pdf", "doc2.pdf", "doc3.pdf",
"page1.html", "page2.html", "page3.html",
};
cout << "\n📍 INITIAL ROUTING\n";
cout << "───────────────────────────────────────\n";
for (const auto& item : content) {
cout << item << " → " << cdn.getServer(item) << "\n";
}
cdn.analyzeDistribution(content);
// Add new server
cout << "\n➕ ADDING NEW SERVER\n";
cout << "───────────────────────────────────────\n";
cdn.addServer("US-West");
cout << "\n📍 ROUTING AFTER ADD\n";
cout << "───────────────────────────────────────\n";
for (const auto& item : content) {
// In a real implementation, remember the previous mapping
// and count how many keys actually moved.
cout << item << " → " << cdn.getServer(item) << "\n";
}
cdn.analyzeDistribution(content);
cout << "\n💡 KEY BENEFITS\n";
cout << "───────────────────────────────────────\n";
cout << "1. Adding server: Only ~25% of keys remapped\n";
cout << " (With hash mod: 100% would remap!)\n\n";
cout << "2. Removing server: Only affected keys remap\n";
cout << " (Others stay on same servers)\n\n";
cout << "3. Load balanced: Virtual nodes ensure even distribution\n";
cout << " (Each server gets ~33% with 3 servers)\n\n";
cout << "📊 COMPLEXITY\n";
cout << "───────────────────────────────────────\n";
cout << "getServer(): O(log n) where n = servers × virtual nodes\n";
cout << " Implementation: Binary search in sorted map\n";
cout << " For 1000 servers × 150 vnodes = 150k nodes\n";
cout << " log₂(150k) ≈ 17 operations - very fast!\n\n";
cout << "Space: O(servers × virtual nodes)\n";
cout << " For 1000 servers: ~600KB metadata\n";
cout << " This enables billions of requests!\n\n";
cout << "Hidden cost from the Part 5 lesson:\n";
cout << "├── Ring storage: O(s × v)\n";
cout << "├── s = servers, v = virtual nodes per server\n";
cout << "└── Trade-off: More vnodes = better distribution\n";
cout << "    but more space & slightly slower lookup\n";
return 0;
}
Output:
🌐 CONSISTENT HASHING FOR CDN
───────────────────────────────────────────────────────────
✅ Added server: US-East (150 virtual nodes)
✅ Added server: EU-West (150 virtual nodes)
✅ Added server: Asia-Pacific (150 virtual nodes)
📍 INITIAL ROUTING
───────────────────────────────────────
video1.mp4 → Asia-Pacific
video2.mp4 → US-East
video3.mp4 → EU-West
image1.jpg → US-East
image2.jpg → Asia-Pacific
image3.jpg → EU-West
doc1.pdf → Asia-Pacific
doc2.pdf → EU-West
doc3.pdf → US-East
page1.html → EU-West
page2.html → US-East
page3.html → Asia-Pacific
📊 DISTRIBUTION ANALYSIS
───────────────────────────────────────
Total keys: 12
Total servers: 3
Keys per server:
Asia-Pacific: 4 (33%)
EU-West: 4 (33%)
US-East: 4 (33%)
➕ ADDING NEW SERVER
───────────────────────────────────────
✅ Added server: US-West (150 virtual nodes)
📍 ROUTING AFTER ADD
───────────────────────────────────────
video1.mp4 → Asia-Pacific
video2.mp4 → US-West
video3.mp4 → EU-West
image1.jpg → US-East
image2.jpg → Asia-Pacific
image3.jpg → US-West
doc1.pdf → Asia-Pacific
doc2.pdf → EU-West
doc3.pdf → US-East
page1.html → EU-West
page2.html → US-West
page3.html → Asia-Pacific
📊 DISTRIBUTION ANALYSIS
───────────────────────────────────────
Total keys: 12
Total servers: 4
Keys per server:
Asia-Pacific: 4 (33%)
EU-West: 3 (25%)
US-East: 2 (16%)
US-West: 3 (25%)
💡 KEY BENEFITS
───────────────────────────────────────
1. Adding server: Only ~25% of keys remapped
(With hash mod: 100% would remap!)
2. Removing server: Only affected keys remap
(Others stay on same servers)
3. Load balanced: Virtual nodes ensure even distribution
(Each server gets ~33% with 3 servers)
📊 COMPLEXITY
───────────────────────────────────────
getServer(): O(log n) where n = servers × virtual nodes
Implementation: Binary search in sorted map
For 1000 servers × 150 vnodes = 150k nodes
log₂(150k) ≈ 17 operations - very fast!
Space: O(servers × virtual nodes)
For 1000 servers: ~600KB metadata
This enables billions of requests!
Hidden cost from the Part 5 lesson:
├── Ring storage: O(s × v)
├── s = servers, v = virtual nodes per server
└── Trade-off: More vnodes = better distribution
    but more space & slightly slower lookup
🏗️ The Complete Caching Stack (2026)
User Request Flow:
1. Browser Cache (Client Side)
├── Storage: localStorage, IndexedDB
├── Policy: LRU with time expiry
└── Hit: 0ms (instant!)
2. CDN Edge (Cloudflare, Fastly)
├── Storage: RAM + SSD at 200+ locations
├── Policy: LRU + Geographic routing
├── Algorithm: Consistent hashing
└── Hit: 10-50ms (global)
3. Application Cache (Redis, Memcached)
├── Storage: RAM on application servers
├── Policy: LRU or LFU
└── Hit: 1-5ms (datacenter)
4. Database Cache (PostgreSQL Buffer)
├── Storage: RAM, B-tree pages
├── Policy: LRU/Clock
└── Hit: 1ms (local)
5. Origin Database
└── Miss: 10-100ms (disk I/O)
Goal: Hit at the earliest layer possible!
🔍 Hidden Costs Throughout the Stack
The time-space trade-off from Part 1, applied to caching:
Every caching algorithm trades space for speed:
LRU Cache:
├── Advertised: O(1) operations
├── Hidden cost: 3×O(n) space
│   ├── Data: O(n)
│   ├── Hash map: O(n)
│   └── Linked list: O(n)
└── Plus: Pointer overhead per node (~16 bytes)
LFU Cache:
├── Advertised: O(1) operations
├── Hidden cost: 4×O(n) space
│   ├── Data: O(n)
│   └── 3 hash maps: 3×O(n)
└── More complex = More bugs possible
Consistent Hashing:
├── Advertised: O(log n) lookup
├── Hidden cost: O(servers × vnodes) space
│   └── 1000 servers × 150 vnodes = 150k entries
└── Trade-off: More vnodes = better distribution
    but more memory & slower lookup
The recursion stack lesson from Part 5:
"Complexity has VISIBLE costs (Big-O) and
HIDDEN costs (memory overhead, bookkeeping)"
🚀 From These Patterns to 2026 Problems
LRU → Edge Computing Caches
What you learned: 2026 application:
LRU for single machine → Distributed LRU across edge nodes
Evolution:
1. Local LRU cache → Global LRU with cache coherence
2. One eviction decision → Coordinated evictions
3. Single machine RAM → Edge locations worldwide
4. ms latency → <10ms global latency
Real example: Cloudflare Workers KV
├── LRU cache at 200+ edge locations
├── Content replicated based on access patterns
├── Automatic cache warming for popular content
└── <50ms latency worldwide
Consistent Hashing → Distributed Databases
What you learned: 2026 application:
CDN content routing → Database sharding (Cassandra, DynamoDB)
Evolution:
1. Route HTTP requests → Route database writes/reads
2. Minimize cache misses → Minimize cross-shard queries
3. Handle server failures → Handle node failures gracefully
4. Content distribution → Data distribution + replication
Real example: Amazon DynamoDB
├── Consistent hashing for partition keys
├── Automatic resharding as data grows
├── Handles millions of requests/sec
└── Global tables across regions
💡 Practice Problems
Problem 1: Design YouTube's Video Caching
Requirements:
├── 2 billion videos
├── 100 million daily active users
├── Most views on <1% of videos (viral)
├── Video sizes: 10MB to 2GB
└── Global audience
Your algorithm must:
1. Cache popular videos at edge
2. Predict what will go viral
3. Handle flash crowds (sudden spikes)
4. Minimize bandwidth costs
5. Maximize cache hit rate
Hints:
├── Multi-tier cache (hot/warm/cold)
├── Predictive pre-caching (ML-based)
├── Adaptive TTL based on popularity
└── Geographic popularity patterns
Problem 2: Design Discord's Message Cache
Requirements:
├── 150 million users
├── Millions of servers/channels
├── Need: last N messages per channel
├── Real-time delivery (<100ms)
└── Minimize database queries
Your algorithm must:
1. Cache recent messages per channel
2. Handle both active and dormant channels
3. Evict intelligently (not all channels equal)
4. Support message edits/deletes
5. Minimize memory per user
Hints:
├── Per-channel LRU with size limits
├── Activity-based eviction
├── Sliding window for recent messages
└── Write-through cache for consistency
Problem 3: Design AWS Lambda's Code Cache
Requirements:
├── Serverless functions (no persistent servers)
├── Cold start problem (loading code is slow)
├── Millions of different functions
├── Unpredictable access patterns
└── Balance: keep warm vs reclaim memory
Your algorithm must:
1. Predict which functions to keep warm
2. Decide eviction under memory pressure
3. Handle burst traffic
4. Minimize cold starts
5. Maximize resource utilization
Hints:
├── Predictive warming (time patterns)
├── Adaptive keep-alive timers
├── LFU + recency for eviction
└── Per-customer quotas
🎯 Key Takeaways
1. CACHING = TIME-SPACE TRADE-OFF (Part 1 callback!)
Pay memory cost → Get speed benefit
2. HIDDEN COSTS EVERYWHERE (Part 5 lesson!)
┌──────────────────────────────────┐
│ LRU: "O(1)" = 3×O(n) space       │
│ LFU: "O(1)" = 4×O(n) space       │
│ Consistent Hash: O(s×v) space    │
└──────────────────────────────────┘
3. NO PERFECT CACHING ALGORITHM
├── LRU: Good for recency, bad for frequency
├── LFU: Good for frequency, bad for bursts
├── Hybrid: More complex, more overhead
└── Choose based on access patterns!
4. DISTRIBUTION IS HARD
Consistent hashing solves:
├── Minimal remapping on changes
├── Load balancing
├── Fault tolerance
└── But: O(log n) lookups, O(s×v) space
5. 2026 = MULTI-LAYER CACHING
Modern systems use ALL of these:
├── Browser cache (client)
├── CDN edge (global)
├── Application cache (Redis)
├── Database cache (buffer pool)
└── Each layer has a different algorithm!
🗺️ Your Journey Progress
Where you are now:
✅ Time/space trade-offs (Part 1)
✅ Algorithm design (Part 2)
✅ Graph algorithms (Part 3)
✅ Production systems (Part 4)
✅ Database internals (Part 5)
✅ Caching & CDN (Part 6) ← YOU ARE HERE
Your expanding toolkit:
├── Can analyze hidden costs (recursion stack, memory overhead)
├── Understand why systems are fast (caching layers)
├── Can design eviction policies
├── Know distribution algorithms
└── See the full stack (browser → CDN → app → database)
Next steps:
⚡ Part 7: Real-time streaming algorithms
⚡ Part 8: AI/ML algorithms (recommendations, LLMs)
⚡ Part 9: Security & cryptography
⚡ Part 10: Autonomous systems
💬 Your Turn
Build these yourself:
- Implement LRU cache with size limits (bytes, not just count)
- Add TTL (time-to-live) to cache entries
- Build consistent hashing with dynamic virtual nodes
- Simulate cache hit rates with different policies
Analyze your favorite websites:
- Open DevTools → Network tab
- Which resources are cached?
- What are the cache headers?
- Can you see the CDN at work?
Share your findings! What's your cache hit rate? 🚀
The fastest code is the code that never runs. The fastest data is the data already in memory. Master caching, and you master performance. 🚀✨
🎯 Coming Up Next: Part 7
Real-Time Streaming & Event Processing Algorithms
From cached data to live data streams:
├── How Twitter processes trending topics in real-time
├── Sliding window algorithms
├── Stream joins and aggregations
└── Handling millions of events/second
Same principles, infinite data!
Stay tuned! 📡