DEV Community

Charles Kumar
🚀 The Algorithm Mastery Series (Part 6)

🚀 Caching & CDN Algorithms: Making the Web Instant

Part 6: From Database Queries to Edge Computing

"The fastest request is the one you never make. The second fastest is the one served from memory."

After mastering time-space trade-offs, algorithm design, graphs, production systems, and database internals, you're ready for the layer that makes the modern web feel instant: caching and content delivery.


🌍 The Caching Reality

The Problem:

Your website without caching:
β”œβ”€ User clicks β†’ Request to origin server (500ms)
β”œβ”€ Query database β†’ B-tree lookup (50ms)
β”œβ”€ Process data β†’ Business logic (100ms)
β”œβ”€ Return response β†’ Network latency (200ms)
└─ Total time: 850ms per request 😴

Your website WITH caching:
β”œβ”€ User clicks β†’ Check cache (5ms)
β”œβ”€ Cache hit! β†’ Return immediately
└─ Total time: 5ms per request ⚑

Speedup: 170x faster!

The Stakes:

Without caching:              With intelligent caching:
β”œβ”€ Database: 10k queries/sec  β”œβ”€ Database: 100 queries/sec
β”œβ”€ 95% load on DB             β”œβ”€ 5% load on DB (95% cache hit)
β”œβ”€ Servers needed: 100        β”œβ”€ Servers needed: 5
β”œβ”€ Monthly cost: $50,000      β”œβ”€ Monthly cost: $3,000
β”œβ”€ Response time: 500ms       β”œβ”€ Response time: 10ms
└─ User experience: Slow      └─ User experience: Instant

Result: Good caching = $47k/month savings + 50x faster

🎯 The Core Challenge: Cache Eviction

Understanding the Problem

Your cache has limited space (like RAM):
β”œβ”€ Can store: 1000 items
β”œβ”€ Total data: 1,000,000 items
β”œβ”€ Problem: Which 1000 to keep?

Wrong choice:
β”œβ”€ Keep rarely used items
β”œβ”€ Evict popular items
└─ Low hit rate β†’ Cache is useless

Right choice:
β”œβ”€ Keep frequently accessed items
β”œβ”€ Evict items nobody needs
└─ High hit rate β†’ Cache is valuable

The algorithm that chooses WHAT to evict
determines whether your cache succeeds or fails!

The Hidden Cost: Eviction Bookkeeping

Every caching algorithm has memory overhead:

Simple array cache:
β”œβ”€ Data: O(n) space
β”œβ”€ Lookup: O(n) time (linear search)
└─ Eviction: Need to track access somehow!

Hash table cache:
β”œβ”€ Data: O(n) space
β”œβ”€ Lookup: O(1) time βœ“
β”œβ”€ Eviction metadata: O(n) extra space
β”œβ”€ Update overhead: Every access needs bookkeeping
└─ Total: 2Γ—O(n) space (data + metadata)

The recursion stack lesson from Part 5:
"Space complexity has visible AND hidden costs"

🎯 Algorithm 1: LRU (Least Recently Used)

The Idea

Eviction policy: Remove the item accessed longest ago

Example:
Cache capacity: 3 items

Access pattern: A β†’ B β†’ C β†’ A β†’ D

Step 1: Access A
Cache: [A]
Recent: A

Step 2: Access B
Cache: [A, B]
Recent: B β†’ A

Step 3: Access C
Cache: [A, B, C]  (Full!)
Recent: C β†’ B β†’ A

Step 4: Access A (already in cache)
Cache: [A, B, C]
Recent: A β†’ C β†’ B  (A moved to front)

Step 5: Access D (cache full, evict LRU)
Evict: B (least recently used)
Cache: [A, C, D]
Recent: D β†’ A β†’ C

Intuition: If you haven't used it recently,
           you probably won't use it soon.

The Challenge: Efficient Implementation

Naive LRU:
β”œβ”€ Store access timestamps
β”œβ”€ On eviction: O(n) scan to find oldest
└─ Total: O(1) access, O(n) eviction ❌

Clever LRU:
β”œβ”€ Use doubly-linked list + hash map
β”œβ”€ List maintains recency order
β”œβ”€ Hash map gives O(1) lookup
└─ Total: O(1) access, O(1) eviction βœ“

Implementation

#include <iostream>
#include <unordered_map>
#include <list>
#include <string>
#include <vector>
using namespace std;

template<typename K, typename V>
class LRUCache {
private:
    int capacity;

    // Doubly-linked list: stores (key, value) pairs
    // Front = most recently used, Back = least recently used
    list<pair<K, V>> cacheList;

    // Hash map: key -> iterator to list node
    unordered_map<K, typename list<pair<K, V>>::iterator> cacheMap;

    // Statistics
    int hits = 0;
    int misses = 0;
    int evictions = 0;

public:
    LRUCache(int cap) : capacity(cap) {}

    // Get value from cache
    pair<bool, V> get(const K& key) {
        auto it = cacheMap.find(key);

        if (it == cacheMap.end()) {
            // Cache miss
            misses++;
            return {false, V()};
        }

        // Cache hit - move to front (most recent)
        hits++;
        auto listIt = it->second;

        // Move to front of list
        cacheList.splice(cacheList.begin(), cacheList, listIt);

        return {true, listIt->second};
    }

    // Put value into cache
    void put(const K& key, const V& value) {
        auto it = cacheMap.find(key);

        if (it != cacheMap.end()) {
            // Key exists - update value and move to front
            auto listIt = it->second;
            listIt->second = value;
            cacheList.splice(cacheList.begin(), cacheList, listIt);
            return;
        }

        // New key
        if ((int)cacheList.size() >= capacity) {
            // Cache full - evict LRU (back of list)
            auto lru = cacheList.back();
            cacheMap.erase(lru.first);
            cacheList.pop_back();
            evictions++;
        }

        // Add to front
        cacheList.push_front({key, value});
        cacheMap[key] = cacheList.begin();
    }

    double getHitRate() const {
        int total = hits + misses;
        return total > 0 ? (double)hits / total : 0.0;
    }

    void displayStats() {
        cout << "\nπŸ“Š CACHE STATISTICS\n";
        cout << "═══════════════════════════════════════\n";
        cout << "Capacity: " << capacity << "\n";
        cout << "Current size: " << cacheList.size() << "\n";
        cout << "Hits: " << hits << "\n";
        cout << "Misses: " << misses << "\n";
        cout << "Evictions: " << evictions << "\n";
        cout << "Hit rate: " << (getHitRate() * 100) << "%\n\n";

        cout << "Cache contents (most β†’ least recent):\n";
        int i = 1;
        for (const auto& item : cacheList) {
            cout << "  " << i++ << ". " << item.first << " β†’ " << item.second << "\n";
        }
    }

    void displayAnalysis() {
        cout << "\nπŸ” COMPLEXITY ANALYSIS\n";
        cout << "═══════════════════════════════════════\n";
        cout << "get() operation:\n";
        cout << "  Time: O(1) - hash lookup + list splice\n";
        cout << "  Space: O(1) - no extra allocation\n\n";

        cout << "put() operation:\n";
        cout << "  Time: O(1) - hash insert + list operations\n";
        cout << "  Space: O(1) per item\n\n";

        cout << "Total space complexity:\n";
        cout << "  Data: O(n) - n items in cache\n";
        cout << "  Metadata: O(n) - hash map pointers\n";
        cout << "  List overhead: O(n) - prev/next pointers\n";
        cout << "  Total: ~3Γ—O(n) space\n\n";

        cout << "πŸ’‘ Hidden cost lesson from Part 5:\n";
        cout << "LRU promises O(1) operations but needs 3x memory!\n";
        cout << "β”œβ”€ Data itself (values)\n";
        cout << "β”œβ”€ Hash map (for O(1) lookup)\n";
        cout << "└─ Doubly-linked list (for O(1) reordering)\n";
    }
};

int main() {
    cout << "\nπŸ—„οΈ LRU CACHE DEMONSTRATION\n";
    cout << "═══════════════════════════════════════════════════════════\n\n";

    LRUCache<string, string> cache(3);  // Capacity: 3 items

    cout << "Cache capacity: 3 items\n\n";

    // Simulate website requests
    struct Request {
        string key;
        string value;
        string description;
    };

    vector<Request> requests = {
        {"user:123", "Alice", "First access"},
        {"user:456", "Bob", "Second access"},
        {"user:789", "Carol", "Third access (cache full)"},
        {"user:123", "Alice", "Access Alice again (cache hit!)"},
        {"user:999", "Dave", "Fourth access (evict LRU: Bob)"},
        {"user:456", "Bob", "Access Bob (evicted, cache miss)"},
    };

    for (const auto& req : requests) {
        cout << "πŸ“₯ " << req.description << "\n";
        cout << "   Request: " << req.key << "\n";

        // Try to get from cache first
        auto [found, value] = cache.get(req.key);

        if (found) {
            cout << "   βœ… Cache HIT - Returned: " << value << "\n";
        } else {
            cout << "   ❌ Cache MISS - Fetching from database...\n";
            cout << "   πŸ’Ύ Storing in cache\n";
            cache.put(req.key, req.value);
        }
        cout << "\n";
    }

    cache.displayStats();
    cache.displayAnalysis();

    // Demonstrate why LRU works for typical access patterns
    cout << "\n🎯 WHY LRU WORKS\n";
    cout << "═══════════════════════════════════════════════════════════\n\n";
    cout << "Temporal locality principle:\n";
    cout << "β”œβ”€ Recently accessed data likely to be accessed again soon\n";
    cout << "β”œβ”€ Old data unlikely to be needed\n";
    cout << "└─ LRU exploits this pattern\n\n";

    cout << "Real-world examples:\n";
    cout << "β”œβ”€ Web sessions: Active users access repeatedly\n";
    cout << "β”œβ”€ API responses: Popular endpoints hit frequently\n";
    cout << "β”œβ”€ Database queries: Hot data accessed often\n";
    cout << "└─ CDN: Popular content served repeatedly\n\n";

    cout << "When LRU fails:\n";
    cout << "β”œβ”€ Scan pattern: Access each item once, never again\n";
    cout << "β”œβ”€ Large working set: Need more than cache capacity\n";
    cout << "└─ Periodic access: Item accessed every N requests\n";

    return 0;
}

Output:

πŸ—„οΈ LRU CACHE DEMONSTRATION
═══════════════════════════════════════════════════════════

Cache capacity: 3 items

πŸ“₯ First access
   Request: user:123
   ❌ Cache MISS - Fetching from database...
   πŸ’Ύ Storing in cache

πŸ“₯ Second access
   Request: user:456
   ❌ Cache MISS - Fetching from database...
   πŸ’Ύ Storing in cache

πŸ“₯ Third access (cache full)
   Request: user:789
   ❌ Cache MISS - Fetching from database...
   πŸ’Ύ Storing in cache

πŸ“₯ Access Alice again (cache hit!)
   Request: user:123
   βœ… Cache HIT - Returned: Alice

πŸ“₯ Fourth access (evict LRU: Bob)
   Request: user:999
   ❌ Cache MISS - Fetching from database...
   πŸ’Ύ Storing in cache

πŸ“₯ Access Bob (evicted, cache miss)
   Request: user:456
   ❌ Cache MISS - Fetching from database...
   πŸ’Ύ Storing in cache

πŸ“Š CACHE STATISTICS
═══════════════════════════════════════
Capacity: 3
Current size: 3
Hits: 1
Misses: 5
Evictions: 1
Hit rate: 16.67%

Cache contents (most β†’ least recent):
  1. user:456 β†’ Bob
  2. user:999 β†’ Dave
  3. user:123 β†’ Alice

πŸ” COMPLEXITY ANALYSIS
═══════════════════════════════════════
get() operation:
  Time: O(1) - hash lookup + list splice
  Space: O(1) - no extra allocation

put() operation:
  Time: O(1) - hash insert + list operations
  Space: O(1) per item

Total space complexity:
  Data: O(n) - n items in cache
  Metadata: O(n) - hash map pointers
  List overhead: O(n) - prev/next pointers
  Total: ~3Γ—O(n) space

πŸ’‘ Hidden cost lesson from Part 5:
LRU promises O(1) operations but needs 3x memory!
β”œβ”€ Data itself (values)
β”œβ”€ Hash map (for O(1) lookup)
└─ Doubly-linked list (for O(1) reordering)

🎯 WHY LRU WORKS
═══════════════════════════════════════════════════════════

Temporal locality principle:
β”œβ”€ Recently accessed data likely to be accessed again soon
β”œβ”€ Old data unlikely to be needed
└─ LRU exploits this pattern

Real-world examples:
β”œβ”€ Web sessions: Active users access repeatedly
β”œβ”€ API responses: Popular endpoints hit frequently
β”œβ”€ Database queries: Hot data accessed often
└─ CDN: Popular content served repeatedly

When LRU fails:
β”œβ”€ Scan pattern: Access each item once, never again
β”œβ”€ Large working set: Need more than cache capacity
└─ Periodic access: Item accessed every N requests

🎯 Algorithm 2: LFU (Least Frequently Used)

When LRU Isn't Enough

Problem with LRU:

Access pattern:
β”œβ”€ Item A: accessed 1000 times yesterday
β”œβ”€ Item B: accessed 1 time just now
└─ Cache full, need to evict

LRU says: Evict A (accessed longer ago)
But: A is clearly more valuable!

LFU says: Evict B (accessed less frequently)
This makes more sense!

LFU Implementation Concept

#include <unordered_map>
#include <list>
#include <utility>
using namespace std;

template<typename K, typename V>
class LFUCache {
private:
    int capacity;
    int minFreq;  // Track minimum frequency for O(1) eviction

    struct Node {
        K key;
        V value;
        int freq;  // Access frequency
    };

    // key -> node
    unordered_map<K, Node> keyToNode;

    // frequency -> list of keys with that frequency
    unordered_map<int, list<K>> freqToKeys;

    // key -> iterator in frequency list
    unordered_map<K, typename list<K>::iterator> keyToIter;

public:
    LFUCache(int cap) : capacity(cap), minFreq(0) {}

    pair<bool, V> get(const K& key) {
        if (keyToNode.find(key) == keyToNode.end()) {
            return {false, V()};
        }

        // Increment frequency
        Node& node = keyToNode[key];
        int oldFreq = node.freq;

        // Remove from old frequency list
        freqToKeys[oldFreq].erase(keyToIter[key]);
        if (freqToKeys[oldFreq].empty() && oldFreq == minFreq) {
            minFreq++;
        }

        // Add to new frequency list
        node.freq++;
        freqToKeys[node.freq].push_front(key);
        keyToIter[key] = freqToKeys[node.freq].begin();

        return {true, node.value};
    }

    void put(const K& key, const V& value) {
        if (capacity == 0) return;

        if (keyToNode.find(key) != keyToNode.end()) {
            // Update existing
            keyToNode[key].value = value;
            get(key);  // Update frequency
            return;
        }

        if ((int)keyToNode.size() >= capacity) {
            // Evict LFU
            K evictKey = freqToKeys[minFreq].back();
            freqToKeys[minFreq].pop_back();
            keyToNode.erase(evictKey);
            keyToIter.erase(evictKey);
        }

        // Insert new
        Node node = {key, value, 1};
        keyToNode[key] = node;
        freqToKeys[1].push_front(key);
        keyToIter[key] = freqToKeys[1].begin();
        minFreq = 1;
    }
};

/*
Complexity:
β”œβ”€ Time: O(1) for get and put (amazing!)
β”œβ”€ Space: O(3n) - three hash maps!
└─ Hidden cost: Complex bookkeeping overhead

The hidden-cost lesson from Part 5 applies:
"O(1) time doesn't mean free - LFU needs MORE space than LRU!"
*/
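To see the eviction decision in isolation, here is a minimal sketch of just the policy choice, using the A-vs-B scenario above. `FreqTracker` is an illustrative name invented for this sketch, not a standard component; a real LFU cache would pair this decision with the O(1) structures shown above.

```cpp
#include <climits>
#include <map>
#include <string>

// Minimal frequency tracker: answers only "who would LFU evict?"
struct FreqTracker {
    std::map<std::string, int> freq;  // key -> access count

    void access(const std::string& key) { freq[key]++; }

    // LFU victim: the key with the lowest access count.
    std::string lfuVictim() const {
        std::string victim;
        int best = INT_MAX;
        for (const auto& [key, count] : freq) {
            if (count < best) { best = count; victim = key; }
        }
        return victim;
    }
};
```

Access "A" a thousand times and "B" once, and `lfuVictim()` returns "B", matching the intuition above: LRU would have evicted the heavily used A instead.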

🌐 Algorithm 3: CDN Routing with Consistent Hashing

The Global CDN Problem

Your content is on one server in Virginia.
Users are accessing from:
β”œβ”€ Tokyo (300ms latency)
β”œβ”€ London (100ms latency)
β”œβ”€ Sydney (400ms latency)
└─ SΓ£o Paulo (250ms latency)

Solution: Content Delivery Network (CDN)
β”œβ”€ Copy content to servers worldwide
β”œβ”€ Route users to nearest server
└─ Problem: How to distribute content across 1000+ servers?

Naive Approach (Hash Mod N)

hash(content_id) % N β†’ server_id

Example with 3 servers:
hash("video.mp4") % 3 = 2 β†’ Server 2
hash("image.jpg") % 3 = 1 β†’ Server 1

Problem: What if server fails or we add a server?

Before: 3 servers
hash("video.mp4") % 3 = 2 β†’ Server 2

After: 4 servers
hash("video.mp4") % 4 = 1 β†’ Server 1 ❌

All mappings change! Cache invalidation disaster!
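The "all mappings change" claim is easy to measure. A small sketch (the `keyN` names and 10,000-key sample are made up for illustration) counts how many keys land on a different server under hash-mod routing when a fourth server is added:

```cpp
#include <functional>
#include <string>

// Count how many of numKeys synthetic keys are assigned to a
// different server when the server count changes from oldN to
// newN under naive hash-mod routing.
double modRemapFraction(int numKeys, int oldN, int newN) {
    std::hash<std::string> hasher;
    int moved = 0;
    for (int i = 0; i < numKeys; i++) {
        size_t h = hasher("key" + std::to_string(i));
        if (h % oldN != h % newN) moved++;
    }
    return (double)moved / numKeys;
}
```

For uniform hashes, a key keeps its server only when h % 3 == h % 4, which holds for roughly 1 in 4 values, so about 75% of keys remap. That is exactly the problem consistent hashing avoids.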

Consistent Hashing (The Solution)

Idea: Map both servers AND content onto a hash ring

Hash ring (0 to 2³²-1), flattened:

    0 ── S1=100 ── S3=250 ── video.mp4=500 ── S2=1000 ──── 2³²-1

For content, find the next server clockwise:
video.mp4 (500) → Server2 (1000)

Add a new server at position 600:

    0 ── S1=100 ── S3=250 ── video.mp4=500 ── NewS=600 ── S2=1000

video.mp4 (500) → NewS (600)

Only keys between the previous server (S3=250) and NewS (600)
are remapped. With hash mod N, everything would move!

Implementation

#include <iostream>
#include <map>
#include <string>
#include <functional>
#include <vector>
using namespace std;

class ConsistentHashRing {
private:
    map<size_t, string> ring;  // hash β†’ server name
    int virtualNodes;  // Replicas per server for better distribution
    hash<string> hasher;

    size_t hashKey(const string& key) {
        return hasher(key);
    }

public:
    ConsistentHashRing(int vnodes = 150) : virtualNodes(vnodes) {}

    // Add server to ring
    void addServer(const string& server) {
        for (int i = 0; i < virtualNodes; i++) {
            string vnode = server + "#" + to_string(i);
            size_t hash = hashKey(vnode);
            ring[hash] = server;
        }
        cout << "βœ… Added server: " << server 
             << " (" << virtualNodes << " virtual nodes)\n";
    }

    // Remove server from ring
    void removeServer(const string& server) {
        for (int i = 0; i < virtualNodes; i++) {
            string vnode = server + "#" + to_string(i);
            size_t hash = hashKey(vnode);
            ring.erase(hash);
        }
        cout << "❌ Removed server: " << server << "\n";
    }

    // Get server for a key
    string getServer(const string& key) {
        if (ring.empty()) {
            return "";
        }

        size_t hash = hashKey(key);

        // Find first server >= hash (clockwise on ring)
        auto it = ring.lower_bound(hash);

        // Wrap around if needed
        if (it == ring.end()) {
            it = ring.begin();
        }

        return it->second;
    }

    // Analyze distribution
    void analyzeDistribution(const vector<string>& keys) {
        map<string, int> serverCounts;

        for (const auto& key : keys) {
            string server = getServer(key);
            serverCounts[server]++;
        }

        cout << "\nπŸ“Š DISTRIBUTION ANALYSIS\n";
        cout << "═══════════════════════════════════════\n";
        cout << "Total keys: " << keys.size() << "\n";
        cout << "Total servers: " << (ring.size() / virtualNodes) << "\n\n";

        cout << "Keys per server:\n";
        for (const auto& [server, count] : serverCounts) {
            double percentage = (count * 100.0) / keys.size();
            cout << "  " << server << ": " << count 
                 << " (" << (int)percentage << "%)\n";
        }
    }
};

int main() {
    cout << "\n🌐 CONSISTENT HASHING FOR CDN\n";
    cout << "═══════════════════════════════════════════════════════════\n\n";

    ConsistentHashRing cdn(150);  // 150 virtual nodes per server

    // Initial servers
    cdn.addServer("US-East");
    cdn.addServer("EU-West");
    cdn.addServer("Asia-Pacific");

    // Simulate content requests
    vector<string> content = {
        "video1.mp4", "video2.mp4", "video3.mp4",
        "image1.jpg", "image2.jpg", "image3.jpg",
        "doc1.pdf", "doc2.pdf", "doc3.pdf",
        "page1.html", "page2.html", "page3.html",
    };

    cout << "\nπŸ“ INITIAL ROUTING\n";
    cout << "═══════════════════════════════════════\n";
    for (const auto& item : content) {
        cout << item << " β†’ " << cdn.getServer(item) << "\n";
    }

    cdn.analyzeDistribution(content);

    // Add new server
    cout << "\nπŸš€ ADDING NEW SERVER\n";
    cout << "═══════════════════════════════════════\n";
    cdn.addServer("US-West");

    cout << "\nπŸ“ ROUTING AFTER ADD\n";
    cout << "═══════════════════════════════════════\n";

    // (To count remapped keys, a real implementation would
    //  record each key's previous server and compare.)
    for (const auto& item : content) {
        cout << item << " β†’ " << cdn.getServer(item) << "\n";
    }

    cdn.analyzeDistribution(content);

    cout << "\nπŸ’‘ KEY BENEFITS\n";
    cout << "═══════════════════════════════════════\n";
    cout << "1. Adding server: Only ~25% of keys remapped\n";
    cout << "   (With hash mod: 100% would remap!)\n\n";
    cout << "2. Removing server: Only affected keys remap\n";
    cout << "   (Others stay on same servers)\n\n";
    cout << "3. Load balanced: Virtual nodes ensure even distribution\n";
    cout << "   (Each server gets ~33% with 3 servers)\n\n";

    cout << "πŸ” COMPLEXITY\n";
    cout << "═══════════════════════════════════════\n";
    cout << "getServer(): O(log n) where n = servers Γ— virtual nodes\n";
    cout << "  Implementation: Binary search in sorted map\n";
    cout << "  For 1000 servers Γ— 150 vnodes = 150k nodes\n";
    cout << "  log(150k) β‰ˆ 17 operations - very fast!\n\n";

    cout << "Space: O(servers Γ— virtual nodes)\n";
    cout << "  For 1000 servers: ~600KB metadata\n";
    cout << "  This enables billions of requests!\n\n";

    cout << "Hidden cost from Part 5 lesson:\n";
    cout << "β”œβ”€ Ring storage: O(s Γ— v)\n";
    cout << "β”œβ”€ s = servers, v = virtual nodes per server\n";
    cout << "└─ Trade-off: More vnodes = better distribution\n";
    cout << "              but more space & slightly slower lookup\n";

    return 0;
}

Output:

🌐 CONSISTENT HASHING FOR CDN
═══════════════════════════════════════════════════════════

βœ… Added server: US-East (150 virtual nodes)
βœ… Added server: EU-West (150 virtual nodes)
βœ… Added server: Asia-Pacific (150 virtual nodes)

πŸ“ INITIAL ROUTING
═══════════════════════════════════════
video1.mp4 β†’ Asia-Pacific
video2.mp4 β†’ US-East
video3.mp4 β†’ EU-West
image1.jpg β†’ US-East
image2.jpg β†’ Asia-Pacific
image3.jpg β†’ EU-West
doc1.pdf β†’ Asia-Pacific
doc2.pdf β†’ EU-West
doc3.pdf β†’ US-East
page1.html β†’ EU-West
page2.html β†’ US-East
page3.html β†’ Asia-Pacific

πŸ“Š DISTRIBUTION ANALYSIS
═══════════════════════════════════════
Total keys: 12
Total servers: 3

Keys per server:
  Asia-Pacific: 4 (33%)
  EU-West: 4 (33%)
  US-East: 4 (33%)

πŸš€ ADDING NEW SERVER
═══════════════════════════════════════
βœ… Added server: US-West (150 virtual nodes)

πŸ“ ROUTING AFTER ADD
═══════════════════════════════════════
video1.mp4 β†’ Asia-Pacific
video2.mp4 β†’ US-West
video3.mp4 β†’ EU-West
image1.jpg β†’ US-East
image2.jpg β†’ Asia-Pacific
image3.jpg β†’ US-West
doc1.pdf β†’ Asia-Pacific
doc2.pdf β†’ EU-West
doc3.pdf β†’ US-East
page1.html β†’ EU-West
page2.html β†’ US-West
page3.html β†’ Asia-Pacific

πŸ“Š DISTRIBUTION ANALYSIS
═══════════════════════════════════════
Total keys: 12
Total servers: 4

Keys per server:
  Asia-Pacific: 4 (33%)
  EU-West: 3 (25%)
  US-East: 2 (16%)
  US-West: 3 (25%)

πŸ’‘ KEY BENEFITS
═══════════════════════════════════════
1. Adding server: Only ~25% of keys remapped
   (With hash mod: 100% would remap!)

2. Removing server: Only affected keys remap
   (Others stay on same servers)

3. Load balanced: Virtual nodes ensure even distribution
   (Each server gets ~33% with 3 servers)

πŸ” COMPLEXITY
═══════════════════════════════════════
getServer(): O(log n) where n = servers Γ— virtual nodes
  Implementation: Binary search in sorted map
  For 1000 servers Γ— 150 vnodes = 150k nodes
  log(150k) β‰ˆ 17 operations - very fast!

Space: O(servers Γ— virtual nodes)
  For 1000 servers: ~600KB metadata
  This enables billions of requests!

Hidden cost from Part 5 lesson:
β”œβ”€ Ring storage: O(s Γ— v)
β”œβ”€ s = servers, v = virtual nodes per server
└─ Trade-off: More vnodes = better distribution
              but more space & slightly slower lookup

🔄 The Complete Caching Stack (2026)

User Request Flow:

1. Browser Cache (Client Side)
   β”œβ”€ Storage: localStorage, IndexedDB
   β”œβ”€ Policy: LRU with time expiry
   └─ Hit: 0ms (instant!)

2. CDN Edge (Cloudflare, Fastly)
   β”œβ”€ Storage: RAM + SSD at 200+ locations
   β”œβ”€ Policy: LRU + Geographic routing
   β”œβ”€ Algorithm: Consistent hashing
   └─ Hit: 10-50ms (global)

3. Application Cache (Redis, Memcached)
   β”œβ”€ Storage: RAM on application servers
   β”œβ”€ Policy: LRU or LFU
   └─ Hit: 1-5ms (datacenter)

4. Database Cache (PostgreSQL Buffer)
   β”œβ”€ Storage: RAM, B-tree pages
   β”œβ”€ Policy: LRU/Clock
   └─ Hit: 1ms (local)

5. Origin Database
   └─ Miss: 10-100ms (disk I/O)

Goal: Hit at earliest layer possible!
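A toy version of this flow can be sketched as a read-through lookup across ordered layers. The layer maps and the `origin` callback here are illustrative stand-ins for browser cache, CDN, Redis, and the database; real systems use very different storage at each tier.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Sketch: try each cache layer in order, fall back to the origin
// on a total miss, then populate every layer on the way back.
struct TieredCache {
    std::vector<std::map<std::string, std::string>> layers;
    std::function<std::string(const std::string&)> origin;
    int hitLayer = -1;  // layer that served the last request (-1 = origin)

    TieredCache(int numLayers,
                std::function<std::string(const std::string&)> fetch)
        : layers(numLayers), origin(fetch) {}

    std::string get(const std::string& key) {
        for (int i = 0; i < (int)layers.size(); i++) {
            auto it = layers[i].find(key);
            if (it != layers[i].end()) { hitLayer = i; return it->second; }
        }
        // Total miss: fetch from origin and fill all layers.
        hitLayer = -1;
        std::string value = origin(key);
        for (auto& layer : layers) layer[key] = value;
        return value;
    }
};
```

The first request for a key walks all the way to the origin; every later request is served by the earliest layer, which is exactly the goal stated above.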

🎓 Hidden Costs Throughout the Stack

The Part 1 time-space trade-off, applied to caching:

Every caching algorithm trades space for speed:

LRU Cache:
β”œβ”€ Advertised: O(1) operations
β”œβ”€ Hidden cost: 3Γ—O(n) space
β”‚   β”œβ”€ Data: O(n)
β”‚   β”œβ”€ Hash map: O(n)
β”‚   └─ Linked list: O(n)
└─ Plus: Pointer overhead per node (~16 bytes)

LFU Cache:
β”œβ”€ Advertised: O(1) operations
β”œβ”€ Hidden cost: 4Γ—O(n) space
β”‚   β”œβ”€ Data: O(n)
β”‚   β”œβ”€ 3 hash maps: 3Γ—O(n)
└─ More complex = More bugs possible

Consistent Hashing:
β”œβ”€ Advertised: O(log n) lookup
β”œβ”€ Hidden cost: O(servers Γ— vnodes) space
β”‚   └─ 1000 servers Γ— 150 vnodes = 150k entries
└─ Trade-off: More vnodes = better distribution
              but more memory & slower lookup

The recursion stack lesson from Part 5:
"Complexity has VISIBLE costs (Big-O) and
 HIDDEN costs (memory overhead, bookkeeping)"

🚀 From These Patterns to 2026 Problems

LRU β†’ Edge Computing Caches

What you learned:              2026 application:
LRU for single machine    β†’    Distributed LRU across edge nodes

Evolution:
1. Local LRU cache         β†’   Global LRU with cache coherence
2. One eviction decision   β†’   Coordinated evictions
3. Single machine RAM      β†’   Edge locations worldwide
4. ms latency              β†’   <10ms global latency

Real example: Cloudflare Workers KV
β”œβ”€ LRU cache at 200+ edge locations
β”œβ”€ Content replicated based on access patterns
β”œβ”€ Automatic cache warming for popular content
└─ <50ms latency worldwide

Consistent Hashing β†’ Distributed Databases

What you learned:              2026 application:
CDN content routing       β†’    Database sharding (Cassandra, DynamoDB)

Evolution:
1. Route HTTP requests     β†’   Route database writes/reads
2. Minimize cache misses   β†’   Minimize cross-shard queries
3. Handle server failures  β†’   Handle node failures gracefully
4. Content distribution    β†’   Data distribution + replication

Real example: Amazon DynamoDB
β”œβ”€ Consistent hashing for partition keys
β”œβ”€ Automatic resharding as data grows
β”œβ”€ Handles millions of requests/sec
└─ Global tables across regions

💡 Practice Problems

Problem 1: Design YouTube's Video Caching

Requirements:
β”œβ”€ 2 billion videos
β”œβ”€ 100 million daily active users
β”œβ”€ Most views on <1% of videos (viral)
β”œβ”€ Video sizes: 10MB to 2GB
β”œβ”€ Global audience

Your algorithm must:
1. Cache popular videos at edge
2. Predict what will go viral
3. Handle flash crowds (sudden spikes)
4. Minimize bandwidth costs
5. Maximize cache hit rate

Hints:
β”œβ”€ Multi-tier cache (hot/warm/cold)
β”œβ”€ Predictive pre-caching (ML-based)
β”œβ”€ Adaptive TTL based on popularity
└─ Geographic popularity patterns

Problem 2: Design Discord's Message Cache

Requirements:
β”œβ”€ 150 million users
β”œβ”€ Millions of servers/channels
β”œβ”€ Need: last N messages per channel
β”œβ”€ Real-time delivery (<100ms)
β”œβ”€ Minimize database queries

Your algorithm must:
1. Cache recent messages per channel
2. Handle both active and dormant channels
3. Evict intelligently (not all channels equal)
4. Support message edits/deletes
5. Minimize memory per user

Hints:
β”œβ”€ Per-channel LRU with size limits
β”œβ”€ Activity-based eviction
β”œβ”€ Sliding window for recent messages
└─ Write-through cache for consistency

Problem 3: Design AWS Lambda's Code Cache

Requirements:
β”œβ”€ Serverless functions (no persistent servers)
β”œβ”€ Cold start problem (loading code is slow)
β”œβ”€ Millions of different functions
β”œβ”€ Unpredictable access patterns
β”œβ”€ Balance: keep warm vs reclaim memory

Your algorithm must:
1. Predict which functions to keep warm
2. Decide eviction under memory pressure
3. Handle burst traffic
4. Minimize cold starts
5. Maximize resource utilization

Hints:
β”œβ”€ Predictive warming (time patterns)
β”œβ”€ Adaptive keep-alive timers
β”œβ”€ LFU + recency for eviction
└─ Per-customer quotas

🎯 Key Takeaways

1. CACHING = TIME-SPACE TRADE-OFF (Part 1 callback!)
   Pay memory cost β†’ Get speed benefit

2. HIDDEN COSTS EVERYWHERE (Part 5 lesson!)
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚ LRU: "O(1)" = 3Γ—O(n) space      β”‚
   β”‚ LFU: "O(1)" = 4Γ—O(n) space      β”‚
   β”‚ Consistent Hash: O(sΓ—v) space   β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

3. NO PERFECT CACHING ALGORITHM
   β”œβ”€ LRU: Good for recency, bad for frequency
   β”œβ”€ LFU: Good for frequency, bad for bursts
   β”œβ”€ Hybrid: More complex, more overhead
   └─ Choose based on access patterns!

4. DISTRIBUTION IS HARD
   Consistent hashing solves:
   β”œβ”€ Minimal remapping on changes
   β”œβ”€ Load balancing
   β”œβ”€ Fault tolerance
   └─ But: O(log n) lookups, O(sΓ—v) space

5. 2026 = MULTI-LAYER CACHING
   Modern systems use ALL of these:
   β”œβ”€ Browser cache (client)
   β”œβ”€ CDN edge (global)
   β”œβ”€ Application cache (Redis)
   β”œβ”€ Database cache (buffer pool)
   └─ Each layer has different algorithm!

πŸ—ΊοΈ Your Journey Progress

Where you are now:
βœ“ Time/space trade-offs (Part 1)
βœ“ Algorithm design (Part 2)
βœ“ Graph algorithms (Part 3)
βœ“ Production systems (Part 4)
βœ“ Database internals (Part 5)
βœ“ Caching & CDN (Part 6) ← YOU ARE HERE

Your expanding toolkit:
β”œβ”€ Can analyze hidden costs (recursion stack, memory overhead)
β”œβ”€ Understand why systems are fast (caching layers)
β”œβ”€ Can design eviction policies
β”œβ”€ Know distribution algorithms
└─ See the full stack (browser β†’ CDN β†’ app β†’ database)

Next steps:
β–‘ Part 7: Real-time streaming algorithms
β–‘ Part 8: AI/ML algorithms (recommendations, LLMs)
β–‘ Part 9: Security & cryptography
β–‘ Part 10: Autonomous systems

💬 Your Turn

Build these yourself:

  1. Implement LRU cache with size limits (bytes, not just count)
  2. Add TTL (time-to-live) to cache entries
  3. Build consistent hashing with dynamic virtual nodes
  4. Simulate cache hit rates with different policies

Analyze your favorite websites:

  • Open DevTools β†’ Network tab
  • Which resources are cached?
  • What are the cache headers?
  • Can you see the CDN at work?

Share your findings! What's your cache hit rate? 📊


The fastest code is the code that never runs. The fastest data is the data already in memory. Master caching, and you master performance. 🚀✨


🎯 Coming Up Next: Part 7

Real-Time Streaming & Event Processing Algorithms

From cached data to live data streams:
β”œβ”€ How Twitter processes trending topics in real-time
β”œβ”€ Sliding window algorithms
β”œβ”€ Stream joins and aggregations
β”œβ”€ Handling millions of events/second

Same principles, infinite data!

Stay tuned! 📡
