
CodeWithDhanian

Caching Strategies (CDN, Redis, Memcached) in System Design

In the complex world of system design, caching stands as one of the most powerful techniques to achieve high performance, low latency, and seamless scalability. Caching strategies involve intelligently storing frequently accessed data in fast-access layers, dramatically reducing the load on primary data sources such as databases or origin servers. This comprehensive exploration focuses on three cornerstone implementations: Content Delivery Networks (CDN) for global content distribution, Redis as a versatile in-memory data structure store, and Memcached as a lightweight distributed key-value cache. Each is examined with architectural depth, real-world technical structures, complete code implementations, and detailed explanations to ensure clarity for beginners and advanced practitioners alike.

The Core Principles of Caching in Distributed Systems

Caching operates on the principle of locality of reference: recently or frequently requested data is kept in a high-speed storage medium. In a distributed system, caching mitigates bottlenecks by serving responses from memory rather than from disk-based storage or remote services.

Two fundamental outcomes define every cache interaction: the cache hit and the cache miss. A cache hit occurs when the requested data exists in the cache, enabling sub-millisecond response times. A cache miss triggers a fetch from the slower backend source, followed by population of the cache for subsequent requests. Effective caching strategies balance hit ratios, eviction policies, and consistency guarantees to maximize efficiency.

Common caching patterns include cache-aside, where the application logic checks the cache before querying the database and writes results back on a miss; write-through, where every write updates both the cache and the persistent store synchronously; and write-behind, where writes are first committed to the cache and asynchronously propagated to the backend. These patterns form the foundation upon which CDN, Redis, and Memcached strategies are built.
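As a minimal, self-contained sketch of the first two patterns, using plain dictionaries to stand in for the cache and the database (names and data here are purely illustrative):

```python
# Toy stand-ins for a cache and a persistent store (illustrative only).
cache = {}
database = {"user:1": {"name": "Ada"}}

def read_cache_aside(key):
    """Cache-aside: check the cache first, fall back to the database on a miss."""
    if key in cache:
        return cache[key]            # cache hit
    value = database.get(key)        # cache miss: fetch from the source
    if value is not None:
        cache[key] = value           # populate the cache for subsequent requests
    return value

def write_through(key, value):
    """Write-through: update the persistent store and the cache together."""
    database[key] = value
    cache[key] = value

print(read_cache_aside("user:1"))   # miss, then the cache is populated
print(read_cache_aside("user:1"))   # hit, served from the cache
write_through("user:2", {"name": "Lin"})
```

Write-behind follows the same shape, except the write to the database is deferred to a background queue rather than performed synchronously.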

Content Delivery Networks as a Global Caching Layer

Content Delivery Networks (CDN) represent a geographically distributed caching strategy designed to deliver static and dynamic content with minimal latency to users worldwide. Unlike traditional server-side caches, a CDN consists of hundreds or thousands of edge servers positioned in Points of Presence (PoPs) across continents, each maintaining local copies of content from a central origin server.

The CDN architecture begins with DNS resolution that routes user requests to the nearest edge server based on geographic proximity and network health. If the content resides in the edge cache, it is served instantly, achieving a cache hit. On a cache miss, the edge server fetches the resource from the origin server, caches it according to defined rules, and delivers it to the user. Subsequent requests from the same or nearby regions benefit from the cached copy.

Key technical components of a CDN include:

  • Origin server: The authoritative source holding the master data.
  • Edge servers: Specialized nodes optimized for high-throughput delivery with dedicated storage for cached objects.
  • Cache invalidation mechanisms: Time-to-live (TTL) settings, explicit purges via API calls, or versioned URLs to ensure freshness.
  • Load distribution and failover: Automatic routing away from unhealthy edge servers.

CDNs excel at caching immutable assets such as images, videos, CSS, and JavaScript, while modern implementations support dynamic content through edge-side includes or serverless functions that execute logic closer to the user. This reduces bandwidth costs, shields the origin server from traffic spikes, and improves overall system resilience.
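The versioned-URL invalidation technique mentioned above can be sketched in a few lines: embedding a short content hash in the filename means the URL changes whenever the asset changes, so edge servers can cache each version with a very long TTL and never serve stale content. The CDN domain and paths below are placeholders:

```python
import hashlib

def versioned_asset_url(path, content, cdn_base="https://cdn.example.com"):
    """Build a cache-busting URL by embedding a short content hash in the filename.

    Because the URL changes whenever the content changes, edge caches can hold
    each version indefinitely without risking staleness.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    name, dot, ext = path.rpartition(".")
    return f"{cdn_base}/{name}.{digest}{dot}{ext}"

css = b"body { color: #222; }"
print(versioned_asset_url("styles/app.css", css))
# e.g. https://cdn.example.com/styles/app.<hash>.css
```

Deploying a new build then simply emits new URLs; the old objects age out of the edge caches on their own.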

Redis as a High-Performance, Feature-Rich Caching Solution

Redis functions as an in-memory data structure store that serves simultaneously as a cache, database, and message broker. Its caching strategy shines in scenarios requiring rich data types, persistence options, and advanced clustering for massive scale.

The internal Redis architecture relies on an event-driven, mostly single-threaded design (with I/O threading in recent versions) that achieves millions of operations per second. Data resides entirely in RAM for speed, with configurable persistence through RDB snapshots (periodic binary dumps) or AOF logs (an append-only file of every write command). In production deployments, Redis operates in cluster mode, partitioning the keyspace into 16384 hash slots (CRC16 of the key, modulo 16384) that are distributed across nodes for automatic sharding and replication.

Redis supports multiple eviction policies, including LRU (least recently used), LFU (least frequently used), and random, controlled by the maxmemory configuration directive. For caching strategies, developers typically implement cache-aside or write-through patterns, leveraging features like pipelining for batch operations and Lua scripting for atomic complex logic.
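These eviction policies are configured in redis.conf (or at runtime via CONFIG SET); a typical cache-oriented configuration looks like this:

```conf
# Cap Redis memory usage; once the limit is reached, keys are evicted
maxmemory 2gb

# Evict the approximated least-recently-used key across the whole keyspace
maxmemory-policy allkeys-lru

# Sample size for the approximated LRU/LFU algorithms (higher = more accurate)
maxmemory-samples 10
```

With allkeys-lru, Redis behaves like a pure cache; volatile-lru restricts eviction to keys that already carry a TTL, which suits mixed cache-plus-store deployments.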

Here is a complete, production-ready Python implementation using the official redis-py client library to demonstrate a robust caching strategy in a web application context. This example shows a cache-aside pattern with TTL, error handling, and connection pooling:

import redis
import json
import time
from functools import wraps

# Establish a connection pool for high-concurrency environments
redis_pool = redis.ConnectionPool(
    host='localhost',
    port=6379,
    db=0,
    max_connections=100,
    decode_responses=True
)

redis_client = redis.Redis(connection_pool=redis_pool)

def cache_with_ttl(ttl_seconds=300):
    """Decorator implementing cache-aside pattern with TTL."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Generate a deterministic cache key from the function name and arguments
            key_parts = [func.__name__, *map(str, args)]
            key_parts += [f"{k}={v}" for k, v in sorted(kwargs.items())]
            cache_key = "cache:" + ":".join(key_parts)

            # Attempt cache hit
            cached_data = redis_client.get(cache_key)
            if cached_data:
                return json.loads(cached_data)  # Cache hit - return immediately

            # Cache miss - execute original function
            result = func(*args, **kwargs)

            # Store in cache with expiration
            redis_client.setex(
                cache_key,
                ttl_seconds,
                json.dumps(result)
            )
            return result
        return wrapper
    return decorator

# Example usage in a service layer
@cache_with_ttl(ttl_seconds=600)
def get_user_profile(user_id):
    # Simulate database fetch (replace with actual ORM query)
    print(f"Fetching from database for user {user_id}")
    time.sleep(0.1)  # Simulate latency
    return {
        "user_id": user_id,
        "name": "John Doe",
        "email": "john@example.com",
        "preferences": {"theme": "dark"}
    }

# Usage
profile = get_user_profile(12345)
print(profile)

This code establishes a connection pool for thread safety under load, generates composite cache keys to avoid collisions, and enforces automatic expiration. On subsequent calls, a cache hit bypasses the database entirely, cutting response times from hundreds of milliseconds to well under a millisecond. In a full Redis Cluster, a cluster-aware client handles slot redirection transparently.

For high availability, configure Redis Sentinel for automatic failover or Redis Cluster with replica nodes. The structure ensures data remains consistent across shards while supporting horizontal scaling by adding nodes dynamically.
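A minimal Sentinel configuration illustrates the failover setup (the master name, address, and thresholds below are placeholders to adapt to your deployment):

```conf
# sentinel.conf - monitor a master named "mymaster"; a quorum of 2 sentinels
# must agree the master is unreachable before a failover is triggered
sentinel monitor mymaster 10.0.0.5 6379 2

# Consider the master subjectively down after 5 seconds without a valid reply
sentinel down-after-milliseconds mymaster 5000

# Allow up to 60 seconds for a failover to complete
sentinel failover-timeout mymaster 60000

# Resynchronize replicas with the new master one at a time
sentinel parallel-syncs mymaster 1
```

Clients then connect through the Sentinel group (for example with redis-py's Sentinel support) rather than to a fixed master address, so a promotion is picked up automatically.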

Memcached as a Lightweight, High-Speed Distributed Cache

Memcached is a simple yet extremely fast distributed memory caching system optimized for key-value storage. Its caching strategy prioritizes raw speed and simplicity, making it ideal for scenarios where only basic object storage is required without persistence or advanced data structures.

The Memcached architecture uses a multi-threaded server process with a slab allocator for efficient memory management. Memory is divided into pages assigned to slab classes, each subdivided into fixed-size chunks of a particular size, which minimizes fragmentation. Client libraries hash keys across multiple Memcached instances, typically with consistent hashing, allowing seamless horizontal scaling: servers can be added without remapping most of the existing keys.
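The consistent-hashing idea behind that key distribution can be sketched in a few lines. This is a deliberately simplified ring without virtual nodes or weights; real client libraries such as the HashClient used below are considerably more refined:

```python
import bisect
import hashlib

class ToyHashRing:
    """Minimal consistent-hash ring: each server owns an arc of the hash space,
    so adding or removing a server only remaps the keys on its own arc."""

    def __init__(self, servers):
        self._ring = sorted((self._hash(s), s) for s in servers)
        self._points = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def server_for(self, key):
        # Walk clockwise to the first server point at or after the key's hash
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ToyHashRing(["memcached1:11211", "memcached2:11211", "memcached3:11211"])
print(ring.server_for("cache:user:42"))
```

With naive modulo hashing (hash(key) % N), changing N remaps almost every key; with the ring, only the keys on the affected arc move, which is exactly why Memcached farms can grow by simply adding servers.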

Unlike Redis, Memcached is non-persistent by default and focuses exclusively on ephemeral caching. It supports only string values (though objects can be serialized), basic operations (get, set, delete, incr, decr), and an LRU eviction policy managed per slab. Memcached clusters are typically fronted by a client library that implements consistent hashing for distribution.

Below is a complete Python implementation using the pymemcache library, demonstrating a full cache-aside strategy with connection pooling and error resilience:

from pymemcache.client.hash import HashClient
import json
import time
from functools import wraps

# Configure a distributed Memcached cluster with consistent hashing
memcached_servers = [
    ('memcached1.example.com', 11211),
    ('memcached2.example.com', 11211),
    ('memcached3.example.com', 11211)
]

memcached_client = HashClient(
    memcached_servers,
    connect_timeout=1.0,
    timeout=1.0,
    no_delay=True,
    ignore_exc=True,  # Continue on individual server failures
    retry_attempts=2
)

def memcached_cache(ttl_seconds=300):
    """Decorator for Memcached cache-aside pattern."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Create a composite key (Memcached keys must be under 250 bytes)
            key_parts = [func.__name__, *map(str, args)]
            key_parts += [f"{k}={v}" for k, v in sorted(kwargs.items())]
            cache_key = ("cache:" + ":".join(key_parts))[:250]

            # Attempt cache hit
            try:
                cached_data = memcached_client.get(cache_key)
                if cached_data:
                    return json.loads(cached_data.decode('utf-8'))  # Cache hit
            except Exception:
                pass  # Fall back to source on any cache error

            # Cache miss - fetch from source
            result = func(*args, **kwargs)

            # Store in cache with TTL
            try:
                memcached_client.set(
                    cache_key,
                    json.dumps(result).encode('utf-8'),
                    expire=ttl_seconds
                )
            except Exception:
                pass  # Non-blocking failure

            return result
        return wrapper
    return decorator

# Example usage
@memcached_cache(ttl_seconds=600)
def get_product_details(product_id):
    # Simulate database query
    print(f"Fetching product {product_id} from database")
    time.sleep(0.08)  # Simulate latency
    return {
        "product_id": product_id,
        "name": "Wireless Headphones",
        "price": 129.99,
        "stock": 450
    }

# Usage
details = get_product_details(9876)
print(details)

This implementation distributes keys across the Memcached farm using HashClient, ensuring even load. The ignore_exc and retry logic provides resilience if individual nodes fail. Serialization with JSON allows storage of complex objects while keeping operations blazing fast. Scaling is achieved simply by adding more servers to the list; the client automatically rebalances.

Integrating Caching Strategies into System Architecture

When designing large-scale systems, CDN, Redis, and Memcached are often layered together. A typical flow routes static assets through a CDN for global delivery, while dynamic application data leverages Redis for complex queries and Memcached for high-volume, simple key-value lookups. Consistency is maintained through careful TTL tuning, background invalidation jobs, and occasional cache warming scripts that pre-populate popular keys during low-traffic periods.
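Cache warming, for instance, can be as simple as a script that iterates over the most popular keys and pre-populates the cache before peak traffic. The sketch below uses an in-memory dict standing in for Redis or Memcached, and the key names are hypothetical:

```python
import time

# Stand-ins: a slow data source and a fast cache (dict instead of Redis/Memcached)
def fetch_from_database(key):
    time.sleep(0.01)  # simulate query latency
    return {"key": key, "loaded_at": time.time()}

cache = {}

def warm_cache(popular_keys):
    """Pre-populate the cache so the first real users get hits, not misses."""
    for key in popular_keys:
        if key not in cache:       # skip keys that are already warm
            cache[key] = fetch_from_database(key)
    return len(cache)

# Run during a low-traffic window, e.g. from a cron job or deployment hook
warmed = warm_cache(["home:feed", "product:top10", "config:flags"])
print(f"warmed {warmed} keys")
```

In a real deployment the popular-key list would come from access logs or analytics, and the writes would go through the same serialization path as the application's cache-aside code so the warmed entries are byte-identical to organically cached ones.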

Eviction policies, memory limits, and monitoring of cache hit ratios (ideally above 90 percent) are critical operational concerns. Tools like Prometheus can track metrics such as evictions, bytes used, and command latency across all three technologies.
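The hit ratio itself is straightforward to derive; Redis, for example, reports keyspace_hits and keyspace_misses in its INFO stats output (the counter values below are made up for illustration):

```python
def hit_ratio(keyspace_hits, keyspace_misses):
    """Fraction of lookups served from the cache; 0.0 when there is no traffic."""
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else 0.0

# Sample counters as reported by `INFO stats` on a Redis node (illustrative)
stats = {"keyspace_hits": 9_450_000, "keyspace_misses": 550_000}

ratio = hit_ratio(stats["keyspace_hits"], stats["keyspace_misses"])
print(f"hit ratio: {ratio:.1%}")  # 94.5% - above the 90 percent target
```

Exporters typically compute this as a rate over a sliding window rather than over lifetime counters, so a sudden drop in the ratio surfaces quickly as an alert.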

This deep integration of caching strategies transforms systems from database-bound to cache-accelerated architectures capable of handling millions of requests per second with consistent sub-100-millisecond response times.


System Design Handbook

For a comprehensive guide covering all aspects of system design including these advanced caching strategies and many more essential concepts, purchase the System Design Handbook at https://codewithdhanian.gumroad.com/l/ntmcf. This resource will provide you with the structured knowledge and practical examples needed to master professional system design.
