丁久

Posted on May 18 • Originally published at dingjiu1989-hue.github.io

Caching Strategies and Patterns in Distributed Systems

#systemdesign #architecture

This article was originally published on AI Study Room. For the full version with working code examples and related articles, visit the original post.

Caching is one of the most effective performance optimizations in distributed systems. A well-designed cache reduces database load, lowers response latency, and raises throughput. This article covers the major caching patterns, eviction policies, distributed caching with Redis, CDN caching, and cache invalidation, famously one of the hard problems in computer science.

Caching Patterns

Cache-Aside (Lazy Loading)

Cache-aside is the most common caching pattern. The application checks the cache first. On a cache miss, it reads from the database and populates the cache.

    class CacheAside:
        def __init__(self, cache, database):
            self.cache = cache
            self.database = database

        def get_user(self, user_id):
            # 1. Try cache first
            cached = self.cache.get(f"user:{user_id}")
            if cached is not None:
                return cached

            # 2. Cache miss: read from database
            user = self.database.query("SELECT * FROM users WHERE id = ?", user_id)
            if user:
                # 3. Populate cache for next time
                self.cache.set(f"user:{user_id}", user, ttl=3600)
            return user

        def update_user(self, user_id, data):
            # 1. Update database
            self.database.execute("UPDATE users SET name = ? WHERE id = ?",
                                  data['name'], user_id)

            # 2. Invalidate cache (not update!)
            self.cache.delete(f"user:{user_id}")
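The listing above assumes a cache object that exposes get, set with a ttl argument, and delete. As a rough illustration of those semantics, here is a minimal in-memory stand-in. The class name InMemoryTTLCache and the injectable clock parameter are assumptions made for this sketch, not part of the original article or of any real cache client.

```python
import time


class InMemoryTTLCache:
    """Hypothetical in-memory cache with per-entry expiry (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None          # miss: key was never set or was deleted
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # lazily drop the expired entry
            return None
        return value

    def set(self, key, value, ttl):
        # ttl is in seconds, matching ttl=3600 in the listing above
        self._entries[key] = (value, self._clock() + ttl)

    def delete(self, key):
        self._entries.pop(key, None)   # deleting a missing key is a no-op
```

An object like this could be passed as the cache argument when experimenting with the pattern locally. A production system would more likely use a shared cache such as Redis.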

Advantages:

Only caches data that is actually requested (no wasted space).
Simple to implement and understand.
Cache failures are not fatal (system falls back to database).

Disadvantages:

Cache miss penalty includes both cache check and database read.
Stale data until TTL expires (if items are not invalidated on update).
Thundering herd problem on cache miss for popular items.
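One common way to soften the thundering herd problem is to let only one caller per key load from the database while the others wait for its result. The sketch below illustrates that idea with per-key locks inside a single process. The SingleFlightCache name, the loader callable, and the dict used as a cache are all assumptions for illustration. Across several application servers, a distributed lock or request coalescing layer would be needed instead.

```python
import threading
import time
from collections import defaultdict


class SingleFlightCache:
    """Cache-aside reads where only one thread per key calls the loader."""

    def __init__(self, loader):
        self._loader = loader                          # hypothetical callable: key -> value
        self._data = {}                                # in-memory stand-in for a real cache
        self._key_locks = defaultdict(threading.Lock)  # one lock per key
        self._guard = threading.Lock()                 # protects _key_locks itself

    def get(self, key):
        value = self._data.get(key)
        if value is not None:
            return value                               # fast path: cache hit

        with self._guard:
            lock = self._key_locks[key]

        with lock:
            # Re-check: another thread may have filled the entry while we waited.
            value = self._data.get(key)
            if value is None:
                value = self._loader(key)              # only one thread reaches this per miss
                self._data[key] = value
            return value
```

With ten threads requesting the same cold key at once, the loader should run once and the remaining nine callers should reuse its result.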

Write-Through

A write-through cache updates the cache synchronously, as part of the same operation that writes to the database.

    class WriteThrough:
        def __init__(self, cache, database):
            self.cache = cache
            self.database = database

        def update_user(self, user_id, data):
            # 1. Update database
            self.database.execute("UPDATE users SET name = ? WHERE id = ?",
                                  data['name'], user_id)

            # 2. Update cache synchronously
            user = self.database.query("SELECT * FROM users WHERE id = ?", user_id)
            self.cache.set(f"user:{user_id}", user, ttl=3600)

Advantages:

The cache is refreshed on every write, so reads of recently written data are not stale.
Reads of recently written data hit the cache, avoiding the cache miss penalty.
Read path is simple (always from cache or cache-miss-then-database).

Disadvantages:

Writes are slower (must update both database and cache).
Caches data that may never be read (cache pollution).
Cache and database updates are not atomic (risk of inconsistency).
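Because the two updates are not atomic, the cache write can fail after the database write has succeeded. One possible mitigation, sketched below, is to fall back to invalidating the key so readers miss and reload instead of seeing the old value. DictCache, its fail_sets flag, the write_through function, and the dict used as a database are hypothetical stand-ins for illustration only.

```python
class DictCache:
    """In-memory stand-in for a cache client; set() can be made to fail."""

    def __init__(self):
        self.data = {}
        self.fail_sets = False   # flip to True to simulate a failing cache write

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        if self.fail_sets:
            raise ConnectionError("simulated cache outage")
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def write_through(cache, db, user_id, name):
    """Write to the database first, then refresh the cache entry."""
    key = f"user:{user_id}"
    db[user_id] = {"id": user_id, "name": name}   # the database is the source of truth
    try:
        cache.set(key, dict(db[user_id]))
    except ConnectionError:
        # The database write succeeded but the cache update did not:
        # drop the entry so the next read misses and reloads fresh data.
        cache.delete(key)
```

In a real outage the delete may fail as well, so a TTL on every entry remains a useful last line of defense against stale data.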

Write-Behind (Write-Back)

A write-behind cache writes to the cache immediately and updates the database asynchronously, typically in batches.

    import asyncio

    class WriteBehind:
        def __init__(self, cache, database):
            self.cache = cache
            self.database = database
            self.write_queue = asyncio.Queue()
            self._start_flusher()

        def _start_flusher(self):
            """Background task that flushes writes to database."""
            async def flusher():
                while True:
                    # Batch writes and flush periodically
                    batch = []
                    for _ in range(100):  # Batch size
                        try:
                            item = await asyncio.wait_for(
                                self.write_queue.get(), timeout=1.0
                            )
                            batch.append(item)
                        except asyncio.TimeoutError:
                            break
                    if batch:
                        self._flush_to_database(batch)

            asyncio.cre
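The listing above breaks off in this copy. As a rough guide to how the pieces might fit together, here is a self-contained sketch of the same batching idea. It is not the author's full implementation: the WriteBehindSketch name, the start and stop methods, the update_user signature, and the plain dicts standing in for the cache and database are all assumptions.

```python
import asyncio


class WriteBehindSketch:
    """Write to the cache at once; flush to the database in background batches."""

    def __init__(self, cache, database, batch_size=100, flush_interval=1.0):
        self.cache = cache                  # dict standing in for a cache client
        self.database = database            # dict standing in for a database
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.write_queue = asyncio.Queue()
        self._task = None

    def start(self):
        # Must be called from inside a running event loop.
        self._task = asyncio.create_task(self._flusher())

    async def update_user(self, user_id, data):
        self.cache[f"user:{user_id}"] = data          # visible to readers immediately
        await self.write_queue.put((user_id, data))   # persisted later by the flusher

    async def _flusher(self):
        while True:
            batch = []
            for _ in range(self.batch_size):
                try:
                    item = await asyncio.wait_for(
                        self.write_queue.get(), timeout=self.flush_interval
                    )
                    batch.append(item)
                except asyncio.TimeoutError:
                    break                             # queue went quiet: flush what we have
            if batch:
                self._flush_to_database(batch)
                for _ in batch:
                    self.write_queue.task_done()

    def _flush_to_database(self, batch):
        for user_id, data in batch:
            self.database[user_id] = data

    async def stop(self):
        await self.write_queue.join()                 # wait until queued writes are flushed
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
```

The trade-off of this pattern is that writes still sitting in the queue are lost if the process crashes before a flush, which is why the sketch drains the queue in stop before cancelling the background task.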

Read the full article on AI Study Room for complete code examples, comparison tables, and related resources.
