Caching is one of the simplest and most powerful ways to improve system performance. The most common approach is a TTL (Time to Live), where a cached entry expires after a fixed time. But in the real world, this can cause traffic spikes.
Let's understand the problem and the ways to prevent it:
Imagine the cache expires after 5 minutes.
10,000 users try to make a request at the same time the cache expires.
So all the requests go directly to the database, and the database slows down.
This is called the Thundering Herd problem, which can cause:
- CPU spikes
- Database connection spikes
- Blocked application threads, leading to outages
Cache with TTL works well for simple systems with low traffic, but not for high-traffic systems. So let’s explore different approaches:
1) TTL Jitter (adding randomness)
Instead of setting the TTL to exactly 60 minutes, set it to 60 + random(0, 60).
Cache entries then expire at different times, which spreads out the load and reduces traffic spikes.
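A minimal sketch of TTL jitter in Python, using the numbers from the example above (I'm assuming the units are minutes and converting to seconds, which is what most cache clients expect):

```python
import random

BASE_TTL_MINUTES = 60    # the fixed TTL from the example above
MAX_JITTER_MINUTES = 60  # the random(0, 60) part

def ttl_with_jitter() -> int:
    """Return a TTL in seconds: 60 minutes plus up to 60 random minutes."""
    jitter = random.randint(0, MAX_JITTER_MINUTES)
    return (BASE_TTL_MINUTES + jitter) * 60
```

With this, 10,000 keys written at the same moment expire spread across an hour instead of all at once.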
2) Mutex
When the cache expires and 1,000 users send requests at once, all of them would normally go to the database.
Instead:
Only one request acquires a lock and refreshes the cache from the database while the others wait. This reduces the load to a single query.
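A minimal single-process sketch of this idea using a `threading.Lock` (here `query_database` is a hypothetical stand-in for the real DB call; a distributed system would use something like a Redis lock instead):

```python
import threading

cache = {}
db_lock = threading.Lock()

def query_database(key):
    # placeholder for the real, expensive DB query (assumption)
    return f"value-for-{key}"

def get(key):
    value = cache.get(key)
    if value is not None:
        return value  # cache hit: no lock needed
    with db_lock:
        # double-check: another request may have refreshed the
        # cache while we were waiting for the lock
        value = cache.get(key)
        if value is None:
            value = query_database(key)
            cache[key] = value
    return value
```

The double-check inside the lock is what makes this work: the waiting requests re-read the cache instead of repeating the query.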
3) Stale-While-Revalidate
Instead of blocking users, serve old data and refresh the cache in the background.
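A rough Python sketch of stale-while-revalidate (again, `query_database` is a hypothetical stand-in, and the TTL is illustrative):

```python
import threading
import time

TTL_SECONDS = 300   # 5 minutes, matching the example above
cache = {}          # key -> (value, stored_at)
refreshing = set()  # keys currently being refreshed in the background
guard = threading.Lock()

def query_database(key):
    # placeholder for the real, slow DB query (assumption)
    return f"fresh-{key}"

def get(key):
    entry = cache.get(key)
    if entry is None:
        # cold miss: nothing to serve, so fetch synchronously once
        value = query_database(key)
        cache[key] = (value, time.time())
        return value
    value, stored_at = entry
    if time.time() - stored_at > TTL_SECONDS:
        # stale: serve the old value now, refresh in the background
        with guard:
            if key not in refreshing:
                refreshing.add(key)
                threading.Thread(target=_refresh, args=(key,), daemon=True).start()
    return value

def _refresh(key):
    try:
        cache[key] = (query_database(key), time.time())
    finally:
        with guard:
            refreshing.discard(key)
```

Users never wait on a refresh; at worst they briefly see slightly outdated data.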
4) Cache Pre-warming
Instead of waiting for traffic, load the cache before users arrive.
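A tiny sketch of pre-warming, run for example at deployment or on a schedule before a sale starts (`POPULAR_KEYS` and `query_database` are hypothetical placeholders):

```python
cache = {}

def query_database(key):
    # placeholder for the real, expensive DB query (assumption)
    return f"value-for-{key}"

# keys we expect heavy traffic on (hypothetical list)
POPULAR_KEYS = ["home:feed", "product:top-sellers", "match:live-score"]

def prewarm_cache():
    """Load hot keys into the cache before users arrive."""
    for key in POPULAR_KEYS:
        cache[key] = query_database(key)
```

The first wave of users then hits a warm cache instead of triggering thousands of cold misses at once.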
The above caching strategies are useful for:
- Large streaming platforms like Netflix
- E-commerce sales like Amazon
- Live sports events like IPL