Jessica Patel

Designing Cache Invalidation at Scale with Spring Boot, Redis, and AWS ElastiCache

TL;DR

  • How to design cache invalidation for multi‑region Spring Boot systems using Redis and AWS ElastiCache.
  • How to protect against cache stampedes with single‑flight, stale‑while‑revalidate, and probabilistic expiration.
  • How to use Redis Pub/Sub as a global invalidation bus and wire it into Spring Boot.

Who this is for: This article is for backend engineers running Spring Boot in production (often on AWS with Redis/ElastiCache) who are hitting scaling or consistency issues with naive caching.

Why Cache Invalidation Gets Hard at Scale

Cache invalidation is famously “one of the two hard things in computer science.” In a single‑node Spring Boot application, it is often treated as a solved problem: add @Cacheable, configure Redis, and move on. At scale, especially in multi‑region, high‑traffic systems, this approach breaks down quickly.

Caching improves latency and reduces database load, but it also introduces state duplication. Once data exists in multiple places—local memory, Redis, and multiple regions—keeping it consistent becomes non‑trivial.

Common failure modes include:

  • Stale reads after writes in another region
  • Cache stampedes overwhelming the database
  • Silent cache divergence between regions
  • “Fixes” involving global cache flushes that cause outages

The goal is not perfect consistency, but controlled, observable, and bounded inconsistency.

Reference Architecture: Multi-Region Spring Boot Caching

A typical large‑scale deployment looks like this:

  • Clients routed to the nearest region
  • Spring Boot services deployed per region
  • Each region has: local in‑memory cache (for example, Caffeine) and a regional Redis cluster (AWS ElastiCache)
  • A shared primary database (or active‑active replicas)

This creates three cache layers:

  1. JVM‑local cache (fastest, most fragile)
  2. Regional Redis cache
  3. Source‑of‑truth database

Spring’s cache abstraction is unaware of regions, replication lag, or distributed invalidation. That logic must be designed explicitly.

Cache Invalidation Strategies That Actually Work

Before implementation, it is critical to choose the right strategy.

Cache-Aside (Recommended)

  • Application reads from cache
  • On miss, loads from DB and populates cache
  • On write, updates DB first, then invalidates cache

This provides clear control over invalidation and failure handling.

TTL-Based Expiration (Necessary but Insufficient)
TTL limits staleness but:

  • Does not prevent serving stale data immediately after writes
  • Can cause synchronized expirations (stampedes) under load

TTL must be combined with explicit invalidation. If you rely only on short TTLs for correctness, you are already at risk of cache stampedes.

Versioned Keys
Appending a version to cache keys allows mass invalidation without deletes. This works well for schema changes, but less so for fine‑grained updates.

Understanding Spring Boot Cache Internals

Spring Cache provides:

  • Method interception
  • Key generation
  • Cache abstraction over multiple providers

What it does not provide:

  • Cross‑instance invalidation
  • Distributed locking
  • Cache coherency across regions

Annotations like @CacheEvict only evict locally configured caches, not remote JVMs or regions.

Designing Cache Keys for Global Safety

Cache keys must be:

  • Deterministic
  • Namespaced
  • Versioned

A robust key structure looks like:
{service}:{entity}:{tenant}:{id}:v{schemaVersion}

Key versioning allows zero‑downtime changes and prevents collisions during deployments.
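As a sketch, the key layout above can be enforced with a small helper. The class and method names here are hypothetical; the point is that key construction is centralized, deterministic, and rejects inputs that could collide across namespaces:

```java
// Hypothetical helper enforcing {service}:{entity}:{tenant}:{id}:v{schemaVersion}
public final class CacheKeys {

    private CacheKeys() {}

    // Builds a deterministic, namespaced, versioned cache key.
    // Parts are validated so a malformed input cannot leak into
    // another namespace via an embedded delimiter.
    public static String key(String service, String entity,
                             String tenant, String id, int schemaVersion) {
        for (String part : new String[] {service, entity, tenant, id}) {
            if (part == null || part.isEmpty() || part.contains(":")) {
                throw new IllegalArgumentException("Invalid key part: " + part);
            }
        }
        return service + ":" + entity + ":" + tenant + ":" + id + ":v" + schemaVersion;
    }
}
```

Bumping `schemaVersion` in one place invalidates every key of that entity type on the next read, without issuing any deletes.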

Multi-Region Consistency Models

Strong consistency across regions requires synchronous coordination, which increases latency and reduces availability. Most systems choose eventual consistency with guardrails:

  • Writes invalidate caches asynchronously
  • Reads may see stale data briefly
  • Business logic defines acceptable staleness windows

Trying to enforce global “read‑your‑writes” usually causes more harm than benefit for typical web workloads.

Cache Stampede: The Hidden Scaling Killer

A cache stampede occurs when:

  1. A popular key expires or is invalidated
  2. Thousands of concurrent requests miss the cache
  3. All requests hit the database simultaneously

This can cascade into:

  • Database overload
  • Thread pool exhaustion
  • Region‑wide outages

TTL alone makes this problem worse by synchronizing expirations. For example, a 10k QPS endpoint with a 60‑second TTL can easily send thousands of requests to the database in a single second if a hot key expires everywhere at once.

Stampede Protection Techniques

These patterns are most useful on read‑heavy, hot paths. For low‑traffic or write‑heavy entities, they may be unnecessary complexity.

Probabilistic Early Expiration
Instead of expiring keys at a fixed time:

  • Add jitter to TTLs
  • Allow early refresh based on probability

This spreads refresh load over time. Avoid using complex probability logic on very low‑traffic keys; the added code rarely pays off there.
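Both ideas can be sketched in a few lines. This is illustrative only — the parameter names (`jitterFraction`, `deltaSeconds`, `beta`) are assumptions, and the early-refresh check follows the XFetch-style idea that the closer a key is to expiry, the more likely a single request volunteers to refresh it:

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch: jittered TTLs plus probabilistic early refresh.
public final class ExpirationPolicy {

    private ExpirationPolicy() {}

    // Spread expirations by adding random noise of +/- jitterFraction
    // to the base TTL (jitterFraction must be > 0).
    public static long jitteredTtlSeconds(long baseTtlSeconds, double jitterFraction) {
        double noise = ThreadLocalRandom.current().nextDouble(-jitterFraction, jitterFraction);
        return Math.max(1L, Math.round(baseTtlSeconds * (1.0 + noise)));
    }

    // XFetch-style check: deltaSeconds is how long the last recompute took;
    // beta > 1 refreshes more eagerly. Returns true when this request
    // should refresh the key before it actually expires.
    public static boolean shouldRefreshEarly(long ttlRemainingSeconds,
                                             double deltaSeconds, double beta) {
        // random in (0, 1], so -log(random) is in [0, +inf)
        double random = 1.0 - ThreadLocalRandom.current().nextDouble();
        return deltaSeconds * beta * -Math.log(random) >= ttlRemainingSeconds;
    }
}
```

With a 60-second base TTL and 10% jitter, hot keys cached at the same moment expire across a 12-second window instead of a single instant.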

Request Coalescing (Single-Flight)
Only one request per key should rebuild the cache:

  • Use a Redis‑based lock per cache key
  • One instance becomes the “leader”
  • Others wait briefly or serve stale data

Locks must:

  • Have timeouts
  • Be fail‑safe
  • Never block indefinitely

This is powerful on extremely hot keys, but do not overuse per‑key locks on cold data—it adds operational complexity for little benefit.

Stale-While-Revalidate
Serve stale data while refreshing in the background:

  • Improves availability
  • Prevents user‑facing latency spikes
  • Requires explicit correctness checks

This pattern is extremely effective in read‑heavy systems where slightly stale data is acceptable, but it is a poor fit for financial or strongly consistent domains.
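The serving decision behind stale-while-revalidate can be sketched as a small freshness classifier. The window names (`freshFor`, `staleFor`) and the three-state model are illustrative, not from any particular library:

```java
import java.time.Duration;
import java.time.Instant;

// Sketch: classify a cached value as fresh, stale-but-servable, or expired.
public final class Freshness {

    public enum State { FRESH, STALE_SERVABLE, EXPIRED }

    private Freshness() {}

    public static State classify(Instant storedAt, Instant now,
                                 Duration freshFor, Duration staleFor) {
        Duration age = Duration.between(storedAt, now);
        if (age.compareTo(freshFor) <= 0) {
            return State.FRESH;              // serve directly
        }
        if (age.compareTo(freshFor.plus(staleFor)) <= 0) {
            return State.STALE_SERVABLE;     // serve stale, refresh in background
        }
        return State.EXPIRED;                // must reload before serving
    }
}
```

`STALE_SERVABLE` is where the availability win comes from: the caller returns immediately while a background task (ideally single-flight) rebuilds the entry.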

A Single-Flight Cache Load (Pseudo-code)

Below is pseudo‑code for a single‑flight cache load flow; treat it as a sketch, not drop‑in production code.

// Single-flight cache load for key (pseudo-code)
CacheValue getOrLoad(String key) {

    CacheValue cached = localCache.get(key);
    if (cached != null) {
        return cached;
    }

    cached = redisCache.get(key);
    if (cached != null) {
        localCache.put(key, cached);
        return cached;
    }

    // Try acquiring a Redis-based lock for this key
    boolean lockAcquired = redisLock.tryLock(key, 5, TimeUnit.SECONDS);

    if (lockAcquired) {
        try {
            // Double-check Redis: another instance may have repopulated
            // the key while we were acquiring the lock
            cached = redisCache.get(key);
            if (cached == null) {
                cached = db.load(key);        // Load from DB
                redisCache.put(key, cached);  // Populate Redis
            }
            localCache.put(key, cached);      // Populate local cache
            return cached;
        } finally {
            redisLock.unlock(key);
        }
    } else {
        // Fallback: serve stale data if available, else wait briefly or fail
        return fetchStaleOrFail(key);
    }
}


Failures should degrade gracefully to DB reads, with clear metrics so you can see when you are falling back too often.

Redis as a Global Invalidation Bus

Polling for invalidation does not scale. Instead, use event‑driven invalidation.

Redis Pub/Sub is well‑suited for this purpose:

  • Low latency
  • Simple semantics
  • Native support in ElastiCache

However, Redis Pub/Sub:

  • Does not guarantee delivery
  • Does not persist messages

This is acceptable for invalidation if:

  • TTLs exist as a safety net
  • Invalidation messages are idempotent

For stricter guarantees or auditability, you may need something like Redis Streams or Kafka instead of bare Pub/Sub.

Implementing Pub/Sub Invalidation with AWS ElastiCache

Channel Design
Use namespaced channels such as: cache-invalidation:{service}

Message Payload
Messages should be small and structured:

  • Cache key or key pattern
  • Entity type
  • Version
  • Timestamp

Never include sensitive data in invalidation payloads.
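The fields above can be sketched as a small immutable payload. The record name and the delimiter-based wire format are illustrative assumptions — a real system would more likely use JSON — but the round-trip and validation logic is the part that matters:

```java
// Hypothetical invalidation payload; field and method names are illustrative.
public record InvalidationMessage(String key, String entityType,
                                  int version, long timestampEpochMillis) {

    // Compact, human-readable wire form.
    public String toWire() {
        return String.join("|", key, entityType,
                String.valueOf(version), String.valueOf(timestampEpochMillis));
    }

    // Parsing rejects malformed messages instead of evicting blindly.
    public static InvalidationMessage fromWire(String wire) {
        String[] parts = wire.split("\\|");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Malformed invalidation message: " + wire);
        }
        return new InvalidationMessage(parts[0], parts[1],
                Integer.parseInt(parts[2]), Long.parseLong(parts[3]));
    }
}
```

Because the message carries only a key, a type, a version, and a timestamp, replaying it is harmless — eviction is idempotent, which is exactly what lossy Pub/Sub delivery requires.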

ElastiCache Considerations

  • Pub/Sub works across nodes within a cluster
  • Cross‑region invalidation requires application‑level forwarding, or regional producers publishing to all regions
  • Avoid synchronous cross‑region calls on the write path

These considerations keep your writes fast while still achieving eventual consistency across regions.

Wiring Pub/Sub into Spring Boot

In practice you will define a CacheManager (for example, using Caffeine + Redis) and a dedicated RedisTemplate bean. The code below is illustrative and focuses on the invalidation flow rather than exact configuration.

@Service
@RequiredArgsConstructor
public class UserService {

    private final UserRepository userRepository;
    private final RedisTemplate<String, String> redisTemplate;

    // Read-heavy operation, cache in both local and Redis layers
    @Cacheable(value = "users", key = "#id")
    public User getUser(Long id) {
        return userRepository.findById(id)
            .orElseThrow(() -> new EntityNotFoundException("User not found"));
    }

    // Write operation triggers explicit cache eviction
    @Transactional
    @CacheEvict(value = "users", key = "#user.id")
    public User updateUser(User user) {
        User updated = userRepository.save(user);

        // Publish invalidation message to Redis for multi-region propagation
        String channel = "cache-invalidation:users";
        String message = user.getId().toString();

        redisTemplate.convertAndSend(channel, message);

        return updated;
    }
}

@Component
@RequiredArgsConstructor
public class RedisCacheInvalidationListener {

    private final CacheManager cacheManager;

    // Illustrative annotation – configure according to your Redis listener setup
    @RedisListener(topic = "cache-invalidation:users")
    public void onMessage(String userId) {
        // Evict local caches (Caffeine or similar); getCache may return null
        var cache = cacheManager.getCache("users");
        if (cache != null) {
            cache.evict(Long.valueOf(userId));
        }

        // Optional: evict other Redis caches if needed
    }
}


Local caches (for example, Caffeine) must be explicitly cleared—Redis invalidation alone is insufficient.

End-to-End Invalidation Flow

Write Path

  1. API updates database → transaction commits (t0)
  2. Application evicts regional Redis cache immediately (t0+δ1)
  3. Publish invalidation message to Redis Pub/Sub (t0+δ2)
  4. All instances receive message and evict local caches asynchronously (t0+δ3 → t0+δ4)

Notes:

  • δ1 – δ4 represent small asynchronous delays; reads may see stale data briefly
  • Guarantees eventual consistency, not immediate consistency across regions

Read Path

  1. Check local cache first
  2. Check Redis cache if local miss
  3. On miss: acquire single‑flight lock → load from DB → populate caches → release lock

Failures at any step degrade gracefully to DB reads.

Observability: Knowing When Caching Is Failing

Without observability, cache bugs remain invisible. Key metrics:

  • Cache hit ratio (local vs Redis)
  • Stampede lock contention rate
  • Invalidation propagation latency
  • DB fallback rate

Logs should include:

  • Cache key
  • Region
  • Correlation ID

Distributed tracing can show sequences like “cache miss → DB spike → invalidation lag,” which helps you debug issues quickly.

Performance and Cost Trade-Offs

Caching is not free. Costs include:

  • Redis memory
  • Network traffic from Pub/Sub
  • Increased application complexity

Trade‑offs:

  • More aggressive caching reduces DB cost
  • Over‑caching increases Redis cost and invalidation load

Optimize based on measured behavior, not assumptions.

Security and Safety Considerations

Protect your invalidation mechanism:

  • Restrict Redis access via security groups
  • Validate message payloads
  • Guard against wildcard evictions
  • Add feature flags to disable invalidation logic during incidents

One malformed invalidation message can flush an entire region.
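The wildcard-eviction guard can be sketched as a simple predicate applied to every incoming invalidation message before eviction. The class name and the specific rules (no glob characters, minimum namespace depth) are illustrative assumptions:

```java
// Illustrative guard: reject invalidation keys that could evict
// more than a single, fully namespaced entry.
public final class InvalidationGuard {

    private InvalidationGuard() {}

    public static boolean isSafeKey(String key) {
        return key != null
                && !key.isEmpty()
                && !key.contains("*")               // no glob-style wildcards
                && !key.contains("?")
                && key.split(":").length >= 4;      // enforce namespaced structure
    }
}
```

A listener that drops (and logs) unsafe keys limits the blast radius of a buggy or malicious publisher to exactly one cache entry per message.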

Common Anti-Patterns

These patterns create fragility rather than correctness:

  • Global cache flushes in production
  • Short TTLs used as a consistency crutch
  • Synchronous invalidation across regions
  • Assuming Redis Pub/Sub is reliable messaging

Naive Caching vs Designed Invalidation

| Dimension | Naive caching at scale | Designed invalidation |
| --- | --- | --- |
| Correctness | High risk of stale reads | Event‑driven, eventual consistency enforced |
| Blast radius | Global cache flushes can wipe all regions | Targeted key eviction limits impact |
| Operational risk | High: outages and DB overload possible | Controlled, observable, safe recovery |
| Cost | Low caching ops but high DB/incident cost | Slightly higher caching/invalidation ops, lower DB load |
| Complexity | Low to implement | Medium: needs locks, Pub/Sub, and monitoring |

Designing invalidation intentionally turns caching from a constant source of outages into a predictable, observable subsystem.

Extending the Architecture

For stricter guarantees:

  • Combine Pub/Sub with versioned keys
  • Use Redis Streams or Kafka for durable invalidation
  • Add read fencing for critical entities

As systems evolve toward active‑active databases, invalidation becomes a first‑class architectural concern.

Conclusion

Cache invalidation at scale is not an annotation problem—it is a distributed systems problem. In multi‑region Spring Boot deployments, correctness emerges from explicit invalidation design, stampede protection, event‑driven coordination, and strong observability.

AWS ElastiCache and Redis Pub/Sub provide powerful building blocks, but only when used deliberately. A well‑designed cache invalidation strategy prevents outages, reduces costs, and enables systems to scale safely.
