When building distributed systems, performance rarely comes for free. Every optimization introduces a trade-off, and nowhere is this more visible than in caching and consistency.
Modern applications—social networks, e-commerce platforms, streaming services—rely heavily on caching to achieve low latency and massive scale. But once data is cached and replicated across systems, a fundamental question arises:
How consistent does the data need to be?
To answer that, we first need to understand caching strategies, and then explore how strong consistency and eventual consistency fit into real-world system design.
Why Caching Exists in the First Place
At scale, databases are expensive: every query costs latency, and total throughput is finite.
Caching exists to:
- Reduce database load
- Improve response times
- Absorb traffic spikes
- Enable global scalability
A cache stores frequently accessed data closer to the application, often in-memory, making reads orders of magnitude faster than hitting a database.
But caching also introduces a challenge:
the cache and the database can temporarily disagree.
That disagreement is where consistency models come into play.
Common Caching Strategies
Caching is not a single technique—it’s a family of patterns, each with different trade-offs.
1. Cache-Aside (Lazy Loading)
This is the most widely used caching pattern.
How It Works
- Application checks the cache
- If data exists → return it
- If not → fetch from database and populate the cache
Pros
- Simple to implement
- Cache failures don’t break the system
- Database remains the source of truth
Cons
- Cache can serve stale data
- Cache misses cause latency spikes
- Writes require careful invalidation logic
This pattern naturally leads to eventual consistency, since cached data may lag behind the database.
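A minimal cache-aside read path, sketched in Python with the `redis` client (the `fetch_user_from_db` helper, key naming, and TTL value are illustrative assumptions, not a prescribed API):

```python
import json
import redis

r = redis.Redis()  # assumes a local Redis instance

CACHE_TTL_SECONDS = 300  # bounds staleness even if invalidation is never triggered

def fetch_user_from_db(user_id: str) -> dict:
    # Hypothetical placeholder for a real database query.
    return {"id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit
    user = fetch_user_from_db(user_id)     # cache miss: hit the source of truth
    r.setex(key, CACHE_TTL_SECONDS, json.dumps(user))  # populate with a TTL
    return user
```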
2. Write-Through Cache
How It Works
- Every write goes to the cache first
- Cache synchronously writes to the database
Pros
- Cache always has fresh data
- Reads are fast and consistent
Cons
- Higher write latency
- Cache becomes a critical dependency
This pattern leans closer to strong consistency, especially for reads.
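A sketch of the write path under the same assumptions as the cache-aside example. One common ordering persists to the database first, so a failed cache update can never acknowledge unpersisted data:

```python
import json
import redis

r = redis.Redis()
CACHE_TTL_SECONDS = 300

def write_user_to_db(user_id: str, user: dict) -> None:
    # Hypothetical placeholder for a real database write.
    pass

def update_user(user_id: str, user: dict) -> None:
    """Write-through: database and cache are updated in one synchronous path."""
    write_user_to_db(user_id, user)  # persist to the source of truth first
    r.setex(f"user:{user_id}", CACHE_TTL_SECONDS, json.dumps(user))  # then refresh
```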
3. Write-Behind (Write-Back) Cache
How It Works
- Writes go only to the cache
- Database updates happen asynchronously
Pros
- Extremely fast writes
- High throughput
Cons
- Risk of data loss if cache fails
- Complex recovery logic
This pattern strongly favors performance over consistency and is commonly used when durability is less critical.
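A toy write-behind sketch using an in-process queue; real systems typically rely on a durable queue or the cache's own flush machinery, so treat the helper names here as placeholders:

```python
import json
import queue
import threading
import redis

r = redis.Redis()
write_queue: queue.Queue = queue.Queue()

def write_user_to_db(user_id: str, user: dict) -> None:
    # Hypothetical placeholder for a real database write.
    pass

def update_user(user_id: str, user: dict) -> None:
    """Write-behind: acknowledge after the cache write; persist asynchronously."""
    r.set(f"user:{user_id}", json.dumps(user))  # fast path: cache only
    write_queue.put((user_id, user))            # database write is deferred

def flush_worker() -> None:
    # If the process dies with items still queued, those writes are lost,
    # which is exactly the durability risk described above.
    while True:
        user_id, user = write_queue.get()
        write_user_to_db(user_id, user)

threading.Thread(target=flush_worker, daemon=True).start()
```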
The Consistency Question
Once data is cached and replicated, systems must decide:
Should all users see the same data at the same time—or is “eventually correct” good enough?
This is where consistency models define system behavior.
Strong Consistency
What It Means
Strong consistency guarantees that:
- Every read returns the most recent write
- All users see the same data at the same time
From the user’s perspective, the system behaves like a single, perfectly synchronized machine.
How Strong Consistency Works
To achieve this, systems rely on:
- Synchronous replication
- Distributed locks or consensus protocols
- Quorum-based reads and writes
Writes only succeed once all required replicas confirm the update.
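A common quorum rule is W + R > N: if a write waits for W acknowledgments and a read consults R replicas out of N, every read set overlaps the latest write. A minimal sketch, with the replication call stubbed out as a hypothetical network hop:

```python
from concurrent.futures import ThreadPoolExecutor

REPLICAS = ["replica-a", "replica-b", "replica-c"]  # N = 3
WRITE_QUORUM = 2  # W = 2; with read quorum R = 2, W + R > N holds

def send_to_replica(replica: str, key: str, value: str) -> bool:
    # Hypothetical network call; stubbed to always acknowledge.
    return True

def quorum_write(key: str, value: str) -> bool:
    """The write succeeds only once at least W replicas confirm it."""
    with ThreadPoolExecutor(max_workers=len(REPLICAS)) as pool:
        acks = sum(pool.map(lambda rep: send_to_replica(rep, key, value), REPLICAS))
    return acks >= WRITE_QUORUM
```

With N = 3 and W = R = 2, one slow or partitioned replica is tolerated, but two failures block writes entirely: that is the availability cost listed above.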
Pros
- Predictable behavior
- No stale reads
- Ideal for correctness-critical systems
Cons
- Higher latency
- Reduced availability during network issues
- Poor global scalability
When Strong Consistency Is the Right Choice
Strong consistency is essential when incorrect data is unacceptable:
- Banking transactions
- Payment systems
- Inventory reservation systems
- Distributed locks and leader election
In these systems, being slow is better than being wrong.
Eventual Consistency
What It Means
Eventual consistency guarantees that:
- All updates will propagate over time
- If no new writes occur, all replicas will eventually converge
Temporary inconsistencies are allowed.
How Eventual Consistency Works
Eventual consistency is typically achieved using:
- Asynchronous replication
- Time-based cache expiration (TTL)
- Background synchronization processes
Writes return immediately, and the system reconciles differences later.
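A toy illustration of that write path, using in-memory dictionaries as stand-in replicas:

```python
import asyncio

replicas = [{}, {}, {}]  # toy in-memory replicas across "regions"

async def propagate(key: str, value: str) -> None:
    # Background replication: every replica converges after some lag.
    for store in replicas[1:]:
        await asyncio.sleep(0.1)  # simulated network delay
        store[key] = value

async def eventual_write(key: str, value: str) -> None:
    """Return immediately; the system reconciles the other replicas later."""
    replicas[0][key] = value                    # local write, instant acknowledgment
    asyncio.create_task(propagate(key, value))  # convergence happens in the background

# Until propagate() finishes, reads from replicas[1] or replicas[2] still
# return the old value: a stale but temporary view that eventually converges.
```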
Pros
- Low latency
- High availability
- Excellent global scalability
Cons
- Stale reads are possible
- More complex conflict resolution
- Harder to reason about system state
A Simple Example
- You like a post on a social app
- You immediately see the updated like count
- Someone in another region sees the old count briefly
- The system converges shortly after
Nothing broke—the system optimized for speed.
When Eventual Consistency Shines
Eventual consistency is ideal when:
- Minor staleness is acceptable
- Availability matters more than precision
- Systems operate across regions
Common use cases include:
- Social media feeds
- Content views and likes
- Product recommendations
- Analytics and metrics
- Caching layers
CAP Theorem: Why This Trade-off Is Inevitable
The CAP Theorem explains why systems must choose.
A distributed system can guarantee only two of:
- Consistency
- Availability
- Partition Tolerance
Since partitions are unavoidable, systems choose between:
- CP: prioritize consistency, sacrificing availability during partitions
- AP: prioritize availability, accepting eventual consistency
Caching-heavy systems almost always choose availability.
Conflict Resolution: The Hidden Cost of Eventual Consistency
When writes happen concurrently on different nodes, conflicts arise.
Common resolution strategies include:
Last-Write-Wins (LWW)
- Uses timestamps
- Simple but can lose updates
Vector Clocks
- Track causal history
- Detect and merge conflicts
- More complex but safer
Application-Level Merging
- Domain-specific logic
- Example: merging shopping carts instead of overwriting
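A small sketch contrasting the first and last strategies. The cart-merge policy shown (take the higher quantity per item) is one illustrative choice among many; real merge rules are domain decisions:

```python
def lww_merge(a: tuple[float, str], b: tuple[float, str]) -> tuple[float, str]:
    """Last-write-wins: keep the (timestamp, value) pair with the newer
    timestamp. Simple, but the losing update is silently discarded."""
    return a if a[0] >= b[0] else b

def merge_carts(cart_a: dict[str, int], cart_b: dict[str, int]) -> dict[str, int]:
    """Application-level merge: union the carts instead of overwriting one."""
    merged = dict(cart_a)
    for item, qty in cart_b.items():
        merged[item] = max(merged.get(item, 0), qty)  # one policy; business rules vary
    return merged

# merge_carts({"book": 1}, {"book": 2, "pen": 1}) -> {"book": 2, "pen": 1}
```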
Choosing the Right Approach
There is no “best” consistency model—only contextual decisions.
| Requirement | Best Fit |
|---|---|
| Financial accuracy | Strong Consistency |
| Global low-latency reads | Eventual Consistency |
| High write throughput | Eventual Consistency |
| Regulatory compliance | Strong Consistency |
| User-facing metrics | Eventual Consistency |
Many real-world systems are hybrid:
- Strong consistency for critical paths
- Eventual consistency for everything else
From Theory to Practice: Caching in Real Production Systems
So far, we’ve discussed why consistency trade-offs exist and how strong and eventual consistency work conceptually.
The real challenge begins when these ideas meet production traffic, global users, and failure scenarios.
In real systems, caching is not an optimization—it is a foundational architectural decision.
And with caching comes the unavoidable reality: data will be stale sometimes.
The key question is not how to avoid inconsistency, but:
Where can we tolerate it, and how do we control it?
The First Rule of Production Caching
Not all data deserves the same consistency guarantees.
Large-scale systems explicitly separate data into:
- Critical paths (money, inventory, correctness-sensitive state)
- Non-critical paths (feeds, counters, metadata, recommendations)
Each category gets a different caching and consistency strategy.
Production Architecture Patterns
1. Social Media Platforms (Likes, Feeds, Counters)
Typical stack
- Primary datastore (SQL / NoSQL)
- Distributed cache (Redis / Memcached)
- CDN for static and edge-cached content
- Asynchronous aggregation pipelines
How caching is used
- Likes, views, follower counts → aggressively cached
- Feeds → precomputed and cached per user
- Writes → fast, asynchronous, non-blocking
Consistency model
- Eventual consistency
- High availability is mandatory
- Small inconsistencies are acceptable
If a user briefly sees 1,001 likes instead of 1,002, the system is still correct from a product standpoint.
Here, availability and latency matter more than precision.
2. E-commerce Platforms: Catalog vs Checkout
E-commerce systems rarely choose a single consistency model—they are deliberately hybrid.
Product Catalog
- Cached heavily
- Served via cache + CDN
- TTL-based invalidation
Consistency: Eventual
Why: Small delays in prices or descriptions are acceptable.
Inventory & Checkout
- Minimal caching
- Strong consistency
- Database transactions or distributed locks
Consistency: Strong
Why: Overselling inventory is unacceptable.
Key insight:
The same system can—and should—use multiple consistency models based on risk.
3. Financial & Payment Systems
Caching exists around the core, not inside it.
Cached safely
- Read-only reference data (exchange rates, configs)
- User profile metadata
- Authentication/session tokens
Never cached
- Account balances
- Ledger entries
- Transaction state
Consistency model
- Strong consistency for money movement
- Eventual consistency for analytics, reporting, dashboards
Here, correctness always wins over speed.
4. Large-Scale APIs & Microservices
In microservice architectures, caching appears at multiple layers:
- Local (in-process)
- Regional (Redis clusters)
- Global (CDNs / edge caches)
A common pattern:
- Service A caches responses from Service B
- Cached data is versioned
- Invalidation happens via events
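A sketch of what that layered read path can look like, with the downstream call to Service B stubbed out as a hypothetical helper:

```python
import json
import redis

local_cache: dict[str, dict] = {}   # layer 1: in-process memory
r = redis.Redis()                   # layer 2: regional cache

def fetch_from_service_b(key: str) -> dict:
    # Hypothetical call to the owning service; stand-in response.
    return {"key": key}

def get_value(key: str) -> dict:
    """Tiered read: local memory, then Redis, then the origin service."""
    if key in local_cache:
        return local_cache[key]
    cached = r.get(key)
    if cached is not None:
        value = json.loads(cached)
    else:
        value = fetch_from_service_b(key)
        r.setex(key, 300, json.dumps(value))  # regional copy with a TTL backstop
    local_cache[key] = value
    return value
```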
Every extra layer shortens latency but adds one more copy that can go stale, which leads straight to the hardest problem in distributed systems.
Cache Invalidation: Where Systems Actually Break
“There are only two hard things in Computer Science: cache invalidation and naming things.”
— Phil Karlton
Most caching failures in production are not caused by cache misses but by incorrect invalidation logic.
Below are the strategies that actually survive at scale.
Cache Invalidation Strategies That Work
1. Time-Based Invalidation (TTL)
How it works
- Every cache entry has an expiration time
- After TTL, data is refreshed from the source
Pros
- Simple
- Predictable
- Failure-safe
Cons
- Data can be stale
- TTL tuning is heuristic-based
Use when
- Read-heavy workloads
- Non-critical correctness
- High fan-out caches
In production, TTL is mandatory, even when other strategies exist.
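A minimal sketch. The jitter added to the base TTL is one common convention, not a requirement of the pattern; it keeps many entries from expiring, and refreshing, at the same instant:

```python
import random
import redis

r = redis.Redis()

BASE_TTL_SECONDS = 300  # tuning this number is the heuristic part

def set_with_ttl(key: str, value: str) -> None:
    """Every entry expires; jitter spreads refreshes so expiring keys
    don't stampede the database all at once."""
    ttl = BASE_TTL_SECONDS + random.randint(0, 60)
    r.setex(key, ttl, value)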
2. Explicit Invalidation on Writes
How it works
- Writes invalidate or update cache entries
- Next read fetches fresh data
Pros
- Fresher data
- Better correctness
Cons
- Complex dependency tracking
- Easy to miss edge cases
- Partial failures cause inconsistency
Rule:
Never rely on explicit invalidation alone—always combine with TTL.
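A sketch of that combined approach, with a hypothetical database helper; the comment marks the partial-failure mode the TTL covers:

```python
import redis

r = redis.Redis()

def write_price_to_db(product_id: str, price: int) -> None:
    # Hypothetical placeholder for a real database write.
    pass

def update_product_price(product_id: str, price: int) -> None:
    """Write, then invalidate; the TTL set at read time is the safety net."""
    write_price_to_db(product_id, price)
    r.delete(f"product:{product_id}")  # next read repopulates from the database
    # If this delete fails (partial failure), the stale entry still
    # expires via its TTL rather than living forever.
```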
3. Write-Through and Write-Behind Caches
Write-Through
- Write to cache and database synchronously
Use when correctness matters and write volume is manageable.
Write-Behind
- Write to cache first, persist asynchronously
Use when throughput matters more than durability.
These patterns are powerful—but risky—and are used sparingly in production.
4. Versioned Caching (Highly Recommended)
How it works
- Cache keys include a version
- Bumping the version invalidates all old entries implicitly
Example:
`user_profile:v3:{user_id}`
Pros
- No mass deletion
- Avoids race conditions
- Easy rollback
Cons
- Higher memory usage
- Requires version discipline
Large systems often prefer version bumps over explicit invalidation.
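A sketch of the pattern against Redis; the key names and version counter shown here are illustrative assumptions:

```python
import redis

r = redis.Redis()

def profile_key(user_id: str) -> str:
    """Build a cache key that embeds the current version number."""
    version = r.get("user_profile:version") or b"0"  # missing counter => v0
    return f"user_profile:v{version.decode()}:{user_id}"

def bump_profile_version() -> None:
    # A single INCR implicitly invalidates every old entry: new reads build
    # new keys, and the orphaned v(n-1) entries simply age out via their TTLs.
    r.incr("user_profile:version")
```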
5. Event-Driven Invalidation
How it works
- Data changes emit events
- Consumers invalidate relevant cache entries
Pros
- Near real-time freshness
- Scales across services
Cons
- Event loss handling required
- Eventual consistency by design
- Debugging complexity
This is the backbone of modern event-driven architectures.
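A sketch of a consumer-side handler. The event shape is a hypothetical example, and the transport is deliberately left out:

```python
import json
import redis

r = redis.Redis()

def handle_change_event(raw_event: bytes) -> None:
    """Consume a (hypothetical) data-change event and drop the affected key."""
    event = json.loads(raw_event)  # e.g. {"entity": "user", "id": "42"}
    r.delete(f"{event['entity']}:{event['id']}")

# In a real deployment this handler would be wired to Kafka, Redis pub/sub,
# or a CDC stream; lost or reordered events are why TTLs remain the backstop.
```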
The Production Rulebook
Real-world systems follow these rules:
- TTL is non-negotiable
- Never rely on a single invalidation strategy
- Strong consistency is reserved for critical paths
- Eventual consistency dominates read paths
- Design for failure, not perfection
If your cache occasionally serves stale data but your system stays up, you’re doing it right.
Conclusion: Design for Reality, Not Perfection
Caching and consistency are not opposing ideas—they are complementary tools.
Strong consistency offers correctness.
Eventual consistency offers scale.
Great system design isn’t about choosing one—it’s about knowing where each belongs.
The fastest systems don’t eliminate trade-offs.
They embrace them intentionally.
Closing Thoughts
Caching is not just about performance—it defines how your system behaves under pressure.
Great systems:
- Cache aggressively
- Accept controlled inconsistency
- Apply strong consistency surgically
- Treat cache invalidation as a first-class concern
In distributed systems, correctness is contextual.
Availability is often the real product feature.