DEV Community

Majd-sufyan

System Design Learning Journey — Week 3: Caching

In Week 3, I focused on one of the most important concepts in scalable backend systems:

Avoid hitting the database unnecessarily.

After learning how databases store and retrieve data in Week 2, the next logical question became:

  • What happens when traffic grows?
  • What if millions of users request the same data repeatedly?
  • How do systems remain fast under heavy read load?

This week was all about caching.

The main goals were to:

  • Understand why caching exists
  • Learn common cache strategies
  • Explore cache invalidation trade-offs
  • Apply these concepts by designing a high-traffic product catalog system

Compared to previous weeks, this felt like the point where system design started becoming much closer to real-world backend engineering.

Why Caching Exists

Databases are powerful, but they are also expensive resources.

Even optimized relational databases eventually become bottlenecks under massive read traffic.

Imagine a product page on an e-commerce platform:

  • A product may be viewed millions of times
  • But the actual product data changes very rarely

Without caching:
Client → API → Database
for every request.

With caching:
Client → API → Cache → Response
On a cache hit, the database is not touched at all. It is queried only on a miss, and the cache then serves repeat requests until the entry expires or is invalidated.

This dramatically reduces:

  • Latency
  • Database load
  • Infrastructure costs
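
To make the effect concrete, here is a small back-of-the-envelope sketch. The function names and all numbers are hypothetical and chosen only for illustration; they are not measurements from the catalog system. It assumes a miss pays for the cache lookup plus the database query.

```python
def expected_latency_ms(hit_ratio: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when a fraction `hit_ratio` of reads hits the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # A hit costs one cache lookup; a miss costs the lookup plus the database query.
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * (cache_ms + db_ms)


def db_queries_per_second(requests_per_second: float, hit_ratio: float) -> float:
    """Only cache misses reach the database."""
    return requests_per_second * (1.0 - hit_ratio)


if __name__ == "__main__":
    # Hypothetical inputs: 90% hit ratio, 1 ms cache lookup, 20 ms database query.
    print(expected_latency_ms(0.9, 1.0, 20.0))
    print(db_queries_per_second(10_000, 0.9))
```

With these assumed inputs, average latency comes out at roughly 3 ms instead of 20 ms, and about 1,000 of every 10,000 requests per second reach the database.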

Key Insight: Caching Improves Performance but Adds Complexity

One of the biggest lessons this week was:

A cache is easy to add but difficult to keep correct.

Caching introduces a new challenge:

  • What happens when cached data becomes outdated?

This is the famous problem of cache invalidation.


Cache Invalidation: The Hard Problem

Example:

A product is cached with:

Price = $999

An admin later updates the database:

Price = $899

But the cache still returns:

$999

This creates stale data.

The core challenge becomes:

How do we keep cache and database synchronized without hurting performance?
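
The stale-price scenario above can be reproduced in a few lines. This is a minimal sketch in which two plain dictionaries stand in for the database and the cache; the class and method names are hypothetical. It shows one common remedy, deleting the cached entry whenever the database row is updated. That is only one of several possible approaches.

```python
class ProductStore:
    """Toy stand-in for a database plus a cache; both are plain dicts."""

    def __init__(self) -> None:
        self.db: dict[str, int] = {}
        self.cache: dict[str, int] = {}

    def get_price(self, product_id: str) -> int:
        # Serve from the cache when possible, otherwise load and remember.
        if product_id in self.cache:
            return self.cache[product_id]
        price = self.db[product_id]
        self.cache[product_id] = price
        return price

    def update_price_without_invalidation(self, product_id: str, price: int) -> None:
        # Only the database changes, so the cache keeps the old value.
        self.db[product_id] = price

    def update_price(self, product_id: str, price: int) -> None:
        # Write to the database, then drop the cached copy so the
        # next read reloads the fresh value.
        self.db[product_id] = price
        self.cache.pop(product_id, None)


if __name__ == "__main__":
    store = ProductStore()
    store.db["p1"] = 999
    store.get_price("p1")                            # warms the cache
    store.update_price_without_invalidation("p1", 899)
    print(store.get_price("p1"))                     # still the cached, stale price
    store.update_price("p1", 899)
    print(store.get_price("p1"))                     # reloaded from the database
```

Deleting the entry rather than overwriting it keeps the write path simple: the next read repopulates the cache from the database, which remains the source of truth.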


Cache-Aside Pattern

The main strategy I focused on was the cache-aside pattern.

Flow:

1. Request arrives
2. Check cache

IF HIT:
    return cached data

IF MISS:
    query database
    store result in cache
    return response


This approach is:

  • Simple
  • Common in production systems
  • Easy to implement with Redis

It also reinforced another important principle:

Cache should be treated as an optimization layer, not the source of truth.
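
The cache-aside flow above can be sketched as follows. This is an illustrative version, not the Kotlin implementation from the project: an in-memory dictionary stands in for Redis, and `CacheAsideReader`, `load_from_db`, and `db_reads` are hypothetical names introduced for the example.

```python
from typing import Callable, Optional


class CacheAsideReader:
    """Reads through a cache, falling back to a loader function on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]]) -> None:
        self._load_from_db = load_from_db
        self._cache: dict[str, dict] = {}   # stand-in for Redis
        self.db_reads = 0                   # counts how often the loader is called

    def get(self, key: str) -> Optional[dict]:
        cached = self._cache.get(key)
        if cached is not None:
            return cached                   # HIT: return cached data
        self.db_reads += 1
        value = self._load_from_db(key)     # MISS: query the database
        if value is not None:
            self._cache[key] = value        # store the result for later reads
        return value


if __name__ == "__main__":
    products = {"p1": {"name": "Laptop"}}   # pretend database table
    reader = CacheAsideReader(products.get)
    reader.get("p1")
    reader.get("p1")
    print(reader.db_reads)                  # the second read was served by the cache
```

Note that the application code owns the caching logic here: the cache never talks to the database itself, which is what distinguishes cache-aside from read-through caching. Missing keys are deliberately not cached in this sketch, so repeated lookups of an unknown product still reach the database.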


Applying the Concepts: High-Traffic Product Catalog

To apply these ideas, I designed a simplified high-traffic product catalog system similar to what platforms like Amazon or Shopify might use.


Functional Requirements

  • View products
  • Search products
  • Update product details
  • View featured / trending products

Non-Functional Requirements

  • Very low latency
  • High read throughput
  • High availability
  • Scalability under heavy traffic
  • Fault tolerance

High-Level Architecture

Figure 1: Product catalog architecture with Redis caching layer.

The system consists of:

  • Stateless API servers behind a load balancer
  • PostgreSQL as the primary data store
  • Redis as the caching layer
  • Cache-aside strategy for reads

Once the cache is warm, most read requests are served directly from Redis, which sharply reduces pressure on the database.


What Should Be Cached?

This week also helped me think more carefully about what should actually be cached.

Good candidates for caching:

  • Product details
  • Product lists
  • Trending products
  • Frequently viewed items

Less ideal candidates:

  • Rapidly changing inventory
  • Real-time stock counts
  • Frequently updated pricing

This highlighted an important design trade-off:

The more dynamic the data, the harder caching becomes.


TTL (Time To Live)

Another concept I explored was TTL (Time To Live).

Instead of keeping cached data forever, the cache automatically expires each entry after a fixed period of time.

Example:

Product cache expires after 5 minutes

Benefits:

  • Simpler invalidation strategy
  • Prevents very stale data
  • Reduces cache management complexity

Trade-off:

  • Data can remain outdated for up to one TTL window after an update (up to 5 minutes in the example above)
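
To illustrate how TTL expiry behaves, here is a minimal in-memory sketch. Redis supports per-key expiry natively (for example `SET key value EX 300`), so this class is not how the catalog system does it; `TTLCache` and its injectable `clock` parameter are hypothetical and exist only to make the behaviour easy to test.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Dictionary-backed cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with the moment it stops being valid.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller reloads from the database.
            del self._entries[key]
            return None
        return value


if __name__ == "__main__":
    now = [0.0]                                  # fake clock, in seconds
    cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])
    cache.set("product:p1", {"price": 999})
    now[0] = 299.0
    print(cache.get("product:p1"))               # still within the 5-minute window
    now[0] = 300.0
    print(cache.get("product:p1"))               # expired, so the result is None
```

Entries are removed lazily, on the first read after they expire, which keeps the sketch short. A real cache would also need a size limit or background cleanup.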

Failure Considerations

The system was designed so that:

  • If Redis fails → the database still serves requests
  • If API instances crash → traffic is routed elsewhere
  • If traffic spikes → stateless services scale horizontally

This reinforced a recurring system design principle:

Reliability often comes from graceful degradation.

Even if performance drops, the system should continue functioning correctly.
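
The "if Redis fails, the database still serves requests" behaviour can be sketched like this. It is an assumption about how such a fallback might look, not the project's actual code: `read_with_fallback` and its callable parameters are hypothetical, and catching a broad `Exception` is a simplification. A real service would catch the cache client's specific connection and timeout errors.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("catalog")


def read_with_fallback(
    key: str,
    cache_get: Callable[[str], Optional[Any]],
    cache_set: Callable[[str, Any], None],
    db_get: Callable[[str], Optional[Any]],
) -> Optional[Any]:
    """Cache-aside read that degrades to a plain database read if the cache fails."""
    try:
        cached = cache_get(key)
        if cached is not None:
            return cached
    except Exception:
        # Cache is unavailable: log it and carry on with the database.
        logger.warning("cache read failed for %s; falling back to database", key)

    value = db_get(key)
    if value is not None:
        try:
            cache_set(key, value)
        except Exception:
            # Failing to repopulate the cache must not fail the request.
            logger.warning("cache write failed for %s; serving uncached", key)
    return value


if __name__ == "__main__":
    def broken_get(key: str) -> Any:
        raise ConnectionError("cache down")

    def broken_set(key: str, value: Any) -> None:
        raise ConnectionError("cache down")

    database = {"p1": "Laptop"}
    print(read_with_fallback("p1", broken_get, broken_set, database.get))
```

The request is slower when the cache is down, but it still returns the correct data. One caveat is that the database then absorbs the full read load, so it needs enough headroom for that case.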


Implementation Notes

The product catalog system was implemented using:

  • Kotlin + Spring Boot
  • PostgreSQL
  • Redis
  • Cache-aside caching strategy

The focus this week was less about framework syntax and more about understanding:

  • access patterns
  • cache trade-offs
  • latency optimization

Key Takeaways

This week helped me understand that:

  • Databases alone are not enough at scale
  • Caching is fundamental for read-heavy systems
  • Cache invalidation is where complexity begins
  • Not all data should be cached equally
  • System design is mostly about trade-offs

What’s Next — Week 4

  • Replication & consistency
  • Primary databases vs read replicas
  • Designing systems that survive failures and replication lag

The journey continues 🚀
