It starts, like all false dawns, with good news.
Your load tests are green. Your connection count is stable.
Postgres Pete is calm.
The team celebrates. Someone makes a meme.
But something’s off.
Not “the app is down” bad. Just… squishy.
Latency spikes for logged-out users. Report generation still eats a chunk of the CPU.
And every time someone lands on the homepage, your servers run the same query for the 400,000th time this week.
You’re not slow anymore. You’re wasteful.
Welcome to Chapter 3: where your architecture is finally fast enough to reveal just how much duplicated work you’re doing.
War Story: The Leaderboard That Melted Everything
A few years back, we launched a gamified campaign. Leaderboard. Badges. Daily rewards.
The catch?
We recalculated the leaderboard on every request.
Every. Single. One.
Each hit triggered a massive sort, join, and filter operation across millions of rows. At launch, it didn’t matter—maybe 100 users an hour.
But when it went viral?
90,000 hits in a day.
Our DB didn’t crash.
It simmered. Slowly. Tragically.
Adding a 60-second Redis cache dropped response times from 912ms to 38ms.
Query load dropped by 99.7%.
Postgres Pete wrote us a thank-you note.
Chapter 3: Caching Isn’t Cheating
At small scale, caching is optional. At large scale, it's everything.
It’s how you survive.
The illusion is that if you make your database fast, and your queries efficient, you’re good.
But when 1,000 users hit the same route and you generate 1,000 identical responses—congratulations, you’ve optimized waste.
Caching is how you stop being a short-order cook
…and start being a chef with a prep station.
The Caching Stack
Let’s break it down by layer, not tool. You don’t need Redis just because someone on your team read a blog post. You need the right kind of cache for the right kind of job.
Page & Fragment Caching
When to use: Full-page responses that don’t change per user.
Where: WordPress, SSR frameworks, marketing pages, logged-out views.
- CDN edge caching (Cloudflare, Fastly) for static assets
- Static HTML snapshots
- Component-level fragment caching and incremental static regeneration (Next.js getStaticProps with revalidate; note getServerSideProps runs on every request and can't use revalidate)
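Here's the ISR flavor as a minimal sketch for a Next.js Pages Router page; fetchFeaturedProducts is a stand-in for whatever actually loads your data.

// pages/index.js: Next.js serves the prebuilt page instantly and
// regenerates it in the background at most once every 60 seconds.
export async function getStaticProps() {
  const products = await fetchFeaturedProducts(); // stand-in data loader
  return {
    props: { products },
    revalidate: 60, // seconds before Next.js will rebuild this page
  };
}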
Query Result Caching
When to use: Expensive queries that return predictable results.
Where: Reports, leaderboards, stats pages.
- Cache the result of a query in Redis for 30–300 seconds
- Bust or update on key data changes
- Use consistent, deterministic cache keys (e.g. leaderboard:daily:2025-07-06)
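Here's roughly what the leaderboard fix from the war story looked like, assuming an ioredis-style client and a db.query helper; the query itself is elided.

// Query-result caching: one deterministic key per day, 60-second TTL.
const getDailyLeaderboard = async (date) => {
  const key = `leaderboard:daily:${date}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  // The expensive sort/join/filter now runs at most once per minute.
  const rows = await db.query(/* the big sort/join/filter */);
  await redis.setex(key, 60, JSON.stringify(rows));
  return rows;
};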
Object Caching
When to use: Frequently accessed entities that don’t change often
Where: User settings, pricing tables, content metadata
- Load object into cache on first access
- TTL + write-through or read-through patterns (write-through sketched after this list)
- Namespaced keys help avoid pollution: user:42:profile
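The cache-aside example in the next section covers the read side; here's a rough write-through sketch, where db.users.update and the 300-second TTL are assumptions to adapt.

// Write-through: update the DB and the cache in the same code path,
// so readers never see a stale profile.
const updateUserProfile = async (userId, changes) => {
  const user = await db.users.update(userId, changes);
  await redis.setex(`user:${userId}:profile`, 300, JSON.stringify(user));
  return user;
};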
Edge Caching & CDNs
When to use: Static assets, APIs with safe GETs, regional latency improvements
Where: Next.js, Shopify headless, any site that serves static content globally
- GET /products?category=fitness? Cache it. POST /checkout? Don't.
- Use surrogate keys to invalidate whole sets (product:updated → purge /products)
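At the HTTP level this is just response headers. A sketch for an Express-style handler; Surrogate-Key is a Fastly convention (Cloudflare's equivalent is Cache-Tag), and getProducts is illustrative.

// Safe GET: let the CDN cache it for five minutes, tagged for purging.
app.get('/products', async (req, res) => {
  const products = await getProducts(req.query.category);
  res.set('Cache-Control', 'public, s-maxage=300');
  res.set('Surrogate-Key', `products category:${req.query.category || 'all'}`);
  res.json(products);
});
// POST /checkout gets no caching headers at all. Don't cache writes.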
Redis Example: Cache-Aside Pattern
// Basic cache-aside pattern: try Redis first, fall back to the DB,
// then populate the cache for the next reader.
const getCachedUser = async (userId) => {
  const cached = await redis.get(`user:${userId}`);
  if (cached) return JSON.parse(cached); // hit: the DB never sees this request

  const user = await db.users.findById(userId);
  if (user) {
    // 300s TTL bounds staleness at five minutes
    await redis.setex(`user:${userId}`, 300, JSON.stringify(user));
  }
  return user;
};
Concrete Example: Homepage Latency
Before:
- Homepage load time: 680ms
- 90% spent on DB and API calls
- Redis hit rate: 12%
After:
- Homepage load time: 112ms
- DB queries reduced by 87%
- Redis hit rate: 89%
Caching just three components (featured products, blog teasers, and testimonials) made the biggest impact.
Cache Busting: The Forgotten Half of Caching
Writing to a cache is easy.
Invalidating it correctly is what separates grown-up systems from hopeful experiments.
          +------------------+
          |  Does the data   |
          |  change often?   |
          +------------------+
                   |
         +---------+-----------+
        Yes                    No
         |                     |
+----------------+   +---------------------+
| Can you hook   |   | Use long TTL with   |
| into writes?   |   | fallback refresh    |
+----------------+   +---------------------+
         |
         +-------------------+
         |                   |
        Yes                  No
         |                   |
+----------------+  +----------------+
|  Event-driven  |  |  Use low TTL   |
|  invalidation  |  |  w/ polling    |
+----------------+  +----------------+
Cache Busting Methods:
Time-Based (TTL)
- Easy to reason about
- Accepts a bit of staleness
- “Good enough” for dashboards, stats, pricing
Event-Driven
- Invalidate cache when data updates
- Use hooks in your ORM or pub/sub system
- Harder to manage but more accurate
Example: You update a product’s price—trigger an event that clears product:123 in Redis and all related keys like category:fitness:featured.
// In your product update handler: clear the product itself...
await redis.del(`product:${product.id}`);
// ...and any derived keys that embed it
await redis.del(`category:${product.category}:featured`);
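If the writer shouldn't have to know about every dependent cache, push the event through pub/sub instead. A sketch with an ioredis-style client; channel and key names are illustrative.

// Publisher side: announce the change, don't bust caches directly.
await redis.publish(
  'product:updated',
  JSON.stringify({ id: product.id, category: product.category })
);

// Subscriber side (a separate Redis connection) owns the invalidation.
subscriber.subscribe('product:updated');
subscriber.on('message', async (channel, payload) => {
  const { id, category } = JSON.parse(payload);
  await redis.del(`product:${id}`, `category:${category}:featured`);
});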
Dependency-Tracking (Advanced)
- Track what data powers what cache
- Rebuild only what’s affected
- Requires discipline and tooling (or you’ll hate yourself)
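One low-tooling way to start is with Redis sets: record which cache keys each record powers, then clear only those. A sketch; every key name here is illustrative.

// When caching something derived from a product, record the dependency.
const cacheWithDeps = async (key, value, ttlSeconds, productIds) => {
  await redis.setex(key, ttlSeconds, JSON.stringify(value));
  for (const id of productIds) {
    await redis.sadd(`deps:product:${id}`, key); // what this product powers
  }
};

// On update, rebuild or clear only the affected keys.
const invalidateProduct = async (productId) => {
  const keys = await redis.smembers(`deps:product:${productId}`);
  if (keys.length) await redis.del(...keys);
  await redis.del(`deps:product:${productId}`);
};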
Signs Your Cache Is (or Isn’t) Working
Healthy Caching:
- Cache hit rates >80% for hot paths
- Time to first byte is low even under load
- Redis/Memcached resource usage is predictable
Caching Gone Wrong:
- Users seeing stale data or inconsistency
- Hit rates low, invalidation too aggressive
- Cache entries are massive, binary blobs
- Conflicting caches (e.g. app and CDN fighting over freshness)
Cache Debugging: What to Watch
- Redis INFO stats: watch keyspace_hits vs keyspace_misses (Memcached: get_hits vs get_misses); there's a hit-rate helper sketched after this list
- Log slow/missed lookups: Tag routes where the cache failed silently
- Cache stampedes: Detect when many users request the same uncached item at once; consider locking or stale-while-revalidate
- Track TTL expirations: Are they aligning with actual usage patterns?
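To put a number on that first bullet, here's a quick hit-rate helper, assuming an ioredis-style client; the parsing is deliberately crude.

// Compute the cache hit rate from Redis's INFO stats section.
const hitRate = async () => {
  const info = await redis.info('stats');
  const stat = (name) =>
    Number((info.match(new RegExp(`${name}:(\\d+)`)) || [])[1] || 0);
  const hits = stat('keyspace_hits');
  const misses = stat('keyspace_misses');
  return hits + misses === 0 ? 0 : hits / (hits + misses);
};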
Advanced Patterns (For When You’re Ready)
Stale-While-Revalidate
- Serve stale data immediately
- In the background, fetch fresh data and replace it in the cache
- Reduces wait time and perceived latency for users
- Implementable via HTTP headers (Cache-Control: stale-while-revalidate=60) or custom middleware in your app
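The header route is the cheapest to adopt. A sketch for an Express handler behind a CDN that honors the directive; buildStats is a stand-in.

// Fresh for 300s; for 60s after that, the CDN serves the stale copy
// while it refetches in the background. Nobody waits on the rebuild.
app.get('/stats', async (req, res) => {
  res.set('Cache-Control', 'public, s-maxage=300, stale-while-revalidate=60');
  res.json(await buildStats());
});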
Soft TTL + Refresh
- Items have a TTL, but when accessed near expiry, refresh in the background
- Prevents cold starts and keeps hot items alive
- Great for items accessed frequently but updated infrequently
- Implement using async job queues or middleware hooks
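In app code, you can approximate a soft TTL by checking how much real TTL remains. A sketch assuming an ioredis-style client; loader and both windows are illustrative.

// Within 30s of expiry? Refresh in the background while still
// returning the cached value immediately.
const getWithSoftTtl = async (key, loader, ttl = 300, softWindow = 30) => {
  const cached = await redis.get(key);
  if (cached) {
    const remaining = await redis.ttl(key); // seconds left; -1/-2 if none
    if (remaining >= 0 && remaining < softWindow) {
      loader()
        .then((fresh) => redis.setex(key, ttl, JSON.stringify(fresh)))
        .catch(() => {}); // best effort; the stale copy already shipped
    }
    return JSON.parse(cached);
  }
  const fresh = await loader();
  await redis.setex(key, ttl, JSON.stringify(fresh));
  return fresh;
};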
Sharded or Namespaced Caches
- Use key prefixes to separate cache scopes
- Example: tenant-42:user:profile, locale-en:settings
- Prevents key collisions and simplifies bulk invalidation
- Use structured key naming conventions to support future automation
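Those prefixes pay off the day you need bulk invalidation. A sketch that clears one tenant's keys with SCAN rather than KEYS (which blocks Redis); the prefix scheme is illustrative.

// Walk the keyspace incrementally and delete everything for one tenant.
const clearTenant = async (tenantId) => {
  let cursor = '0';
  do {
    const [next, keys] = await redis.scan(
      cursor, 'MATCH', `tenant-${tenantId}:*`, 'COUNT', 500
    );
    if (keys.length) await redis.del(...keys);
    cursor = next;
  } while (cursor !== '0');
};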
Caching Anti-Patterns
- Caching user-specific or sensitive data globally. That's a GDPR violation waiting to happen.
- Hardcoding long TTLs for things that change daily. Congratulations, you've made your system fast and wrong.
- Caching everything without a purge strategy. Now you have a second database you also don't understand.
TL;DR: Cache with Intent
- Don't just optimize slow things; stop doing them repeatedly.
- Choose the right cache for the right problem
- Design your cache busting before you go live
- Monitor hit rates, not just cache size
- Caching isn’t cheating. It’s how systems scale.
Your app isn’t just fast now. It’s efficient.
Your team isn’t chasing ghosts. They’re building confidently.
Postgres Pete? He’s finally taking lunch breaks again.
But don’t relax too long.
Because in Part 4, you’ll find out what happens when caching stops helping—
…and you need to start splitting your read and write workloads.
Stay tuned.
Next up: Scaling Reads Without Sharding Your Soul