You deployed a cache in front of your database three weeks ago. The DB is still running at 90% utilization. Traffic doubled last month and you're wondering if the cache is doing anything at all.
It is — just not as much as you expected, because cache hit rate is not something you configure. It emerges from two things: how much of your working set fits in memory, and how skewed your access patterns are.
Paperstack is a free system design simulator that makes this visible. Sketch an architecture, press play, watch utilization numbers and node colors update live. The demo below walks through the cache problem using it.
Hit Rate Isn't a Setting
A cache absorbs reads by serving them from memory instead of forwarding them to the database. The fraction it absorbs — hit rate — depends on one thing: whether the data a request needs is in memory.
Two variables determine that:
Working-set vs memory. If your active data is 100,000 keys and your cache holds 50,000, then under roughly uniform access only about half the requests can hit; the rest miss and forward to the DB. Your cache isn't broken. It's undersized for the working set.
Access skew. If 80% of requests hit 10% of keys (common in social or content workloads), a much smaller cache can achieve a high hit rate because the hot keys stay warm and rarely get evicted. Paperstack models this directly via the skew parameter on the Cache node: with an LFU eviction policy, higher skew boosts hit rate beyond raw memory coverage. With LRU, the model gives skew no benefit: the eviction algorithm tracks recency, not access frequency, so the simulation treats hot and cold keys as equally likely to be evicted.
This is why you can't just set hitRate: 0.9 in the config panel — Paperstack doesn't expose hit rate as a field. You set memory and workingSet; hit rate is computed. If the simulation let you enter a hit rate directly, it would be lying to you about what your architecture actually does.
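To make "computed, not configured" concrete, here is a minimal sketch of one way hit rate could be derived from those inputs. It is not Paperstack's actual formula, which the article does not show. It assumes key popularity follows a Zipf-like law and that the cache always holds the hottest keys (roughly the LFU best case). The function name `zipf_hit_rate` and its parameters are hypothetical.

```python
def zipf_hit_rate(memory_keys: int, working_set_keys: int, skew: float) -> float:
    """Best-case hit rate for a cache holding the `memory_keys` hottest keys.

    Assumption: the key ranked r receives traffic proportional to r ** -skew.
    skew = 0 means uniform access; larger skew concentrates traffic on hot keys.
    """
    if working_set_keys <= 0:
        raise ValueError("working_set_keys must be positive")
    # The cache cannot usefully hold more keys than the working set contains.
    cached = max(0, min(memory_keys, working_set_keys))
    weights = [rank ** -skew for rank in range(1, working_set_keys + 1)]
    # Hit rate = share of all traffic that targets the cached (hottest) keys.
    return sum(weights[:cached]) / sum(weights)


if __name__ == "__main__":
    # Same cache and working set, three different access patterns.
    for skew in (0.0, 0.8, 1.2):
        rate = zipf_hit_rate(50_000, 100_000, skew)
        print(f"skew={skew}: hit rate ~ {rate:.2f}")
```

With no skew the result collapses to plain memory coverage (memory divided by working set). As skew rises, the same cache covers a larger share of requests, which is the effect the skew parameter is described as modeling.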
The Comparative in Action
Here's the experiment worth running in Paperstack: sketch Traffic → App → Cache → Database. Run the simulation. Watch which nodes go red.
With cache memory well below workingSet, most reads miss and forward to the DB. The DB stays red. The cache stays green: it has throughput headroom, it's just not absorbing much.
Now increase the cache memory past the working-set size. The hit rate climbs. Fewer reads reach the DB. At some threshold the DB color shifts from red to orange to green — and a different node becomes the bottleneck. Maybe the App Server. Maybe the cache's own throughput cap.
This is what Paperstack calls the comparative: when you change one variable, the bottleneck doesn't disappear — it moves. That relocation is the lesson. Scaling the cache fixed your DB problem and revealed your next one.
The inverse is just as instructive. Remove the Cache node and rerun. Watch the DB immediately redline. This is how you build intuition for what a cache is actually doing — not by reading about hit rates, but by watching the utilization delta before and after.
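The moving bottleneck in this section can be reproduced with back-of-the-envelope arithmetic. The sketch below is an illustration, not Paperstack's engine: the `Node` class, the capacities, and the traffic numbers are all made up, and it assumes cache-aside behavior (reads pass through the cache, writes go straight to the DB).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity_rps: float  # requests per second the node can serve (assumed)


def utilizations(read_rps, write_rps, hit_rate, app, cache, db):
    """Return {node name: utilization} for Traffic -> App -> Cache -> Database.

    Assumes cache-aside: every read visits the cache, misses continue to the
    DB, and writes bypass the cache and hit the DB directly.
    """
    loads = {
        app.name: read_rps + write_rps,
        cache.name: read_rps,
        db.name: read_rps * (1.0 - hit_rate) + write_rps,
    }
    caps = {node.name: node.capacity_rps for node in (app, cache, db)}
    return {name: loads[name] / caps[name] for name in loads}


def bottleneck(util):
    """Name of the node with the highest utilization."""
    return max(util, key=util.get)


# Hypothetical capacities, chosen only to make the shift visible.
app = Node("App", 12_000)
cache = Node("Cache", 50_000)
db = Node("Database", 3_000)

if __name__ == "__main__":
    # hit_rate 0.0 stands in for "no cache"; higher values for a larger cache.
    for hit_rate in (0.0, 0.5, 0.9):
        util = utilizations(9_000, 1_000, hit_rate, app, cache, db)
        summary = ", ".join(f"{name}={value:.2f}" for name, value in util.items())
        print(f"hit_rate={hit_rate}: {summary} -> bottleneck: {bottleneck(util)}")
```

With these numbers the Database is the hottest node at low hit rates, and the App node takes over once the cache absorbs most reads. The point is the relocation, not the specific values.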
When Write Policy Matters
Paperstack exposes three write patterns on the Cache node: Cache-aside, Write-through, and Write-behind. The choice affects both latency and what happens when you kill the cache node.
Cache-aside (the default) separates reads from writes entirely. Reads check the cache first; misses go to the DB. Writes bypass the cache and go directly to the DB. The cache is populated on read-miss, not on write. Kill the cache: reads start missing entirely, DB load spikes, but the write path was already going to DB — no disruption there.
Write-through keeps the cache and DB in sync on every write. Writes pay the cache's latency and the DB's latency on the write path, making writes more expensive than cache-aside. Kill the cache: reads fall through to DB, but every write was already reaching the DB, so nothing is lost.
Write-behind is where the kill scenario gets interesting. In this mode, as Paperstack models it, the cache absorbs writes entirely: they never reach the DB during normal operation. Only read-misses reach the DB. The DB is effectively shielded from the write load.
Kill the cache node in write-behind mode: Paperstack's passThroughOnKill behavior makes the cache transparent — all traffic falls straight through. The DB suddenly receives the write workload that was never reaching it before. If the DB was sized assuming writes were handled by the cache, it may not have the writeCap headroom to absorb the sudden change. The simulation shows this directly as DB utilization spiking and requests dropping.
This failure mode is invisible on a static architecture diagram. The diagram shows cache → DB regardless of write policy. The simulation shows what breaks.
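The three write policies and the kill scenario reduce to a small table of "what reaches the DB". The sketch below encodes the behavior as this article describes it, not Paperstack's source. The function `db_load` and its parameters are hypothetical, and the write-behind branch follows the article's simplification that writes stay off the DB while the cache is healthy (a production write-behind cache would still flush them asynchronously).

```python
POLICIES = ("cache-aside", "write-through", "write-behind")


def db_load(read_rps, write_rps, hit_rate, policy, cache_alive=True):
    """Return (db_reads, db_writes) in requests per second."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    if not cache_alive:
        # Pass-through on kill: the cache is transparent, the DB sees everything.
        return float(read_rps), float(write_rps)
    db_reads = read_rps * (1.0 - hit_rate)
    # Article's simplification: write-behind keeps writes off the DB entirely.
    db_writes = 0.0 if policy == "write-behind" else float(write_rps)
    return db_reads, db_writes


if __name__ == "__main__":
    for policy in POLICIES:
        alive = db_load(8_000, 2_000, 0.75, policy)
        killed = db_load(8_000, 2_000, 0.75, policy, cache_alive=False)
        print(f"{policy}: healthy reads/writes={alive}, killed reads/writes={killed}")
```

In all three modes the read load on the DB jumps when the cache dies. Only write-behind also adds a write load the DB was not carrying before, which is why it is the one that can exceed writeCap.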
Conclusion
The cache-doesn't-help problem is usually a mismatch between memory and working set, not a configuration error. Once hit rate is computed from real inputs rather than typed in, the DB utilization behavior makes sense.
Paperstack makes the relationship between working-set size, cache memory, write policy, and DB utilization visible without deploying anything. Sketch the architecture, tune the numbers, kill nodes, and watch the bottleneck move. When the DB finally turns green, you know exactly why.
Try it at Paperstack — it runs in the browser, no account needed.
Key Takeaways
- Cache hit rate emerges from memory vs workingSet and access skew; it's computed, not configured. Undersizing cache memory caps hit rate regardless of traffic volume.
- The comparative (change one variable, watch the bottleneck move) is how you build cache intuition: adding cache makes the DB green, revealing the next bottleneck.
- LFU eviction benefits high-skew workloads (popular keys stay warm); LRU does not. skew only matters with the right eviction policy.
- Write-behind shields the DB from writes during normal operation; kill the cache and the DB suddenly receives the write load it was never sized for.
- Write-through and Cache-aside are safe to kill (writes were already reaching DB); write-behind changes the DB's workload profile on failure.