<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Preshy-Jones</title>
    <description>The latest articles on DEV Community by Preshy-Jones (@preshyjones).</description>
    <link>https://dev.to/preshyjones</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F359202%2Fd0a68d59-028f-4329-aecc-44080488e2ac.png</url>
      <title>DEV Community: Preshy-Jones</title>
      <link>https://dev.to/preshyjones</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/preshyjones"/>
    <language>en</language>
    <item>
      <title>Caching in Payment Systems</title>
      <dc:creator>Preshy-Jones</dc:creator>
      <pubDate>Thu, 30 Apr 2026 17:51:23 +0000</pubDate>
      <link>https://dev.to/preshyjones/caching-in-payment-systems-24c7</link>
      <guid>https://dev.to/preshyjones/caching-in-payment-systems-24c7</guid>
      <description>&lt;p&gt;Every backend engineer has heard the advice: add a cache. It will make your system faster. And they are right, it will. But a cache that is not properly understood is one of the most effective ways to take down your entire production system, including the database it was supposed to protect.&lt;br&gt;
This article covers everything you need to know about caching in a payment system: what it is, how it works, the patterns that govern it, the failure modes that will wake you up at 2 a.m., and the decisions that separate engineers who use a cache from engineers who understand it.&lt;/p&gt;

&lt;h2&gt;What a cache actually is&lt;/h2&gt;

&lt;p&gt;A cache is a temporary storage layer that holds copies of data in memory so that future requests for that data can be served faster. The operative word is temporary. A cache is not a database. It is not your source of truth. It is a shortcut.&lt;br&gt;
Three properties define a cache, and everything else follows from them:&lt;br&gt;
Temporary: cached data has a finite lifespan and will eventually be deleted&lt;br&gt;
In memory: the cache lives in RAM, not on disk. RAM access takes nanoseconds. Disk access takes milliseconds. That is a speed difference of several orders of magnitude.&lt;br&gt;
A copy: the real data lives in your database. The cache holds a duplicate for fast retrieval.&lt;/p&gt;

&lt;p&gt;In a payment platform, database traffic breaks down roughly as:&lt;br&gt;
95% reads: balance checks, transaction history, KYC status lookups, fraud checks&lt;br&gt;
5% writes: new transactions, status updates, profile changes&lt;/p&gt;

&lt;p&gt;Running all of this through one database means reads and writes compete for the same CPU, memory, and disk I/O. A heavy reconciliation job scanning millions of rows can starve your live payment processing of resources. Cache solves this by absorbing the read traffic before it ever reaches the database.&lt;/p&gt;

&lt;h2&gt;The cache-aside pattern&lt;/h2&gt;

&lt;p&gt;The most common caching strategy is cache-aside, also called lazy loading. The flow has three steps:&lt;br&gt;
Web server receives a request&lt;br&gt;
Server checks cache first. If data is there (cache hit), return it immediately. Database never touched.&lt;br&gt;
If data is not in cache (cache miss), query the database, store the result in cache, return to client.&lt;/p&gt;
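
&lt;p&gt;To make the flow concrete, here is a minimal cache-aside read in TypeScript with ioredis. The fetchKycStatusFromDb helper is a hypothetical stand-in for your real data access layer:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import Redis from 'ioredis';

const cache = new Redis(); // connects to localhost:6379 by default

// Hypothetical database lookup; replace with your real query.
async function fetchKycStatusFromDb(userId: string): Promise&lt;string&gt; {
  return 'VERIFIED';
}

async function getKycStatus(userId: string): Promise&lt;string&gt; {
  const key = `user:${userId}:kyc_status`;

  // 1. Check the cache first. On a hit, the database is never touched.
  const cached = await cache.get(key);
  if (cached !== null) return cached;

  // 2. Cache miss: query the database.
  const status = await fetchKycStatusFromDb(userId);

  // 3. Store the result with a TTL, then return it.
  await cache.set(key, status, 'EX', 21600); // 6 hours
  return status;
}
&lt;/code&gt;&lt;/pre&gt;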

&lt;p&gt;The name lazy loading comes from the fact that data is only loaded into cache when someone actually requests it, not proactively. Your cache gradually warms up as users make requests. Cold cache after a restart means more database hits until the cache repopulates.&lt;br&gt;
In Redis, the implementation is straightforward:&lt;br&gt;
cache.set('user:USR_001:kyc_status', 'VERIFIED', 'EX', 21600)&lt;br&gt;
cache.get('user:USR_001:kyc_status')&lt;/p&gt;

&lt;p&gt;The three components of every cache write are the key (the unique identifier), the value (the data being stored), and the expiry (how long before it is automatically deleted). That expiry is called the TTL, Time To Live.&lt;/p&gt;

&lt;h2&gt;TTL: the expiration timer that governs everything&lt;/h2&gt;

&lt;p&gt;TTL is the number of seconds before a cached value is automatically deleted. Once a key expires, the next request for it is a cache miss, triggering a fresh database query. The result is stored again with a new TTL and the cycle continues.&lt;br&gt;
Setting the right TTL for each piece of data is one of the most important caching decisions you make. Both extremes are wrong:&lt;br&gt;
TTL too short: cache expires constantly, every request goes to the database, you have gained nothing from having a cache&lt;br&gt;
TTL too long: cached data becomes stale, users see outdated information, which in fintech means showing wrong balances or transaction states&lt;/p&gt;

&lt;p&gt;The right question for every piece of data is: if a user sees this value X seconds after it was written, does anything bad happen? That answer determines your TTL.&lt;br&gt;
In a payment platform, different data has very different tolerance for staleness:&lt;br&gt;
Transaction fee configuration: changes monthly. TTL of 1 hour is safe.&lt;br&gt;
User KYC status: changes rarely. TTL of 6 hours is acceptable.&lt;br&gt;
User session token: security-sensitive. TTL of 30 minutes maximum.&lt;br&gt;
OTP codes: single use, expires in minutes by design. TTL of 5 minutes.&lt;br&gt;
Wallet balance for display: changes on every transaction. TTL of 10 seconds at most.&lt;br&gt;
Wallet balance for authorisation decisions: never cache. Always read from the primary database.&lt;/p&gt;

&lt;p&gt;The most important rule in fintech caching: never use a cached value to authorise a financial transaction. Cache is for display. Authorisation always reads from the primary database.&lt;/p&gt;
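
&lt;p&gt;One way to keep these decisions visible in code is a single table of TTL constants. The values below simply mirror the list above; treat them as a sketch, not a prescription:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// TTLs in seconds, mirroring the staleness tolerances above.
const TTL_SECONDS = {
  feeConfig: 3600,      // changes monthly; 1 hour is safe
  kycStatus: 21600,     // changes rarely; 6 hours
  sessionToken: 1800,   // security-sensitive; 30 minutes maximum
  otpCode: 300,         // single use; 5 minutes by design
  balanceDisplay: 10,   // display only; 10 seconds at most
  // Deliberately no entry for authorisation balances: never cached.
} as const;
&lt;/code&gt;&lt;/pre&gt;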

&lt;h2&gt;Cache consistency: the invisible problem&lt;/h2&gt;

&lt;p&gt;When a user updates their data, you update the database. Then you update the cache. These are two separate operations. They cannot be wrapped in a single atomic transaction the way two database writes can be.&lt;br&gt;
If the database update succeeds but the cache update fails, your cache now holds stale data. Every request that hits cache gets the wrong answer. And it stays wrong until the TTL expires and forces a fresh database read.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this is harder than it sounds&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a microservices architecture, multiple services may write to the same database tables. If Service A updates a user's account tier and Service B has that tier cached, Service B will serve stale data until its TTL expires, regardless of what Service A did.&lt;br&gt;
This window where the database and cache disagree is called an inconsistency window. For most data it is acceptable and self-healing. For financial data it is unacceptable, which is exactly why wallet balances used for authorisation must never come from cache.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The consistency strategies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Write-through: update the database and the cache in the same operation. If either fails, you retry until both succeed. Keeps cache consistent but adds latency to every write.&lt;br&gt;
Cache invalidation: when data changes, delete the cache key entirely rather than updating it. The next read will be a cache miss and will fetch fresh data. Simpler than write-through, but creates a spike of cache misses immediately after updates.&lt;br&gt;
TTL-based expiry: accept that the cache will be stale for up to X seconds and set your TTL accordingly. The simplest approach and perfectly valid for data where short-term staleness is acceptable.&lt;br&gt;
Facebook published a paper called Scaling Memcache at Facebook that remains one of the most important pieces of engineering writing on cache consistency at scale. It is worth reading once you have the foundations of systems design in place.&lt;/p&gt;
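
&lt;p&gt;Of the three strategies, invalidation is often the default choice. A minimal sketch in TypeScript with ioredis, where updateTierInDb is a hypothetical stand-in for the real database write:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import Redis from 'ioredis';

const cache = new Redis();

// Hypothetical database write; replace with your real update path.
async function updateTierInDb(userId: string, tier: string): Promise&lt;void&gt; {}

async function updateAccountTier(userId: string, tier: string): Promise&lt;void&gt; {
  // 1. Update the source of truth first.
  await updateTierInDb(userId, tier);

  // 2. Delete the cache key rather than updating it. The next read
  //    is a cache miss and repopulates the cache with fresh data.
  await cache.del(`user:${userId}:tier`);
}
&lt;/code&gt;&lt;/pre&gt;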

&lt;h2&gt;Cache as a single point of failure&lt;/h2&gt;

&lt;p&gt;A single cache server is a single point of failure. If that one Redis instance goes down, every request that would have been served by cache is now a cache miss. Every cache miss becomes a database query. If your database was sized to handle, say, 5,000 queries per second with cache absorbing everything else, it suddenly receives 50,000 queries per second.&lt;br&gt;
The database was not built for this. It struggles. It dies. Your entire system is down.&lt;br&gt;
This is called a cache avalanche. Redis did not just fail. It took your database with it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How cache avalanche happens step by step&lt;/strong&gt;&lt;br&gt;
Redis goes offline&lt;br&gt;
Every cached key is gone&lt;br&gt;
Every incoming request is a cache miss&lt;br&gt;
Every cache miss becomes a database query&lt;br&gt;
Database receives a sudden flood of traffic it was never designed to handle&lt;br&gt;
Database CPU spikes to 100%&lt;br&gt;
Database response times grow to seconds&lt;br&gt;
Connection pool exhausted&lt;br&gt;
Database dies&lt;br&gt;
Total system outage&lt;/p&gt;

&lt;p&gt;This entire sequence can happen in under 60 seconds. Faster than any engineer can manually intervene.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three things that prevent cache avalanche&lt;/strong&gt;&lt;br&gt;
First: run Redis in cluster mode across multiple availability zones. A cluster distributes your keys across multiple nodes, each with its own replica in a separate availability zone. When one node dies, its replica is promoted automatically. Your application continues serving cache hits without interruption. No avalanche.&lt;br&gt;
Second: overprovision memory. Never run Redis above 60% capacity. That 40% buffer absorbs two specific scenarios. Traffic spikes: when unexpected load brings more data to cache, Redis has room to absorb it without immediately evicting existing keys. Node failure redistribution: when one node dies, its keys redistribute to surviving nodes. If those nodes are at 90% capacity, the redistribution causes immediate mass eviction and a flood of cache misses. At 60% capacity, surviving nodes absorb the redistribution without evicting anything.&lt;br&gt;
Third: stagger your TTLs with random jitter. If you cache 100,000 keys all with a TTL of 3,600 seconds and they all expire at the same time, you get 100,000 simultaneous cache misses. Every one of them hits the database in the same second. To prevent this, add a random offset to each TTL:&lt;br&gt;
const ttl = 3600 + Math.floor(Math.random() * 300); // expire between 60 and 65 minutes&lt;/p&gt;

&lt;p&gt;Misses now spread across a 5-minute window instead of hitting simultaneously. The database handles them at a manageable rate.&lt;/p&gt;
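
&lt;p&gt;A small helper makes the jitter reusable across every cache write; a sketch, with TTLs in seconds:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Adds up to jitterSeconds of random offset to a base TTL so keys
// written together do not all expire in the same second.
function jitteredTtl(baseSeconds: number, jitterSeconds: number = 300): number {
  return baseSeconds + Math.floor(Math.random() * jitterSeconds);
}

// Usage: each key expires somewhere between 60 and 65 minutes from now.
// cache.set(key, value, 'EX', jitteredTtl(3600));
&lt;/code&gt;&lt;/pre&gt;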

&lt;h2&gt;Redis cluster: how data distributes across nodes&lt;/h2&gt;

&lt;p&gt;When you run Redis in cluster mode, data does not just randomly distribute across nodes. Redis uses a deterministic system called hash slots to decide exactly which node stores which key.&lt;br&gt;
&lt;strong&gt;What hash slots are&lt;/strong&gt;&lt;br&gt;
Redis pre-divides the entire key space into exactly 16,384 slots, numbered 0 to 16,383. This number is fixed regardless of how many nodes you have. Every possible key maps to exactly one slot via a hash function:&lt;br&gt;
slot = CRC16(key) % 16384&lt;/p&gt;

&lt;p&gt;CRC16 is a standard algorithm that converts any string into a number. The same key always produces the same slot number. This determinism is what makes routing fast and reliable.&lt;br&gt;
&lt;strong&gt;How slots distribute across nodes&lt;/strong&gt;&lt;br&gt;
With three primary nodes, the 16,384 slots divide roughly equally:&lt;br&gt;
Primary Node 1 owns slots 0 to 5,460&lt;br&gt;
Primary Node 2 owns slots 5,461 to 10,922&lt;br&gt;
Primary Node 3 owns slots 10,923 to 16,383&lt;/p&gt;

&lt;p&gt;When you write a key, the Redis client library hashes it, calculates the slot, and sends the write directly to the node that owns that slot. No searching. No broadcasting. Pure math tells the client exactly where to go.&lt;/p&gt;
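
&lt;p&gt;For intuition, here is a sketch of the slot calculation in TypeScript. It implements CRC16-CCITT (XMODEM), the variant Redis Cluster uses; it assumes ASCII keys and ignores Redis's hash-tag rule for {...} substrings:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000.
function crc16(key: string): number {
  let crc = 0x0000;
  for (let i = 0; i &lt; key.length; i++) {
    crc ^= key.charCodeAt(i) &lt;&lt; 8; // assumes single-byte (ASCII) characters
    for (let bit = 0; bit &lt; 8; bit++) {
      crc = crc &amp; 0x8000 ? ((crc &lt;&lt; 1) ^ 0x1021) &amp; 0xffff : (crc &lt;&lt; 1) &amp; 0xffff;
    }
  }
  return crc;
}

function hashSlot(key: string): number {
  return crc16(key) % 16384;
}

console.log(hashSlot('user:USR_001')); // same key, same slot, every time
&lt;/code&gt;&lt;/pre&gt;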

&lt;p&gt;&lt;strong&gt;What happens when a client asks the wrong node&lt;/strong&gt;&lt;br&gt;
If a client sends a request to the wrong node, that node responds with a MOVED redirect:&lt;br&gt;
Client: GET user:USR_001  (sent to Node 1 by mistake)&lt;br&gt;
Node 1:  MOVED 5789 node2:7001&lt;br&gt;
Client: GET user:USR_001  (resent to Node 2)&lt;br&gt;
Node 2:  returns the data&lt;/p&gt;

&lt;p&gt;The client updates its local slot map so future requests go directly to the correct node. Cluster-aware Redis client libraries like Jedis (Java) or ioredis (Node.js) maintain this map automatically.&lt;/p&gt;
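
&lt;p&gt;Connecting through a cluster-aware client is mostly configuration. A sketch with ioredis; the hostnames and ports are placeholders:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import { Cluster } from 'ioredis';

// The client learns the full slot map from any seed node, then routes
// each command directly to the node that owns the key's slot. MOVED
// redirects during failover or resharding are handled transparently.
const cluster = new Cluster([
  { host: 'node1', port: 7000 },
  { host: 'node2', port: 7001 },
  { host: 'node3', port: 7002 },
]);

cluster
  .set('user:USR_001:kyc_status', 'VERIFIED', 'EX', 21600)
  .then(() =&gt; cluster.quit());
&lt;/code&gt;&lt;/pre&gt;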

&lt;p&gt;&lt;strong&gt;Who assigns the hash slots&lt;/strong&gt;&lt;br&gt;
Hash slot assignment happens once at cluster creation time. The redis-cli --cluster create command divides the 16,384 slots equally across your primary nodes and stores the assignment on every node. Every node in the cluster holds a complete copy of the slot map. There is no central coordinator. Any node can redirect any client to the right place.&lt;/p&gt;
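
&lt;p&gt;For reference, creating a three-primary, three-replica cluster with the stock tooling looks like this (hostnames and ports are placeholders):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;redis-cli --cluster create \
  node1:7000 node2:7001 node3:7002 \
  node1:7003 node2:7004 node3:7005 \
  --cluster-replicas 1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The --cluster-replicas 1 flag gives each primary one replica, and redis-cli divides the 16,384 slots across the primaries.&lt;/p&gt;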

&lt;h2&gt;Eviction policies: what happens when cache is full&lt;/h2&gt;

&lt;p&gt;Redis has a fixed memory limit. When that limit is reached and new data needs to be stored, Redis must delete something to make room. The eviction policy controls which keys get deleted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LRU: Least Recently Used&lt;/strong&gt;&lt;br&gt;
Delete whichever key has not been accessed for the longest time. The assumption is that recently accessed data will likely be accessed again soon. Data that has not been touched in hours is probably not hot.&lt;br&gt;
LRU is the most commonly used eviction policy. Note that stock Redis actually defaults to noeviction (writes fail once memory is full); you opt into LRU by setting a maxmemory-policy such as allkeys-lru. Once eviction is enabled, LRU works well because of a natural property of real-world access patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The 80/20 rule in cache access&lt;/strong&gt;&lt;br&gt;
In any application with many users and many pieces of data, a small minority of keys get requested constantly while the vast majority are rarely touched. In a payment platform:&lt;br&gt;
Transaction fee configurations: read on every single transaction. Extremely hot.&lt;br&gt;
Active user KYC status: read multiple times per session for active users. Hot.&lt;br&gt;
System configuration values: read constantly by all services. Hot.&lt;br&gt;
Transaction history for users who logged in once six months ago: cold.&lt;br&gt;
Profiles of dormant accounts: cold.&lt;/p&gt;

&lt;p&gt;Roughly 20% of your keys account for 80% of your cache reads. LRU naturally protects this hot 20% because those keys are accessed constantly, keeping their last-accessed timestamps recent. LRU almost never evicts them; it evicts the cold 80%, which is exactly the right behaviour.&lt;br&gt;
&lt;strong&gt;LFU: Least Frequently Used&lt;/strong&gt;&lt;br&gt;
Delete whichever key has been accessed the fewest total times, regardless of recency. Better than LRU when some data is genuinely hot long-term and you do not want it evicted just because it was not accessed in the last few minutes. More complex to implement and requires Redis 4.0 or later.&lt;br&gt;
&lt;strong&gt;FIFO: First In First Out&lt;/strong&gt;&lt;br&gt;
Delete whichever key was stored in the cache earliest, regardless of access frequency or recency. Simplest to implement, worst performing in practice. Evicts based on age not utility. Rarely the right choice for production systems.&lt;br&gt;
For most production payment systems, LRU with 60% memory provisioning is the correct starting configuration. Change it only when you have data showing a different pattern in your specific workload.&lt;/p&gt;
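
&lt;p&gt;Concretely, that starting point is two directives in redis.conf. The values here are illustrative; size maxmemory to roughly 60% of the node's RAM:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# redis.conf (illustrative values)
maxmemory 6gb                  # e.g. on a node with 10 GB available to Redis
maxmemory-policy allkeys-lru   # evict the least recently used key, any key
&lt;/code&gt;&lt;/pre&gt;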

&lt;h2&gt;When NOT to cache&lt;/h2&gt;

&lt;p&gt;The question of what to cache gets a lot of attention. The question of what never to cache gets almost none. In a payment system, this is where the real discipline lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Never cache wallet balances used for authorisation&lt;/strong&gt;&lt;br&gt;
A cached balance that is 10 seconds stale is fine for displaying to a user on their dashboard. It is catastrophic for authorising a debit. If a user makes two simultaneous transfers and both are authorised against a stale cached balance, you have a double spend. The authorisation check for money movement always reads from the primary database. Always.&lt;/p&gt;
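
&lt;p&gt;A sketch of what that looks like with node-postgres, against a hypothetical wallets table. The row lock serialises concurrent debits, so two simultaneous transfers cannot both pass the check against the same balance:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import { Pool } from 'pg';

const pool = new Pool(); // connection settings via standard PG* env vars

async function authoriseDebit(walletId: string, amount: number): Promise&lt;boolean&gt; {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    // Lock the wallet row: concurrent debits queue up behind this lock.
    const { rows } = await client.query(
      'SELECT balance FROM wallets WHERE id = $1 FOR UPDATE',
      [walletId],
    );
    if (rows.length === 0 || Number(rows[0].balance) &lt; amount) {
      await client.query('ROLLBACK');
      return false; // insufficient funds, read from the source of truth
    }
    await client.query(
      'UPDATE wallets SET balance = balance - $1 WHERE id = $2',
      [amount, walletId],
    );
    await client.query('COMMIT');
    return true;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}
&lt;/code&gt;&lt;/pre&gt;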

&lt;p&gt;&lt;strong&gt;Never cache OTP codes as the sole source of truth&lt;/strong&gt;&lt;br&gt;
OTPs must be verified against the value stored in the database or a dedicated time-based generation system. A cached OTP that persists beyond its intended expiry due to a TTL misconfiguration is a security vulnerability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Never cache active transaction status&lt;/strong&gt;&lt;br&gt;
A transaction in flight has a status that changes rapidly: initiated, processing, authorised, settled, failed. Serving a cached status of processing for a transaction that has already failed or settled creates a confusing and potentially dangerous user experience. Active transaction status always comes from the database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Never treat cache as your only storage&lt;/strong&gt;&lt;br&gt;
If Redis restarts, all in-memory data is gone. Any data that exists only in Redis and nowhere else is permanently lost. Every piece of data in your cache must have its source of truth in a persistent store. Cache is a copy. The database is the original.&lt;/p&gt;

&lt;h2&gt;Replication is not a backup&lt;/h2&gt;

&lt;p&gt;This applies equally to cache and databases and is worth stating explicitly. Redis replication keeps a copy of your data on a replica node. If the primary dies, the replica is promoted and cache continues serving requests.&lt;br&gt;
But replication copies everything, including mistakes. If a bug in your application writes corrupted data to Redis, that corruption replicates to every replica. Replication does not protect you from data corruption. It protects you from infrastructure failure.&lt;br&gt;
For cache this is less critical because cache data is always regenerable from the database. But the mental model matters: replication is availability protection, not data protection.&lt;/p&gt;

&lt;h2&gt;Summary: the caching mental model for payment systems&lt;/h2&gt;

&lt;p&gt;Cache is a temporary, in-memory copy of data from your database. It is not your source of truth.&lt;br&gt;
Use cache for data that is read frequently but changes infrequently.&lt;br&gt;
Set TTLs appropriate to how stale each piece of data can be. Add random jitter to prevent simultaneous expiry.&lt;br&gt;
Cache and database can become inconsistent. Design for this. Never use cached values for authorisation decisions.&lt;br&gt;
A single cache instance is a single point of failure. Run in cluster mode across multiple availability zones.&lt;br&gt;
Never run Redis above 60% capacity. The buffer protects you during spikes and node failure redistribution.&lt;br&gt;
Cache avalanche is real. Redis dying can take your database with it if you have not planned for it.&lt;br&gt;
LRU eviction with memory overprovisioning is the correct default for most payment systems.&lt;br&gt;
Wallet balances for authorisation, OTPs, and active transaction status should never come from cache.&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>fintech</category>
      <category>distributedsystems</category>
      <category>redis</category>
    </item>
  </channel>
</rss>
