The CAP theorem says a distributed system can guarantee at most two of three properties: Consistency, Availability, and Partition tolerance. Pick any two. Not three. Not “mostly three.” Two. The internet doesn’t care about your optimism.
Now let’s talk about read replicas, because this is where confusion usually starts.
The real-world story
Imagine you run an e-commerce app. Nothing fancy. Users browse products, add to cart, place orders. You have one main database. Life is peaceful. Traffic grows. Database starts sweating. So you do the obvious grown-up thing: add read replicas.
One primary database handles writes. Multiple read replicas handle reads. Suddenly your system feels faster. You feel smarter. LinkedIn post draft opens automatically.
Then a user places an order and immediately refreshes the order page. Surprise: the order is missing. Panic. Bug? Race condition? Database haunted?
No. Welcome to CAP.
What just happened
The write went to the primary database.
The read went to a replica.
Replication is asynchronous, because synchronous replication would add latency to every write and make writes fail whenever a replica is unreachable.
So for a short time:
- Primary has the new order
- Replica does not
This is eventual consistency in action. The system chose:
- Availability: reads always work
- Partition tolerance: network issues won’t kill you
- And it sacrificed strong consistency
Read replicas don’t break CAP. They make a choice inside CAP.
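To make the missing-order moment concrete, here is a minimal in-memory sketch. The `Primary` and `Replica` classes and the explicit `replicate()` call are made up for illustration; a real database ships its replication log in the background, but the window between the write and the catch-up is the same idea.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A read replica that only sees writes once they have been applied."""
    data: dict = field(default_factory=dict)

    def read(self, key):
        return self.data.get(key)


@dataclass
class Primary:
    """A primary that acknowledges writes before replicas have them."""
    data: dict = field(default_factory=dict)
    replicas: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # writes not yet shipped

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))  # queued, not yet on replicas
        return "ok"  # the client gets success right away

    def replicate(self):
        """Ship queued writes to every replica (stands in for background replication)."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


replica = Replica()
primary = Primary(replicas=[replica])

primary.write("order:1001", {"status": "placed"})
stale = replica.read("order:1001")  # replication has not run yet
primary.replicate()
fresh = replica.read("order:1001")  # replica has caught up

print(stale)  # None
print(fresh)  # {'status': 'placed'}
```

The first read is the user's refresh: the write succeeded, but the replica simply has not heard about it yet.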
So what’s the “solution” in read replicas?
Short answer: there is no single solution. There are trade-offs, and you pick based on business needs, not ego.
Long answer, explained like a normal human.
Option 1: Accept eventual consistency (most common)
This is what companies like Amazon, Netflix, and Instagram do for many flows.
You allow stale reads for a short time.
You design UI and flows knowing data may lag.
Examples:
- Show “Order placed, updating details…” message
- Refresh after a few seconds
- Avoid instant read-after-write expectations
This gives you:
- High availability
- Massive scale
- Calm databases
You lose:
- Immediate correctness everywhere
This is not laziness. This is engineering maturity.
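One way to design a flow around lag, as a rough sketch: retry the replica read a few times, and if the order still is not there, show a pending message instead of an error. The `order_page` function, its parameters, and the `lagging_replica` stand-in are all hypothetical names for this example.

```python
import time


def order_page(read_replica, order_id, attempts=3, delay_seconds=0.0):
    """Return what the order page should show, tolerating replica lag."""
    for _ in range(attempts):
        order = read_replica(order_id)
        if order is not None:
            return {"state": "ready", "order": order}
        time.sleep(delay_seconds)  # give replication time to catch up
    # Still missing: show a friendly pending message, not an error.
    return {"state": "pending", "message": "Order placed, updating details…"}


# Hypothetical replica that only has the order from the third read onward.
calls = {"count": 0}


def lagging_replica(order_id):
    calls["count"] += 1
    if calls["count"] >= 3:
        return {"id": order_id, "status": "placed"}
    return None


caught_up = order_page(lagging_replica, "1001")
never_there = order_page(lambda order_id: None, "1002", attempts=2)
```

In a real app `delay_seconds` would be a second or two and the retry would likely live in the client, but the shape is the same: lag is an expected state with its own UI, not an exception.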
Option 2: Read-your-writes consistency (smart routing)
For some critical flows, you cheat a little.
If a user just wrote data:
- Route their next read to the primary
- Or use a session flag like “read from primary for 5 seconds”
This is common in:
- Banking dashboards
- Order confirmation pages
- Profile update screens
Now you get:
- Consistency where it matters
- Availability everywhere else
Downside:
- More complexity
- Primary database gets more load
Still worth it.
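A minimal sketch of that routing, assuming a per-session timestamp of the last write. `ReadYourWritesRouter` and its method names are invented for this example, and the clock is injectable only so the demo is deterministic.

```python
import time


class ReadYourWritesRouter:
    """Send a session's reads to the primary for a short window after it writes."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock
        self._last_write = {}  # session_id -> time of that session's last write

    def record_write(self, session_id):
        self._last_write[session_id] = self.clock()

    def target_for_read(self, session_id):
        last = self._last_write.get(session_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"  # this session wrote recently
        return "replica"  # everyone else keeps using replicas


# A fake clock so the example does not depend on real time.
now = [100.0]
router = ReadYourWritesRouter(pin_seconds=5.0, clock=lambda: now[0])

router.record_write("alice")
right_after = router.target_for_read("alice")  # inside the window
other_user = router.target_for_read("bob")  # never wrote
now[0] += 6.0
later = router.target_for_read("alice")  # window has expired
```

Only the user who just wrote pays the cost of hitting the primary. Note that a fixed window is a heuristic: if replica lag ever exceeds it, that user can still see stale data.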
Option 3: Synchronous replication (rare, expensive)
You force replicas to confirm writes before success.
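Now every acknowledged write is already on the replicas, so reads are consistent (as long as replicas apply the write before confirming it).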
You also get:
- Higher latency
- Lower availability
- System crying during network hiccups
Used only when:
- Incorrect data is worse than downtime
- Think financial ledgers, not social feeds
Most systems wisely avoid this.
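Here is the same trade in toy form, as a sketch under simplifying assumptions. `SyncPrimary`, `SyncReplica`, and the `reachable` flag are hypothetical; real systems use acknowledgements over the network and a commit protocol rather than a boolean check.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot acknowledge a write."""


class SyncReplica:
    def __init__(self):
        self.data = {}
        self.reachable = True  # flip to False to simulate a network partition

    def apply(self, key, value):
        self.data[key] = value


class SyncPrimary:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        # Simplification: check reachability up front so a rejected write
        # leaves no partial state behind.
        if any(not replica.reachable for replica in self.replicas):
            raise ReplicaUnavailable("a replica did not acknowledge the write")
        for replica in self.replicas:
            replica.apply(key, value)
        self.data[key] = value
        return "ok"  # success only after every replica has the write


replicas = [SyncReplica(), SyncReplica()]
primary = SyncPrimary(replicas)

primary.write("balance:42", 100)
consistent = all(r.data.get("balance:42") == 100 for r in replicas)

replicas[1].reachable = False  # one replica drops off the network
try:
    primary.write("balance:42", 50)
    rejected = False
except ReplicaUnavailable:
    rejected = True
```

Every replica agrees after a successful write, and a single unreachable replica is enough to reject the next one. That is consistency bought with availability.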
The key realization
Read replicas don’t solve CAP.
They let you choose where you want to bend.
CAP is not a bug to fix.
It’s gravity.
Good systems don’t fight it. They design around it.
The adult takeaway
- If you want scale, you accept eventual consistency
- If you want correctness, you sacrifice availability
- If you want both, you add complex routing logic
- If you want all three, you’re selling a course, not building software
Read replicas are not magic.
They’re a very honest trade.
And honesty is rare in distributed systems.
References (the boring but trustworthy kind)
Eric Brewer, “CAP Twelve Years Later: How the Rules Have Changed”
https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed/
Martin Kleppmann, Designing Data-Intensive Applications
https://dataintensive.net/
Amazon Dynamo Paper
https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
Google Cloud Docs on Replication and Consistency
https://docs.cloud.google.com/datastore/docs/concepts/structuring_for_strong_consistency
Netflix Tech Blog on Eventual Consistency
https://netflixtechblog.com/s3mper-consistency-in-the-cloud