Rajkiran

Posted on May 31

System Design - 4. Consistency Models

#systemdesign #distributedsystems #architecture #softwareengineering

Consistency Models Explained: Why Your Bank and Your Twitter Feed Play by Different Rules

The $440 Million Bug
On August 1, 2012, Knight Capital Group — one of the largest market makers on Wall Street — deployed new trading software. Within 45 minutes, their system executed millions of erroneous trades.

The root cause? Inconsistent state across servers. The new code reached seven of Knight's eight servers; the eighth still ran the old software. A flag that the new release had repurposed triggered retired logic on that one server, which kept firing orders the rest of the fleet would never have sent.

By the time humans intervened, Knight Capital had lost $440 million. The company needed emergency rescue financing within days and was later acquired.

This is one of the most expensive consistency failures on record. And it happened because the engineers hadn't thought clearly enough about one question:

When two parts of a distributed system see different data, which one is right — and what should each one do about it?

That question is the heart of consistency models.

What Is Consistency? (The Intuition)
In a single-server system, consistency is trivial. You write a value, you read it back, you get what you wrote. Simple.

In a distributed system — where data is replicated across multiple servers — consistency becomes a genuine engineering challenge.

Imagine two replicas of a database:

Primary DB: balance = $500 (just received a $200 deposit)

Replica DB: balance = $300 (hasn't received the update yet)

User reads from Replica → sees $300

User is confused: "Where's my $200?"

This is an inconsistency — two parts of the system disagree about reality.
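To make the stale read concrete, here is a minimal sketch of asynchronous replication. The `Primary` and `Replica` classes and the `apply_from` method are hypothetical names invented for this illustration, not any real database's API.

```python
from collections import deque


class Primary:
    def __init__(self, balance):
        self.balance = balance
        self.log = deque()  # updates not yet shipped to the replica

    def deposit(self, amount):
        self.balance += amount
        self.log.append(amount)


class Replica:
    def __init__(self, balance):
        self.balance = balance

    def apply_from(self, primary):
        # Drain the primary's pending updates (the "replication" step).
        while primary.log:
            self.balance += primary.log.popleft()


primary = Primary(300)
replica = Replica(300)

primary.deposit(200)
stale_read = replica.balance   # replica has not caught up yet
replica.apply_from(primary)
fresh_read = replica.balance   # replica has now applied the deposit

print(stale_read, fresh_read)  # 300 500
```

The window between `deposit` and `apply_from` is exactly where the confused user lives.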

Consistency models define the rules about when and how updates become visible across a distributed system. The spectrum runs from strongest (everyone always agrees) to weakest (everyone eventually agrees).

The Consistency Spectrum
Think of this as a dial. Turn it toward strong and you get perfect agreement but pay in latency and availability. Turn it toward eventual and you get speed and availability but accept temporary disagreement.

STRONG ◄─────────────────────────────────────► EVENTUAL

Strong → Causal → Read-your-writes → Monotonic reads → Eventual (weakest)

Strong end: slow, expensive, always consistent

Eventual end: fast, cheap, temporarily inconsistent

Let's walk each level.

1. Strong Consistency (Linearizability)
What it means: After a write completes, every subsequent read — from any server, anywhere — returns that written value.

Write: balance = $500 (at 10:00:01)

Any read after 10:00:01, from any server → returns $500

The guarantee: It's as if there's only one copy of the data. All operations appear to happen instantaneously in a global order.

Cost: Every write must be coordinated across replicas before it returns: all of them under fully synchronous replication, or a majority under consensus protocols such as Paxos and Raft. This adds latency. If too many replicas are unreachable, the write blocks.

Who uses it: Banking systems, financial ledgers, inventory management (you can't sell the same seat twice), distributed locks.

Real example: Google Spanner achieves strong consistency globally using TrueTime, a clock API backed by GPS receivers and atomic clocks. It is a genuine engineering marvel, but it's expensive to operate and has higher latency than eventually consistent systems.

2. Causal Consistency
What it means: Operations that are causally related are seen in the right order by everyone. Unrelated operations can be seen in any order.

The intuition: If you post a comment in response to a post, nobody should ever see your comment without first seeing the post you're responding to.

Alice posts: "Anyone want pizza?" (event A)

Bob replies: "Yes! Where?" (event B, caused by A)

Causal consistency guarantees: no server shows B without first showing A.

Unrelated post by Carol can appear in any order relative to A or B.
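One way to picture the guarantee is a replica that holds back a message until everything it depends on is visible. The sketch below assumes each message carries an explicit list of dependency ids; the `CausalReplica` class is hypothetical, and real systems usually track dependencies with vector clocks or similar metadata instead.

```python
class CausalReplica:
    def __init__(self):
        self.visible = []   # message ids in the order they became visible
        self.pending = []   # (msg_id, deps) still waiting on dependencies

    def receive(self, msg_id, deps=()):
        self.pending.append((msg_id, set(deps)))
        self._deliver_ready()

    def _deliver_ready(self):
        # Keep sweeping until no pending message can be released.
        progressed = True
        while progressed:
            progressed = False
            for item in list(self.pending):
                msg_id, deps = item
                if deps <= set(self.visible):
                    self.visible.append(msg_id)
                    self.pending.remove(item)
                    progressed = True


replica = CausalReplica()
replica.receive("B", deps=["A"])        # Bob's reply arrives first
after_reply_only = list(replica.visible)
replica.receive("C")                    # Carol's unrelated post
replica.receive("A")                    # Alice's post finally arrives

print(after_reply_only)  # []
print(replica.visible)   # ['C', 'A', 'B']
```

Bob's reply stays hidden until Alice's post shows up, while Carol's post is free to appear whenever it arrives.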

Who uses it: Collaborative tools, social feeds where causality matters, distributed version control (Git is causally consistent by design).

3. Read-Your-Writes Consistency
What it means: After you write something, you will always read your own write back. Other users might still see stale data, but you won't.

Why it matters: Imagine posting a tweet and then refreshing your profile — and your tweet isn't there. Infuriating. Read-your-writes prevents this.

How it's implemented: Route all reads for a user to the same replica where they last wrote. Or store the write timestamp in a session cookie and only read from replicas that are at least that current.
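The second approach (remember the version of your last write, and only read from replicas that have caught up to it) might look roughly like this. `VersionedReplica` and `Session` are hypothetical names, and a single integer version stands in for the timestamp a real system would keep in the session.

```python
class VersionedReplica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Ignore updates that are older than what we already hold.
        if version > self.version:
            self.version, self.value = version, value


class Session:
    """Client session that remembers the version of its latest write."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.last_written = 0

    def write(self, value):
        new_version = self.primary.version + 1
        self.primary.apply(new_version, value)
        self.last_written = new_version

    def read(self):
        # Use any replica that is at least as fresh as our own last write.
        for replica in self.replicas:
            if replica.version >= self.last_written:
                return replica.value
        return self.primary.value  # nobody has caught up: fall back to primary


primary = VersionedReplica()
lagging = VersionedReplica()
session = Session(primary, [lagging])

session.write("new photo")
seen_by_author = session.read()   # served by the primary
seen_by_others = lagging.value    # a plain replica read is still stale

lagging.apply(primary.version, primary.value)
seen_after_catch_up = session.read()  # now served by the replica
```

The author never sees stale data, while other users reading the lagging replica still can, which is exactly the guarantee described above.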

Who uses it: Most social applications. When you update your profile photo, you see it immediately — even if others see the old one for a few more seconds.

4. Monotonic Reads Consistency
What it means: If you read a value at time T, you will never read an older value at any time after T.

The problem it prevents: Without this guarantee, a user refreshing their feed could see newer posts, then older posts, then newer ones again — depending on which replica serves each request. The feed appears to go backward in time.

Without monotonic reads:

Refresh 1: sees posts up to 3:00pm ✓

Refresh 2: sees posts up to 2:45pm ✗ (went backwards!)

Refresh 3: sees posts up to 3:05pm ✓

With monotonic reads:

Refresh 1: up to 3:00pm ✓

Refresh 2: up to 3:00pm or later ✓ (never goes back)
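A client can enforce this itself by remembering the freshest point it has seen and skipping replicas that are behind it. This is a sketch under simple assumptions: `MonotonicReader` is a hypothetical name, and each replica is represented only by the latest post time it holds, in minutes since midnight.

```python
class MonotonicReader:
    def __init__(self):
        self.high_water = 0  # freshest point this client has already seen

    def read(self, replicas):
        # Try replicas in order and skip any that would move time backwards.
        for latest in replicas:
            if latest >= self.high_water:
                self.high_water = latest
                return latest
        return None  # every candidate is too stale; the caller should retry


reader = MonotonicReader()
refresh_1 = reader.read([900])        # replica at 3:00pm
refresh_2 = reader.read([885, 900])   # 2:45pm replica is skipped
refresh_3 = reader.read([905])        # replica at 3:05pm

print(refresh_1, refresh_2, refresh_3)  # 900 900 905
```

If only the 2:45pm replica were reachable on the second refresh, `read` would return `None` rather than show an older feed.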

Who uses it: Any system where reads come from multiple replicas — databases with read replicas, CDN-cached content.

5. Eventual Consistency (The Weakest)
What it means: If no new writes happen, all replicas will eventually converge to the same value. But in the meantime, different replicas can return different values.

The trade-off: Maximum availability and performance. No coordination required between replicas.
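Convergence can be simulated in a few lines. The sketch assumes a last-write-wins rule and a ring where each replica pulls from its right-hand neighbour once per round; both are illustrative choices, not a description of any particular database.

```python
def gossip_round(states):
    """Each replica merges with its right-hand neighbour, keeping the newer pair."""
    n = len(states)
    return [max(states[i], states[(i + 1) % n]) for i in range(n)]


# Each state is (timestamp, value); only replica 0 has seen the new write.
states = [(2, "v2"), (1, "v1"), (1, "v1"), (1, "v1")]

rounds = 0
while len(set(states)) > 1:   # replicas still disagree
    states = gossip_round(states)
    rounds += 1

print(rounds, states[0])  # 3 (2, 'v2')
```

Between rounds, a read can land on either value depending on which replica answers. Once writes stop, every replica ends up holding the newest one.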

The mental model: Think of DNS. When you change your domain's IP address, it doesn't update everywhere instantly. Resolvers that cached the old record keep serving it until its TTL expires, so some return the old IP and some return the new one. But eventually, anywhere from minutes to about 48 hours depending on the TTL, every resolver agrees. That's eventual consistency.

Who uses it: Social media feeds, shopping cart contents, view/like counts, DNS, CDN caches, any system where brief staleness is acceptable.

Facebook's famous example: When you post something, your friends in different regions might see it at slightly different times. One friend in Mumbai sees it before your friend in São Paulo. Neither is "wrong" — they're just seeing eventually consistent replicas catch up in real time.

Quorum: How You Tune Consistency
Here's one of the most powerful ideas in distributed systems. You can precisely control how consistent your system is by tuning quorum.

Setup: Imagine N = 3 replicas (common in Cassandra, DynamoDB).

The magic formula:

W + R > N → Strong Consistency

Where:

N = number of replicas

W = number of replicas that must confirm a write

R = number of replicas that must respond to a read

Walking through configurations:

W=3, R=1 (strong consistency, writes pay the cost):

Write: must be confirmed by all 3 replicas ← slow writes

Read: only need 1 replica to respond ← fast reads

W + R = 4 > 3 ✓ → Strong consistency

Cost: writes are slow (all 3 must ack)

W=1, R=3 (strong consistency, reads pay the cost):

Write: confirmed by just 1 replica ← fast writes

Read: must check all 3 replicas ← slow reads

W + R = 4 > 3 ✓ → Strong consistency

Cost: reads are slow (must check all 3)

W=2, R=2 (Balanced):

Write: 2 replicas must confirm

Read: 2 replicas must respond

W + R = 4 > 3 ✓ → Strong consistency

Most common production configuration — balanced trade-off

W=1, R=1 (Eventual consistency):

W + R = 2 ≤ 3 → Eventual consistency

Maximum speed, minimum guarantees

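The insight: Quorum lets you tune consistency like a dial. Cassandra exposes it as a per-query consistency level (ONE, QUORUM, ALL), and Riak accepts r and w values per request. DynamoDB hides the replica counts but lets each read choose between eventually consistent and strongly consistent. You choose your trade-off per operation: strong for payment confirmation, eventual for updating a "last seen" timestamp.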
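Why does W + R > N work? Because it forces every read quorum to share at least one replica with every write quorum, so a read always touches a replica that saw the latest completed write. The brute-force check below confirms this for the four configurations above. `quorums_always_overlap` is a helper written for this illustration.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """True if every possible write quorum shares a replica with every read quorum."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


configs = [(3, 1), (1, 3), (2, 2), (1, 1)]
results = {(w, r): quorums_always_overlap(3, w, r) for w, r in configs}

for (w, r), overlap in results.items():
    print(f"W={w}, R={r}: overlap guaranteed = {overlap}")
```

Only W=1, R=1 fails the check: the single replica that took the write and the single replica that serves the read can be different machines.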

Vector Clocks: How Systems Track Causality
When multiple nodes write to the same key concurrently, which write wins? This is the distributed systems equivalent of "two people edited the same Google Doc at the same time."

Vector clocks solve this by tracking the causal history of each write.

How it works:

Each node maintains a counter per node it knows about.

Node A writes: {A:1, B:0, C:0} → value="red"

Node B writes: {A:0, B:1, C:0} → value="blue"

Are these concurrent?

A's clock doesn't include B's write. B's clock doesn't include A's write.

→ They're concurrent writes. Conflict detected.

→ System can merge (if data allows) or flag for user resolution.

Node C reads A's write, then writes:

{A:1, B:0, C:1} → value="green"

This is causally AFTER A's write (includes A:1).

→ No conflict with A. C's write wins over A's.
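The comparison rule behind this walkthrough fits in a short function. This is a minimal sketch that represents each clock as a plain dict of counters; `compare` is a hypothetical helper, not a library call.

```python
def compare(a, b):
    """Return how clock a relates to clock b: before, after, equal, or concurrent."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # a happened after b
    return "concurrent"      # neither includes the other: a conflict


red = {"A": 1, "B": 0, "C": 0}
blue = {"A": 0, "B": 1, "C": 0}
green = {"A": 1, "B": 0, "C": 1}

print(compare(red, blue))    # concurrent
print(compare(red, green))   # before
print(compare(green, blue))  # concurrent
```

Note the last line: C's write supersedes A's, but it is still concurrent with B's, so that conflict remains to be resolved.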

Amazon's Dynamo paper describes using vector clocks for shopping cart data. If you add items to your cart on your phone, and separately on your laptop, the system uses vector clocks to detect the concurrent writes and merges the two carts. The merge never loses an added item, but an item you deleted on one device can resurface. That is the merge showing through.
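A cart merge of this kind can be sketched as a set union. This assumes the application reconciles concurrent versions by keeping every item present in either one; `merge_carts` is a hypothetical name, and a real cart would also track quantities.

```python
def merge_carts(cart_a, cart_b):
    """Reconcile two concurrent cart versions by keeping every item from both."""
    return cart_a | cart_b


shared_start = {"book", "lamp"}

phone_cart = shared_start - {"lamp"}        # lamp removed on the phone
laptop_cart = shared_start | {"charger"}    # charger added on the laptop

merged = merge_carts(phone_cart, laptop_cart)
print(sorted(merged))  # ['book', 'charger', 'lamp']
```

No added item is lost, but the lamp removed on the phone comes back, because a union cannot tell "never added" apart from "deleted".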

When someone asks "should I use strong or eventual consistency?" the right question is: "What is the business cost of showing stale data?"

For a bank balance: showing stale data means someone might overdraft, or fraudulent transactions might go undetected. The cost is catastrophic. Use strong consistency.

For a tweet's like count: showing 1,249 likes instead of 1,250 for 2 seconds? Nobody cares. Use eventual consistency.

Interview Scenario: "Real Failure from Inconsistency"
The Knight Capital story is perfect here. But let's also talk about a subtler, more common failure:

The double-spend problem:

User has $100. Makes two simultaneous purchases of $80 each.

Without strong consistency:

Request 1 → Replica A: reads $100 → approves → deducts → balance $20

Request 2 → Replica B: reads $100 (stale!) → approves → deducts → balance $20

Result: User spent $160 with only $100. System lost $60.
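The same failure in a few lines of code, with two dicts standing in for replicas that have not synchronised yet. `purchase` is a hypothetical helper written for this sketch.

```python
def purchase(replica, price):
    """Approve the purchase if this replica's view of the balance covers it."""
    if replica["balance"] >= price:
        replica["balance"] -= price
        return True
    return False


replica_a = {"balance": 100}
replica_b = {"balance": 100}   # has not yet seen anything replica_a does

approved = [purchase(replica_a, 80), purchase(replica_b, 80)]
total_spent = 80 * sum(approved)

print(approved, total_spent)  # [True, True] 160
```

Each replica's decision is locally correct. The error only exists when you look at both together.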

This is why payment systems use strong consistency with distributed locks or ACID transactions. The latency cost is worth it.

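The quorum answer to this:

Use W=2, R=2 with N=3.

Both requests must read from 2 replicas.

If the first deduction was fully written before the second request reads, at least one of those 2 replicas has seen it.

That read returns the post-deduction balance → second request blocked.

The caveat: a quorum read makes the read fresh, but it does not make read-then-decide-then-write atomic. If both requests read before either writes, both still see $100 and both proceed. Closing that gap takes a conditional write (compare-and-set on a version), a lock, or a transaction.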
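A conditional write is one way to make the balance check and the deduction a single step. The sketch below is an in-memory stand-in, not a real database client: `Account`, `compare_and_set`, and `withdraw` are hypothetical names, and the lock plays the role of whatever makes the conditional write atomic on the server.

```python
import threading


class Account:
    def __init__(self, balance):
        self._lock = threading.Lock()
        self.balance = balance
        self.version = 0

    def read(self):
        with self._lock:
            return self.balance, self.version

    def compare_and_set(self, expected_version, new_balance):
        # The write only lands if nobody else has written since our read.
        with self._lock:
            if self.version != expected_version:
                return False
            self.balance = new_balance
            self.version += 1
            return True


def withdraw(account, amount, max_retries=3):
    for _ in range(max_retries):
        balance, version = account.read()
        if balance < amount:
            return False                      # genuinely insufficient funds
        if account.compare_and_set(version, balance - amount):
            return True                       # our precondition still held
    return False                              # lost the race too many times


account = Account(100)

# Both requests read before either one writes: the dangerous interleaving.
balance_1, version_1 = account.read()
balance_2, version_2 = account.read()

first = account.compare_and_set(version_1, balance_1 - 80)
second = account.compare_and_set(version_2, balance_2 - 80)

print(first, second, account.balance)  # True False 20
```

The second request loses the race, and a retry through `withdraw` re-reads a balance of $20 and is declined.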

Key Takeaways
Consistency in distributed systems is a spectrum, not a binary choice.
Strong consistency — everyone always agrees, but it's slow and expensive. Use for financial data.
Eventual consistency — fast and available, but temporarily stale. Use for social feeds, caches, view counts.
Read-your-writes — you always see your own updates. Essential for good UX.
Monotonic reads — time never goes backwards. Essential for readable feeds.
Quorum (W + R > N) — the tuning knob that lets you dial consistency precisely.
Vector clocks — how distributed systems detect and resolve concurrent conflicts.
The business cost of stale data determines which model to use — not engineering preference.

Top comments (1)

ANP2 Network • May 31

The frame that stale-data cost (not preference) picks the model is the right one. One sharpening on the double-spend example: a quorum read makes the read fresh, but it doesn't make read-then-decide-then-write atomic. If two withdrawals both do their quorum read before either writes the deduction, both see a sufficient balance and both proceed — freshness didn't help because the race is between the read and the write, not between replicas. For anything irreversible, read consistency is necessary but not sufficient; the fix is to move the decision into the write — a conditional write / compare-and-set on a version, so the precondition and the effect commit as one operation and the loser retries. The pattern that sidesteps it entirely is to stop reading-then-acting: append the intent (a withdrawal event) to a single per-account stream and let one deterministic consumer fold the balance — total order per stream gives the consistency for free, and there's no TOCTOU window because nobody acts on a bare read. Strong vs eventual is really a question about reads; double-spend is a question about writes, and they're easy to conflate.
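The append-and-fold pattern described in this comment could be sketched as follows. It assumes a single consumer processes one account's stream in log order; `fold_balance` is a hypothetical name, and a plain list stands in for the per-account stream.

```python
def fold_balance(opening_balance, withdrawals):
    """Apply withdrawal intents strictly in log order, rejecting any that overdraw."""
    balance = opening_balance
    outcomes = []
    for amount in withdrawals:
        if amount <= balance:
            balance -= amount
            outcomes.append("approved")
        else:
            outcomes.append("rejected")
    return balance, outcomes


stream = []          # the per-account log of intents
stream.append(80)    # request 1 records its intent
stream.append(80)    # request 2 records its intent, without reading a balance

balance, outcomes = fold_balance(100, stream)
print(balance, outcomes)  # 20 ['approved', 'rejected']
```

Neither request acts on a balance it read earlier, so there is no window between check and write for the second one to slip through.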