Day 0: Cracking the System Design Code
Welcome to my learning log! I'm documenting my journey into the world of high-scale systems. Feel free to roast my logic in the comments or, you know, gently correct me so I don't build a digital house of cards.
1. The Fundamentals (The "Don't Break the App" Phase)
Performance vs. Scalability
- Performance: How fast is it right now?
  - Metrics: Throughput (req/sec), Response Time, CPU/Memory Usage.
- Scalability: How well does the system handle growth?
  - The Test: If performance tanks the moment load increases, your system doesn't scale.
Latency vs. Throughput
- Latency: The "waiting" time per request (e.g., 1 API call = 120ms).
- Throughput: How many requests can we shove through the pipe per second? (e.g., 567 RPS).
The Batch Paradox: A system with a 10-second latency that processes 50,000 jobs/sec is perfectly valid. High latency doesn't always mean a "bad" system; it depends on the goal.
Availability vs. Consistency (CAP Theorem)
In a distributed world, you usually have to pick a side:
- Availability (AP): Every request receives a response (even if the data is slightly stale).
- Consistency (CP): Every node returns the exact same data at the same time (even if it means blocking requests until sync is done).
The Bottleneck "Golden Rule"
"A system is only as fast as its slowest component."
(Spoiler: It's usually the Database.)
🧠 Brain Teasers: Putting Theory into Practice
Q1: The Sudden Traffic Spike
Scenario: You have 1 server (8 cores) handling 2k RPS. Suddenly, traffic hits 20k RPS. How do you scale using pure infrastructure (no code changes)?
The "Interview-Ready" Answer:
- Vertical Scaling: "Scale Up" by adding more CPU/RAM to the existing machine.
- Horizontal Scaling: "Scale Out" by adding more servers behind a Load Balancer.
- Partitioning: Split the workload via Sharding or Regional Routing.
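To make the horizontal-scaling answer concrete, here's a quick capacity sketch. The 70% headroom factor and the assumption that load splits evenly across identical servers are my own simplifications; real load balancers lose some capacity to coordination and uneven traffic.

```python
import math

def servers_needed(target_rps: int, rps_per_server: int, headroom: float = 0.7) -> int:
    """Servers required to absorb target_rps, running each box at
    `headroom` of its measured capacity (never plan for 100%)."""
    effective_rps = rps_per_server * headroom
    return math.ceil(target_rps / effective_rps)

# One box handled 2k RPS; to absorb a 20k RPS spike at 70% utilization:
print(servers_needed(20_000, 2_000))  # -> 15
```

So "just add servers" really means "add roughly an order of magnitude more servers, plus safety margin," which is why the load balancer becomes mandatory.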
Q2: The Throughput Showdown
Which system is superior?
- System A: 10ms latency, processes 1 request at a time.
- System B: 200ms latency, processes 200 requests in parallel.
The Math:
- System A: $1 / 0.01s = 100$ req/sec.
- System B: $200 / 0.2s = 1,000$ req/sec.
Winner: System B. Even though it's "slower" per request, its throughput is 10x higher due to parallelism.
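The math above is just Little's Law rearranged: throughput = concurrency / latency. A minimal sketch (latency in milliseconds to keep the arithmetic exact):

```python
def throughput_rps(concurrency: int, latency_ms: int) -> float:
    # Each in-flight slot completes 1000/latency_ms requests per second;
    # with `concurrency` slots running in parallel, multiply.
    return concurrency * 1000 / latency_ms

system_a = throughput_rps(1, 10)     # -> 100.0 req/sec
system_b = throughput_rps(200, 200)  # -> 1000.0 req/sec
```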
Q3: WhatsApp: Messages vs. Money
Should a messaging app prioritize Consistency or Availability?
The Verdict: Availability (AP).
During a network partition, the system must still accept messages. They can synchronize later (Eventual Consistency).
Note: If you're building WhatsApp Pay, the rules change: you must be CP (Consistent) because nobody likes "eventual" money.
Q4: The Need for Speed
Flow: Client → API (5ms) → Redis (1ms) → PostgreSQL (50ms). What is the minimum latency?
The Math:
In a Cache Hit scenario, we skip the slow DB.
API Logic (5ms) + Redis (1ms) = 6ms.
Minimum latency is 6ms.
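The same arithmetic, plus the follow-up question an interviewer usually asks: what's the *expected* latency once you account for cache misses? The 90% hit rate below is a hypothetical number for illustration.

```python
API_MS, REDIS_MS, DB_MS = 5, 1, 50

def request_latency_ms(cache_hit: bool) -> int:
    # Every request pays for API logic plus the Redis lookup;
    # only a miss additionally pays the PostgreSQL query.
    return API_MS + REDIS_MS + (0 if cache_hit else DB_MS)

def expected_latency_ms(hit_rate: float) -> float:
    return hit_rate * request_latency_ms(True) + (1 - hit_rate) * request_latency_ms(False)

print(request_latency_ms(True))    # -> 6   (the minimum-latency path)
print(request_latency_ms(False))   # -> 56  (cache miss)
print(expected_latency_ms(0.9))    # ≈ 11 ms at a hypothetical 90% hit rate
```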
Q5: The Bitly Bottleneck
Scenario: 50M redirects/day. Where does the first bottleneck appear?
The Reality Check:
50M/day is roughly 580 RPS. A single modern server handles this easily. However, as you scale, the Database becomes the bottleneck because every redirect requires a lookup.
Solution: Introduce Redis caching so frequently accessed URLs never even touch the DB.
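Here's the back-of-envelope RPS math plus a cache-aside lookup sketch. The plain dicts stand in for Redis and PostgreSQL clients, and the "peak ≈ 3× average" multiplier is a rough rule of thumb, not a measured number.

```python
avg_rps = 50_000_000 / 86_400   # 50M redirects over 86,400 s -> ~579 RPS
peak_rps = avg_rps * 3          # rough rule of thumb: peak traffic ~3x average

def resolve(short_code: str, cache: dict, db: dict) -> str:
    url = cache.get(short_code)   # 1) cheap Redis-style lookup first
    if url is None:
        url = db[short_code]      # 2) miss: pay the DB round trip...
        cache[short_code] = url   # 3) ...and populate the cache for next time
    return url
```

After the first resolve of a popular short code, repeat lookups never touch the database, which is exactly what keeps the DB from becoming the bottleneck.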
Q6: The "Insta-Feed" Problem (Push vs. Pull)
How do we generate a feed for 200M users?
- Pull Model: Fetch posts from all 500 people you follow on login.
  - Result: Massive Fan-in query. 2,500+ posts fetched per login = DB Meltdown.
- Push Model: When someone posts, "Fan-out" that post to every follower's pre-built feed table.
  - Result: Fast reads, but Celebrity Meltdown. If a star with 600M followers posts, the system has to perform 600M writes instantly.
The Instagram Hybrid:
- Normal Users: Push (Fan-out) to feeds.
- Celebrities: Use the Pull model. We only fetch their posts when a user actually opens the app, then merge them into the feed.
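The hybrid can be sketched in a few lines. The threshold, the dict-based storage, and the function names are all invented for illustration; a real system would do the fan-out asynchronously through a queue.

```python
CELEB_THRESHOLD = 10_000  # hypothetical cutoff for "celebrity"

feeds: dict[str, list[str]] = {}        # user_id -> precomputed (pushed) feed
celeb_posts: dict[str, list[str]] = {}  # celeb_id -> their posts (pulled)

def on_post(author: str, post_id: str, followers: list[str], follower_count: int) -> None:
    if follower_count >= CELEB_THRESHOLD:
        # Celebrity: store the post once; readers will pull it later.
        celeb_posts.setdefault(author, []).append(post_id)
    else:
        # Normal user: fan out the post to every follower's feed now.
        for f in followers:
            feeds.setdefault(f, []).append(post_id)

def read_feed(user: str, followed_celebs: list[str]) -> list[str]:
    merged = list(feeds.get(user, []))
    for c in followed_celebs:           # merge celebrity posts at read time
        merged.extend(celeb_posts.get(c, []))
    return merged
```

Writes stay cheap for celebrities (one row instead of 600M), and reads stay cheap for everyone else (their feed is already built).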
Q7: The "Viral Tweet" Like Counter
Do you `UPDATE tweets SET likes = likes + 1` or store every like as a row?
The Surprise Winner: Storing every row (Option B) scales better.
The Problem with Option A:
- Row Locking: The DB locks the row during every update.
- Write Contention: 100k people liking a tweet at once = 100k updates fighting for one row. The DB chokes.
The Pro Architecture:
- Write: Store each like in its own row (No locking, just appending).
- Cache: Use a Distributed Counter in Redis (`INCR tweet:id:likes`).
- Extreme Scale: Use Sharded Counters where likes increment one of many "buckets" to avoid hitting a single Redis hot key.
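A sharded counter is simple enough to sketch in memory. A list stands in for what would be N separate Redis keys (e.g. `INCR` on `tweet:{id}:likes:{shard}`); the shard count of 16 is an arbitrary illustration value.

```python
import random

N_SHARDS = 16
shards = [0] * N_SHARDS  # stand-in for N Redis keys

def incr_like() -> None:
    # Any shard will do: increments spread across buckets,
    # so no single key becomes a write hotspot.
    shards[random.randrange(N_SHARDS)] += 1

def total_likes() -> int:
    # The displayed count is the (slightly more expensive) sum of buckets.
    return sum(shards)

for _ in range(1000):
    incr_like()
print(total_likes())  # -> 1000
```

The trade-off: writes scale almost linearly with the shard count, while reads now touch N keys, which is fine because like counts are read from cache and can tolerate a moment of staleness.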
What am I missing? If you've handled a 600M follower fan-out lately, let me know how much sleep you lost in the comments!