System Design interviews can feel overwhelming, but they always boil down to a few first principles. In this post, we’ll cover four foundational topics:
- Scalability – Vertical vs Horizontal
- Latency & Throughput – P50, P95, P99
- Capacity Estimation – QPS, storage, bandwidth
- Networking Basics – TCP, HTTP/HTTPS, TLS, UDP, DNS
1. Scalability: Vertical vs Horizontal
TL;DR: Scale vertically until you hit diminishing returns, then go horizontal. Default to stateless services for easy horizontal scaling.
- Vertical scaling: Bigger machine (more CPU, RAM, disk). Easy but capped; it gets expensive and leaves you with a single point of failure.
- Horizontal scaling: More machines (stateless servers behind a load balancer). Harder to manage but scales better.
👉 Interview tie-in: They’ll ask “How would you scale this system to handle 10x users?” Answer with horizontal scaling + stateless services.
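To see why statelessness makes horizontal scaling easy, here is a minimal sketch. The server names, the `handle` function, and the in-memory `shared_store` dict are all hypothetical; the dict stands in for an external store such as a cache or database. Round-robin is just one simple load-balancing policy, assumed here for illustration.

```python
from itertools import cycle


def handle(server: str, user_id: str, session_store: dict) -> str:
    """Stateless handler: per-user state lives in the shared store, not on the server."""
    visits = session_store.get(user_id, 0) + 1
    session_store[user_id] = visits
    return f"{server} served {user_id} (visit {visits})"


# Stand-in for an external store (cache, database) that every server can reach.
shared_store = {}

# Round-robin "load balancer" over three hypothetical app servers.
servers = cycle(["app-1", "app-2", "app-3"])

# Four requests from the same user land on different servers,
# yet the visit count stays correct because no server holds the state.
responses = [handle(next(servers), "alice", shared_store) for _ in range(4)]

for line in responses:
    print(line)
```

Because no server keeps anything in local memory between requests, handling 10x users is mostly a matter of adding names to the server list. The shared store then becomes the part you have to scale next.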
2. Latency & Throughput
TL;DR: Latency = how long one request takes. Throughput = how many requests per unit time.
Latency percentiles:
- P50 (median): Half of requests finish faster than this; the typical experience
- P95: The "tail" – 5% of requests are slower than this
- P99: 1% of requests are slower than this; often drives design
Tradeoff: Optimizing tail latency may hurt throughput, and vice versa. For example, batching requests raises throughput but makes each request wait for its batch.
👉 Analogy: Coffee shop.
- Latency = how long it takes to serve one customer.
- Throughput = how many coffees per minute.
👉 Interview tie-in: "We want 95% of requests under 200ms." That’s a P95 latency target.
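If you want to see how the percentiles fall out of raw measurements, here is a rough sketch using the nearest-rank method (one of several common percentile definitions). The latency samples are made up for illustration.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: most requests are fast, a few are very slow.
latencies_ms = [20] * 90 + [150] * 8 + [900] * 2

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
avg = mean(latencies_ms)

print(f"mean={avg}ms  P50={p50}ms  P95={p95}ms  P99={p99}ms")
```

With these samples the mean is 48ms and the P50 is 20ms, while the P95 is 150ms and the P99 is 900ms. The average hides the slow requests, which is why latency targets are usually stated as percentiles.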
3. Capacity Estimation
TL;DR: Always estimate load before designing. Even rough math is better than guessing.
- QPS (Queries Per Second): How many requests per second your service must handle.
- Storage: How much data accumulates per day/month/year.
- Bandwidth: How much data moves across the network.
👉 Example:
- 1M daily active users
- Each makes 10 requests/day → 10M requests/day ÷ 86,400 s ≈ 115 QPS on average
- Each request ~1KB → 10M × 1KB ≈ 10GB/day traffic
👉 Interview tie-in: They’ll push you to "estimate scale" before proposing an architecture. Show your math, even if it’s rough.
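Here is the example's back-of-envelope math as a small script, so you can swap in other numbers. The function name and the `peak_factor` of 2 are assumptions for illustration (real peak-to-average ratios vary by product), and 1KB is treated as 1,000 bytes to keep the arithmetic round.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_load(daily_users, requests_per_user, bytes_per_request, peak_factor=2.0):
    """Rough capacity estimate; peak_factor is an assumed peak-to-average ratio."""
    requests_per_day = daily_users * requests_per_user
    avg_qps = requests_per_day / SECONDS_PER_DAY
    bytes_per_day = requests_per_day * bytes_per_request
    return {
        "requests_per_day": requests_per_day,
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "gb_per_day": bytes_per_day / 1e9,
        "avg_kb_per_sec": bytes_per_day / SECONDS_PER_DAY / 1e3,
    }


# 1M daily active users, 10 requests each, ~1KB per request.
result = estimate_load(1_000_000, 10, 1_000)

for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

For the example inputs this works out to roughly 116 QPS on average and about 10GB of traffic per day. In an interview, stating the assumptions out loud matters more than the exact digits.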
4. Networking Basics
TL;DR: Know what each protocol guarantees and what it costs. Network round trips are an easy bottleneck to overlook.
- TCP: Reliable, ordered, connection-based
- UDP: Fast, connectionless, may drop packets (used for video/games)
- HTTP/HTTPS: Request-response on top of TCP; HTTPS adds TLS encryption
- DNS: Resolves google.com → IP address
- TLS handshake: Extra latency on first connection
👉 Interview tie-in: Expect questions like "Why use UDP for video streaming instead of TCP?"
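To put a rough number on the "extra latency on first connection" point, here is a sketch that counts round trips. It assumes a TCP handshake costs one round trip, a full TLS 1.3 handshake one more (TLS 1.2 needs two), and it ignores server processing time. The 50ms round-trip time and 20ms DNS lookup are made-up inputs, and `first_request_ms` is a hypothetical helper.

```python
def first_request_ms(rtt_ms, dns_ms, tls_round_trips):
    """Estimated time for the first HTTPS request on a brand-new connection."""
    tcp_handshake = rtt_ms                    # SYN / SYN-ACK before data can flow
    tls_handshake = tls_round_trips * rtt_ms  # 1 for TLS 1.3, 2 for TLS 1.2 (full handshake)
    request_response = rtt_ms                 # the HTTP request and its response
    return dns_ms + tcp_handshake + tls_handshake + request_response


RTT_MS = 50  # assumed client-to-server round-trip time
DNS_MS = 20  # assumed uncached DNS lookup time

cold_tls13 = first_request_ms(RTT_MS, DNS_MS, tls_round_trips=1)
cold_tls12 = first_request_ms(RTT_MS, DNS_MS, tls_round_trips=2)
warm = RTT_MS  # connection already open: only the request/response round trip

print(f"cold (TLS 1.3): {cold_tls13}ms, cold (TLS 1.2): {cold_tls12}ms, warm: {warm}ms")
```

Under these assumptions the first request takes 170ms with TLS 1.3 and 220ms with TLS 1.2, against 50ms on a connection that is already open. That gap is the reason connection reuse (keep-alive, connection pooling) comes up so often in design discussions.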
✅ Takeaways
- Prefer stateless horizontal scaling
- Measure both latency (P95) and throughput
- Always estimate capacity before proposing a design
- Know your networking trade-offs
💡 Practice Question:
"Design a URL shortener for 50M users. How many requests/sec do you expect? What’s your scaling plan?"