Scaling Isn’t Magic: Fundamentals They Expect You to Know

#systemdesign #softwaredevelopment #interview #career

System Design interviews can feel overwhelming, but they always boil down to a few first principles. In this post, we’ll cover four foundational topics:

Scalability – Vertical vs Horizontal
Latency & Throughput – P50, P95, P99
Capacity Estimation – QPS, storage, bandwidth
Networking Basics – TCP, HTTP/HTTPS, TLS, UDP, DNS

1. Scalability: Vertical vs Horizontal

TL;DR: Scale vertically until you hit diminishing returns, then go horizontal. Default to stateless services for easy horizontal scaling.

Vertical scaling: Bigger machine (more CPU, RAM, disk). Easy to do, but capped by the largest machine you can buy, increasingly expensive, and still a single point of failure.
Horizontal scaling: More machines (stateless servers behind a load balancer). Harder to manage but scales better.
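
To make "stateless servers behind a load balancer" concrete, here is a minimal sketch in Python. Every name in it (`SharedStore`, `StatelessServer`, `RoundRobinBalancer`) is hypothetical, and the in-memory dict only stands in for an external store such as a cache or database. The point is that no server keeps per-user state, so any server can handle any request.

```python
from itertools import cycle


class SharedStore:
    """Stand-in for an external store shared by all servers (assumption)."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


class StatelessServer:
    """Keeps no per-user state of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, user_id):
        visits = self.store.incr(user_id)
        return f"{self.name} served {user_id} (visit {visits})"


class RoundRobinBalancer:
    """Sends each request to the next server in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def route(self, user_id):
        return next(self._servers).handle(user_id)


store = SharedStore()
servers = [StatelessServer(f"server-{i}", store) for i in range(3)]
balancer = RoundRobinBalancer(servers)

# The same user lands on different servers, yet the visit count stays correct.
responses = [balancer.route("alice") for _ in range(4)]
for line in responses:
    print(line)
```

Adding capacity here just means adding another `StatelessServer` to the list, which is why stateless services are the easy case for horizontal scaling.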

👉 Interview tie-in: They’ll ask “How would you scale this system to handle 10x users?” Answer with horizontal scaling + stateless services.

2. Latency & Throughput

TL;DR: Latency = how long one request takes. Throughput = how many requests per unit time.

Latency percentiles:
- P50 (median): Half of requests are faster than this; the typical experience
- P95: The start of the "tail"; 5% of requests are slower than this
- P99: The slowest 1% of requests are worse than this; often drives design
Tradeoff: Optimizing tail latency may hurt throughput, and vice versa. Batching requests, for example, raises throughput but makes each request wait for its batch.
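
A small sketch of how these percentiles can be computed from raw measurements. It uses the nearest-rank method, which is one of several common definitions (monitoring tools may interpolate instead). The `percentile` helper and the sample latencies are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (20 samples).
latencies_ms = [12, 15, 14, 13, 16, 18, 20, 22, 25, 30,
                35, 40, 45, 50, 60, 80, 120, 200, 450, 900]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(f"P50={p50}ms P95={p95}ms P99={p99}ms")
```

On this sample the median is 30 ms while P95 and P99 are 450 ms and 900 ms. A few slow requests barely move the median but dominate the tail, which is why targets are usually stated as percentiles rather than averages.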

👉 Analogy: Coffee shop.

Latency = how long it takes to serve one customer.
Throughput = how many coffees per minute.

👉 Interview tie-in: "We want 95% of requests under 200ms." That’s a P95 latency target.

3. Capacity Estimation

TL;DR: Always estimate load before designing. Even rough math is better than guessing.

QPS (Queries Per Second): How many requests per second your service must handle.
Storage: How much data accumulates per day/month/year.
Bandwidth: How much data moves across the network.

👉 Example:

1M daily active users
Each makes 10 requests/day → 10M requests/day → ~116 QPS on average (10M / 86,400 seconds)
Each request ~1KB → ~10GB/day traffic
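
The same back-of-the-envelope math as a short Python sketch. The `estimate` function and its field names are hypothetical, 1 KB is taken as 1,000 bytes, and the results are daily averages (real traffic has peaks above the average).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_active_users, requests_per_user_per_day, bytes_per_request):
    """Rough average load from three inputs. All outputs are averages."""
    requests_per_day = daily_active_users * requests_per_user_per_day
    bytes_per_day = requests_per_day * bytes_per_request
    return {
        "requests_per_day": requests_per_day,
        "avg_qps": requests_per_day / SECONDS_PER_DAY,
        "bytes_per_day": bytes_per_day,
        "avg_bits_per_second": bytes_per_day * 8 / SECONDS_PER_DAY,
    }


result = estimate(
    daily_active_users=1_000_000,
    requests_per_user_per_day=10,
    bytes_per_request=1_000,  # ~1 KB, assuming 1 KB = 1,000 bytes
)

print(f"avg QPS: {result['avg_qps']:.1f}")
print(f"traffic: {result['bytes_per_day'] / 1e9:.0f} GB/day")
print(f"avg bandwidth: {result['avg_bits_per_second'] / 1e6:.2f} Mbit/s")
```

10M requests at about 1 KB each comes to roughly 10 GB/day, or a little under 1 Mbit/s on average. Writing the inputs down like this makes it easy to redo the math when the interviewer changes a number.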

👉 Interview tie-in: They’ll push you to "estimate scale" before proposing an architecture. Show your math, even if it’s rough.

4. Networking Basics

TL;DR: Many performance problems come from ignoring networking bottlenecks. Know what each protocol costs in round trips and what it guarantees.

TCP: Reliable, ordered, connection-based
UDP: Connectionless and low-overhead, with no delivery or ordering guarantees; packets may be dropped (used for real-time video, voice, and games)
HTTP/HTTPS: Request-response on top of TCP; HTTPS adds TLS encryption
DNS: Resolves google.com → IP address
TLS handshake: Extra round trips on every new connection (one in TLS 1.3, two in TLS 1.2); reusing connections avoids paying it again
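
To see why a first HTTPS request is slower than later ones, here is a rough latency model in Python. It is a simplification under stated assumptions: one round trip for the TCP handshake, one for a full TLS 1.3 handshake (two for TLS 1.2), and one for the HTTP request and response. It ignores packet loss, session resumption, and QUIC. The function name and the 50 ms round-trip time are made up for illustration.

```python
def first_request_latency_ms(rtt_ms, tls_round_trips=1,
                             dns_lookup_ms=0.0, server_time_ms=0.0):
    """Estimate the latency of the first HTTPS request on a new connection.

    tls_round_trips: 1 for a full TLS 1.3 handshake, 2 for TLS 1.2 (assumption
    of this simplified model).
    """
    tcp_handshake = 1 * rtt_ms          # SYN / SYN-ACK before data can flow
    tls_handshake = tls_round_trips * rtt_ms
    request_response = 1 * rtt_ms       # send request, receive response
    return (dns_lookup_ms + tcp_handshake + tls_handshake
            + request_response + server_time_ms)


RTT_MS = 50  # hypothetical client-to-server round-trip time

cold_tls13 = first_request_latency_ms(RTT_MS, tls_round_trips=1)
cold_tls12 = first_request_latency_ms(RTT_MS, tls_round_trips=2)
warm = 1 * RTT_MS  # connection already open: just request + response

print(f"new connection, TLS 1.3: {cold_tls13:.0f} ms")
print(f"new connection, TLS 1.2: {cold_tls12:.0f} ms")
print(f"reused connection:       {warm:.0f} ms")
```

Under these assumptions a cold request costs three to four round trips while a request on a reused connection costs one, which is the usual argument for keep-alive connections and connection pooling.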

👉 Interview tie-in: Expect questions like "Why use UDP for video streaming instead of TCP?"

✅ Takeaways

Prefer stateless horizontal scaling
Measure both latency (P95) and throughput
Always estimate capacity before proposing a design
Know your networking trade-offs

💡 Practice Question:

"Design a URL shortener for 50M users. How many requests/sec do you expect? What’s your scaling plan?"
