DEV Community

Sibasish Mohanty


Scaling Isn’t Magic: Fundamentals They Expect You to Know

System Design interviews can feel overwhelming, but they always boil down to a few first principles. In this post, we’ll cover four foundational topics:

  1. Scalability – Vertical vs Horizontal
  2. Latency & Throughput – P50, P95, P99
  3. Capacity Estimation – QPS, storage, bandwidth
  4. Networking Basics – TCP, HTTP/HTTPS, TLS, UDP, DNS

1. Scalability: Vertical vs Horizontal

TL;DR: Scale vertically until you hit diminishing returns, then go horizontal. Default to stateless services for easy horizontal scaling.

  • Vertical scaling: Bigger machine (more CPU, RAM, disk). Easy, but capped by the largest machine you can buy, expensive at the high end, and the single machine remains a single point of failure.
  • Horizontal scaling: More machines (stateless servers behind a load balancer). Harder to manage but scales better.
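To make "stateless servers behind a load balancer" concrete, here is a minimal sketch. The `StatelessServer` and `RoundRobinBalancer` classes and the in-memory `session_store` dict are hypothetical names for illustration; in a real system the store would be an external cache or database, and the balancer would be a dedicated component.

```python
from itertools import cycle


class StatelessServer:
    """Hypothetical server that keeps no per-user state between requests."""

    def __init__(self, name):
        self.name = name

    def handle(self, request, session_store):
        # All per-user state lives in the shared store, never on the server,
        # so any server can handle any request.
        visits = session_store.get(request["user"], 0) + 1
        session_store[request["user"]] = visits
        return f"{self.name} served {request['user']} (visit {visits})"


class RoundRobinBalancer:
    """Hands each incoming request to the next server in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def route(self, request, session_store):
        return next(self._servers).handle(request, session_store)


session_store = {}  # stands in for an external store (assumption)
balancer = RoundRobinBalancer([StatelessServer("s1"), StatelessServer("s2")])
responses = [balancer.route({"user": "alice"}, session_store) for _ in range(3)]

for line in responses:
    print(line)
```

The same user is served by alternating servers, yet the visit count stays correct because no server holds the state. That property is what lets you add or remove machines freely.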

👉 Interview tie-in: They’ll ask “How would you scale this system to handle 10x users?” Answer with horizontal scaling + stateless services.


2. Latency & Throughput

TL;DR: Latency = how long one request takes. Throughput = how many requests per unit time.

  • Latency percentiles:

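    • P50 (median): Half of requests finish faster than this; the typical experience
    • P95: The start of the "tail"; 5% of requests are slower than this
    • P99: 1% of requests are slower than this; often drives design, since at high volume that 1% is still a lot of requests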
  • Tradeoff: Optimizing tail latency may hurt throughput, and vice versa.
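The percentiles above can be computed from raw latency samples. The sketch below uses the nearest-rank method, which is one of several common percentile definitions; the sample latencies are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [20] * 90 + [80] * 8 + [900] * 2

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"P50={p50}ms  P95={p95}ms  P99={p99}ms")
```

With these samples the median looks healthy while P99 exposes the slow tail, which is why a median alone can hide real user pain.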

👉 Analogy: Coffee shop.

  • Latency = how long it takes to serve one customer.
  • Throughput = how many coffees per minute.

👉 Interview tie-in: "We want 95% of requests under 200ms." That’s a P95 latency target.


3. Capacity Estimation

TL;DR: Always estimate load before designing. Even rough math is better than guessing.

  • QPS (Queries Per Second): How many requests per second your service must handle.
  • Storage: How much data accumulates per day/month/year.
  • Bandwidth: How much data moves across the network.

👉 Example:

  • 1M daily active users
  • Each makes 10 requests/day → 10M requests/day ÷ 86,400 s ≈ 116 QPS on average
  • Each request ~1KB → ~10GB/day traffic
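Working the example's inputs through step by step looks roughly like this. The 3x peak-to-average factor is an assumption added for illustration, not part of the example, and 1 KB is treated as 1,000 bytes to keep the mental math simple.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_active_users = 1_000_000
requests_per_user_per_day = 10
bytes_per_request = 1_000  # ~1 KB, decimal units

requests_per_day = daily_active_users * requests_per_user_per_day
average_qps = requests_per_day / SECONDS_PER_DAY

# Traffic is rarely flat; this multiplier is an assumed rule of thumb.
assumed_peak_factor = 3
peak_qps = average_qps * assumed_peak_factor

bytes_per_day = requests_per_day * bytes_per_request
gb_per_day = bytes_per_day / 1e9
average_kb_per_second = bytes_per_day / SECONDS_PER_DAY / 1e3

print(f"requests/day:  {requests_per_day:,}")
print(f"average QPS:   {average_qps:.0f}")
print(f"assumed peak:  {peak_qps:.0f} QPS")
print(f"traffic/day:   {gb_per_day:.0f} GB")
print(f"avg bandwidth: {average_kb_per_second:.0f} KB/s")
```

In an interview, rounding 86,400 to 100,000 seconds per day is usually accurate enough: 10M requests/day becomes about 100 QPS.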

👉 Interview tie-in: They’ll push you to "estimate scale" before proposing an architecture. Show your math, even if it’s rough.


4. Networking Basics

TL;DR: Network round trips, handshakes, and packet loss often dominate end-to-end latency, so know what each protocol costs you.

  • TCP: Reliable, ordered, connection-based
  • UDP: Fast, connectionless, no delivery or ordering guarantees (often used for real-time video, voice, and games)
  • HTTP/HTTPS: Request-response on top of TCP; HTTPS adds TLS encryption
  • DNS: Resolves google.com → IP address
  • TLS handshake: Extra latency on first connection
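To see why the first connection costs more, here is a rough latency budget as a sketch. It assumes a full TLS 1.3 handshake (one round trip; TLS 1.2 needs two), an uncached DNS lookup, and zero server processing time. The 50 ms round trip and 20 ms DNS figures are made-up inputs.

```python
def first_request_ms(rtt_ms, dns_ms, tls_round_trips=1):
    """Approximate time to the first HTTPS response on a brand-new connection."""
    tcp_handshake = rtt_ms                    # SYN / SYN-ACK before data can be sent
    tls_handshake = tls_round_trips * rtt_ms  # 1 for TLS 1.3, 2 for TLS 1.2
    request_response = rtt_ms                 # the HTTP request itself
    return dns_ms + tcp_handshake + tls_handshake + request_response


def reused_connection_ms(rtt_ms):
    """On a kept-alive connection only the request round trip remains."""
    return rtt_ms


cold = first_request_ms(rtt_ms=50, dns_ms=20)
cold_tls12 = first_request_ms(rtt_ms=50, dns_ms=20, tls_round_trips=2)
warm = reused_connection_ms(rtt_ms=50)

print(f"first request (TLS 1.3): {cold} ms")
print(f"first request (TLS 1.2): {cold_tls12} ms")
print(f"reused connection:       {warm} ms")
```

The gap between the cold and warm numbers is the reason connection reuse (keep-alive, connection pools) matters so much for latency-sensitive services.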

👉 Interview tie-in: Expect questions like "Why use UDP for video streaming instead of TCP?"


✅ Takeaways

  • Prefer stateless horizontal scaling
  • Measure both latency (P95) and throughput
  • Always estimate capacity before proposing a design
  • Know your networking trade-offs

💡 Practice Question:

"Design a URL shortener for 50M users. How many requests/sec do you expect? What’s your scaling plan?"
