Most people learn system design backwards. They start by reading how Netflix or Uber is built, memorize the finished architecture, and then freeze the moment they're asked to design something slightly different.
The fix is to learn the parts before the whole. Every large system is assembled from a small set of reusable building blocks. Once you understand each block on its own, designing a full system becomes an exercise in choosing which blocks to combine and why. This post covers the 10 to learn first, in a sensible order, so you have the vocabulary before you attempt your first design. If you want to see where these fit in the bigger picture, here's a structured roadmap to learn system design at your level.
How to study each block: solves, costs, when
Do not just memorize what each component is. For every block, learn three things:
- What it solves. The specific problem that makes you reach for it.
- What it costs. The new problem it introduces (every block adds one).
- When to use it. The signal in a design that says "now I need this."
That third habit is what separates people who can design real systems from people who can only recite them. Keep those three columns in mind as you read.
1. Load Balancer
What it is: A traffic cop that sits in front of multiple servers and spreads incoming requests across them.
What it solves: A single server can only handle so much. A load balancer lets you run many identical servers and distribute load, which gives you both scale and redundancy (if one server dies, traffic routes to the others).
What it costs: The load balancer is now a potential single point of failure, so it has to be made highly available itself. It also pushes your servers toward being stateless (or you have to deal with sticky sessions).
When to reach for it: The moment you say "one server isn't enough" or "what happens if this server goes down."
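To make the "scale and redundancy" point concrete, here is a minimal round-robin sketch. The class name `RoundRobinBalancer`, the `mark_down` / `mark_up` methods, and the server names are all made up for illustration; a real load balancer would discover failures through health checks rather than being told.

```python
from itertools import count


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._counter = count()

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Look at each server at most once per call, starting from the next slot.
        for _ in range(len(self.servers)):
            candidate = self.servers[next(self._counter) % len(self.servers)]
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [lb.pick() for _ in range(3)]    # one request per server
lb.mark_down("app-2")                          # simulate a server dying
after_failure = [lb.pick() for _ in range(4)]  # traffic routes to the others
```

Note that nothing here remembers which server a given user hit last time, which is exactly why the servers behind it need to be stateless.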
2. Caching
What it is: A fast, temporary store for data you access often, sitting between your app and a slower source (like a database).
What it solves: Latency and load. Reading from an in-memory cache is typically far faster than running a query against a disk-backed database, and every cache hit is a request your data store never has to serve.
What it costs: Staleness. The cached copy can drift from the source of truth, so now you own cache invalidation, famously one of the hard problems in computing.
When to reach for it: Read-heavy workloads, or any time the same data is requested over and over.
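One common way to wire a cache in is the cache-aside pattern with a time-to-live. The sketch below is illustrative only: `TTLCache`, `load_user`, `update_user`, and the dict standing in for the database are assumptions, and the clock is faked so the example is deterministic.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def invalidate(self, key):
        self._store.pop(key, None)


db = {"user:1": "Ada"}    # stand-in for the slow source of truth
stats = {"db_reads": 0}
now = [0.0]               # fake clock, advanced by hand below
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])


def load_user(key):
    cached = cache.get(key)
    if cached is not None:
        return cached
    stats["db_reads"] += 1          # cache miss: go to the database
    value = db[key]
    cache.set(key, value)
    return value


def update_user(key, value):
    db[key] = value
    cache.invalidate(key)           # the write path owns invalidation


first = load_user("user:1")         # miss, reads the database
second = load_user("user:1")        # hit, no database read
update_user("user:1", "Grace")
third = load_user("user:1")         # miss again after invalidation
db["user:1"] = "Hopper"             # a write that forgets to invalidate
stale = load_user("user:1")         # still the old cached value
now[0] = 31.0                       # let the TTL run out
fresh = load_user("user:1")         # expiry forces a fresh read
```

The `stale` read is the cost in miniature: any write path that skips invalidation leaves the old value in place until the TTL runs out.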
3. Database Replication
What it is: Keeping copies of your database on multiple machines, usually one primary (writes) and several replicas (reads).
What it solves: Read scalability and availability. You can spread reads across replicas, and if the primary fails, a replica can take over.
What it costs: Replication lag. Replicas can be slightly behind the primary, so a user might not immediately see their own write. That's the consistency trade-off.
When to reach for it: When reads vastly outnumber writes, or when you can't afford to lose the database if one machine fails.
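Replication lag is easier to reason about once you have seen it happen. This toy model is a sketch under heavy assumptions: `ReplicatedStore` is hypothetical, both "machines" are plain dicts, and an explicit `replicate()` call stands in for asynchronous replication.

```python
class ReplicatedStore:
    """One primary for writes, one replica for reads, with manual replication."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Apply queued writes in order, as a replica would after some delay.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()

    def read(self, key, from_primary=False):
        source = self.primary if from_primary else self.replica
        return source.get(key)


store = ReplicatedStore()
store.write("profile:7", "new bio")

lagging_read = store.read("profile:7")                      # replica is behind
own_write_read = store.read("profile:7", from_primary=True)
store.replicate()
caught_up_read = store.read("profile:7")
```

Sending a user's reads to the primary right after they write, as `from_primary=True` does here, is one common way to soften the trade-off, at the price of putting some read load back on the primary.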
4. Database Sharding
What it is: Splitting one large database into smaller pieces (shards), each holding a subset of the data, spread across machines.
What it solves: Write scalability and storage limits. When the data or write volume outgrows a single machine, sharding partitions the load.
What it costs: A large jump in complexity. Cross-shard queries get hard, and choosing a bad shard key creates "hot" shards that defeat the whole purpose.
When to reach for it: When a single database server can no longer hold your data or absorb your writes, and only then. Replication first, sharding later.
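The hot-shard problem shows up even in a toy example. The sketch below routes rows with a simple hash-modulo scheme (an assumption, and not the only option) and compares two candidate shard keys; `shard_for`, the field names, and the data are invented for illustration.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4


def shard_for(key: str) -> int:
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# 1,000 made-up orders: 900 from one country, 100 from another.
orders = [
    {"user_id": f"user-{i}", "country": "US" if i % 10 else "NZ"}
    for i in range(1000)
]

# High-cardinality key: rows spread across all shards.
by_user = Counter(shard_for(o["user_id"]) for o in orders)

# Low-cardinality, skewed key: one shard takes at least 90% of the rows.
by_country = Counter(shard_for(o["country"]) for o in orders)
```

Sharding by `country` can never use more than two shards here, however many you provision, which is what "defeats the whole purpose" looks like in practice.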
5. Message Queue
What it is: A buffer that holds messages so one part of your system can hand off work to another without waiting.
What it solves: Decoupling and load smoothing. Producers and consumers work at their own pace, and traffic spikes get absorbed by the queue instead of crushing a service.
What it costs: Added latency (work is now asynchronous) and new failure modes (duplicate messages, ordering, what happens when the queue backs up).
When to reach for it: When work can happen later rather than instantly (sending emails, processing uploads, generating notifications).
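Here is a minimal in-process sketch of the hand-off, using Python's standard `queue` and `threading` modules as a stand-in for a real message broker. The job shape, the `seen_ids` set, and the email addresses are assumptions; the duplicate check illustrates one way to cope with the "duplicate messages" failure mode.

```python
import queue
import threading

jobs = queue.Queue()
sent = []          # stands in for actually sending email
seen_ids = set()   # lets the consumer ignore redelivered jobs


def worker():
    while True:
        job = jobs.get()
        try:
            if job is None:             # sentinel: no more work
                return
            if job["id"] in seen_ids:   # duplicate delivery, skip it
                continue
            seen_ids.add(job["id"])
            sent.append(job["to"])
        finally:
            jobs.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# The producer enqueues and moves on without waiting for the work to finish.
for job in [
    {"id": 1, "to": "a@example.com"},
    {"id": 2, "to": "b@example.com"},
    {"id": 1, "to": "a@example.com"},   # the same job delivered twice
]:
    jobs.put(job)

jobs.put(None)
consumer.join()
```

With a single consumer the jobs are handled in order; add more consumers and ordering becomes another thing you have to design for.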
6. Consistent Hashing
What it is: A clever way to distribute data across a changing set of servers so that adding or removing a server only moves a small fraction of the data.
What it solves: The rehashing problem. With naive hashing, adding one server reshuffles almost everything. Consistent hashing keeps that disruption minimal.
What it costs: Conceptual complexity, and the need for tricks like virtual nodes to keep the distribution even.
When to reach for it: Any time you're distributing data or requests across a cluster whose size changes (caches, sharded stores). I broke this one down step by step in consistent hashing in a system design interview.
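A small experiment makes the difference measurable. The sketch below builds a hash ring with virtual nodes and counts how many keys change owner when a fifth server joins, next to the same count for naive `hash % N`. `HashRing`, the node names, and the choice of 100 virtual nodes are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring; each node is placed at `vnodes` positions."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted positions on the ring
        self._owners = {}   # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key):
        # First position clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


keys = [f"key-{i}" for i in range(10_000)]

ring = HashRing(["a", "b", "c", "d"])
before = {k: ring.lookup(k) for k in keys}
ring.add("e")
after = {k: ring.lookup(k) for k in keys}

# Fraction of keys that changed owner on the ring vs. with naive modulo.
moved_ring = sum(before[k] != after[k] for k in keys) / len(keys)
moved_mod = sum(_hash(k) % 4 != _hash(k) % 5 for k in keys) / len(keys)
```

Going from four to five servers, you would expect roughly a fifth of the keys to move on the ring and around four fifths to move with modulo. On the ring, every key that moves lands on the new node, so the existing servers never trade data among themselves.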
7. CDN (Content Delivery Network)
What it is: A globally distributed network of servers that cache your static content close to users.
What it solves: Latency for a global audience. Serving an image or video from a city near the user is far faster than from one origin server across the world.
What it costs: Money, plus cache invalidation again (pushing updates to edge locations takes time).
When to reach for it: When you serve static assets (images, video, JS/CSS) to users spread across regions.
8. Rate Limiter
What it is: A guard that caps how many requests a client can make in a given window.
What it solves: Abuse and overload. It protects your system from being overwhelmed, whether by malicious traffic, buggy clients, or a single user hammering an endpoint.
What it costs: You have to store and check counters fast (often in a cache), and you have to decide what to do with rejected requests.
When to reach for it: Any public API, login endpoint, or any resource you need to protect from spikes and abuse.
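A fixed-window counter is about the simplest limiter that matches the description above. This is a sketch, not a production design: `FixedWindowLimiter` is a hypothetical name, the counters live in a local dict rather than a shared cache, old windows are never cleaned up, and the clock is faked for determinism.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allows at most `limit` requests per client in each fixed window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._counts = defaultdict(int)  # (client, window index) -> count

    def allow(self, client_id):
        window_index = int(self.clock() // self.window)
        key = (client_id, window_index)
        if self._counts[key] >= self.limit:
            return False                 # over the cap: reject
        self._counts[key] += 1
        return True


now = [0.0]  # fake clock, advanced by hand below
limiter = FixedWindowLimiter(limit=3, window_seconds=60, clock=lambda: now[0])

alice = [limiter.allow("alice") for _ in range(5)]
bob = limiter.allow("bob")               # other clients are unaffected
now[0] = 60.0                            # the next window starts
alice_next_window = limiter.allow("alice")
```

What the caller does with a `False` is the other half of the cost: reject with an error, queue the request, or degrade the response.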
9. Database Indexing
What it is: A data structure that lets the database find rows without scanning the entire table.
What it solves: Slow reads. An index turns a full-table scan into a fast lookup, dramatically speeding up queries.
What it costs: Slower writes (every write must update the index) and extra storage. Over-indexing is a real mistake.
When to reach for it: When a query is slow and it filters or sorts on a specific column. This is often the cheapest performance win available.
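You can watch an index change a query plan using Python's built-in `sqlite3` module. The table, column, and index names below are made up, and the exact wording of the plan varies between SQLite versions, so the sketch only captures the plan text rather than promising specific output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return SQLite's query plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


query = "SELECT name FROM users WHERE email = ?"
args = ("user500@example.com",)

plan_before = plan(query, args)   # expect a scan of the whole table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query, args)    # expect a search using idx_users_email

result = conn.execute(query, args).fetchone()
```

Printing `plan_before` and `plan_after` should show the plan switching from a table scan to an index search. The same habit, checking the plan before and after, also tells you when an index is not being used and is only costing you on writes.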
10. Data Modeling
What it is: Deciding how your data is structured and stored: relational tables versus documents, how entities relate, what you optimize for.
What it solves: Almost everything downstream. The right model makes your common queries easy and fast; the wrong one makes every later decision harder.
What it costs: Time and foresight up front, and the reality that changing the model later is painful.
When to reach for it: First. Before you pick components, understand your data and its access patterns. This is the foundation the other nine blocks sit on.
How the blocks combine
Here's the payoff. Take a simple service and watch the blocks appear as constraints arrive:
- Start with one server and a database (data modeling).
- Traffic grows, so you add more servers behind a load balancer.
- Reads are slow, so you add a cache and some indexes.
- Reads still outpace one database, so you add replication.
- The data outgrows one machine, so you shard it (using consistent hashing).
- Some work doesn't need to be instant, so you offload it to a message queue.
- Users are global, so static assets move to a CDN.
- The public API needs protection, so you add a rate limiter.
Every block entered the design because a specific problem demanded it, not because a checklist said so. That's exactly how you should reason in a real design.
What to learn next
Start with data modeling and leave sharding and consistent hashing for last, since they're the most advanced. The numbered order above works for everything in between. Write your own three-line summary for each: solves, costs, when. Then start composing them into full systems.
If you're brand new and even the prerequisites feel shaky, start with learning system design from scratch. And when you're ready to see how deep each block should go at your career stage, follow the full path in this roadmap by career level.
Master the parts, and the whole stops being intimidating. Every "complex" architecture you've ever admired is just these blocks, combined to answer specific constraints.

