The Ultimate System Design Interview Cheatsheet
System design interviews can feel overwhelming: there's a mountain of concepts, and you never know which ones will come up. I put together a visual cheatsheet that covers the most essential topics, organized so you can see the big picture at a glance.
Here's a topic-by-topic breakdown of everything on it.
1. Non-Functional Characteristics
Before designing anything, clarify the non-functional requirements (the "-ilities" and their relatives): availability, scalability, reliability, maintainability, latency, throughput, and consistency. These drive every architectural decision you'll make.
Interview tip: Always ask about expected scale (QPS, data size, latency SLAs) before diving into a design.
2. CAP Theorem
A distributed system can guarantee at most two of three properties at the same time:
- Consistency: every read gets the latest write
- Availability: every request gets a response
- Partition Tolerance: the system keeps working despite network splits
In distributed systems, P is non-negotiable, so the real choice is what to give up when a partition occurs: availability (CP: banking, inventory) or consistency (AP: social feeds, DNS).
3. Horizontal vs. Vertical Scaling
| | Vertical | Horizontal |
|---|---|---|
| How | Bigger machine | More machines |
| Limit | Hardware ceiling | Theoretically unlimited |
| Cost | Rises steeply at the high end | Roughly linear |
| Complexity | Low | High (needs load balancing, data partitioning) |
Most production systems at scale rely on horizontal scaling, because a single machine eventually hits a hardware ceiling.
4. DNS (Domain Name System)
DNS translates human-readable domains to IP addresses. Key concepts:
- Recursive resolvers do the heavy lifting
- TTL controls caching duration
- Geographic DNS routes users to the nearest data center
For system design, think about DNS as your first layer of traffic routing.
5. Load Balancing
Distributes traffic across multiple servers. Common algorithms:
- Round Robin: simple rotation
- Least Connections: route to the least busy server
- IP Hash: sticky sessions by client IP
- Weighted: more traffic to beefier servers
Works at Layer 4 (TCP) or Layer 7 (HTTP). Use health checks to automatically remove dead backends.
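To make two of the algorithms above concrete, here is a minimal in-process sketch of round robin and least connections. The class and method names (`RoundRobinBalancer`, `LeastConnectionsBalancer`, `pick`, `acquire`, `release`) are made up for illustration. A real load balancer would also track health checks and handle concurrency.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Routes each request to the backend with the fewest open connections."""

    def __init__(self, backends):
        # backend -> number of currently open connections
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # Ties are broken by insertion order of the backends.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call this when the request finishes.
        self.active[backend] -= 1


rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
rotation = [rr.pick() for _ in range(4)]  # wraps around to app-1 on the 4th pick
```

Round robin ignores how long requests take, so it suits uniform workloads. Least connections adapts better when some requests are much slower than others.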
6. API Gateway
A single entry point for all client requests. Handles:
- Authentication & authorization
- Rate limiting
- Request routing & transformation
- TLS (SSL) termination
- Logging & analytics
Think of it as the front door to your microservices architecture.
7. Content Delivery Network (CDN)
Caches static assets (images, CSS, JS, video) at edge locations close to users.
- Push CDN: you upload content proactively
- Pull CDN: fetches from origin on first request
Reduces latency dramatically. Pair with proper cache-control headers for best results.
8. Caching
The fastest database query is the one you never make.
- Layers: Browser cache → CDN cache → Application cache → Database cache
- Tools: Redis, Memcached
- Strategies: Cache-aside, Write-through, Write-behind, Read-through
Watch out for: cache invalidation (hard), thundering herd, and stale data.
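As a rough illustration of the cache-aside strategy, the sketch below checks an in-memory dictionary first and only calls the database loader on a miss or after a TTL expires. `CacheAside`, `load_from_db`, and the injectable `clock` are assumptions made for this example. In practice the dictionary would be Redis or Memcached.

```python
import time


class CacheAside:
    """Cache-aside: the application checks the cache, then falls back to the DB."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical function: key -> value
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]            # cache hit, no database call
        value = self._load(key)        # miss or expired: go to the database
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._entries.pop(key, None)
```

The TTL bounds how stale an entry can get, and `invalidate` covers the write path. Neither addresses the thundering herd problem, which usually needs request coalescing or locking on top.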
9. Polling vs. WebSockets
| | Polling | WebSockets |
|---|---|---|
| Direction | Client → Server | Bidirectional |
| Latency | Depends on interval | Real-time |
| Overhead | New HTTP request each time | Single persistent connection |
| Use case | Email checks, dashboards | Chat, live feeds, gaming |
Long polling is a middle ground: the server holds the connection open until data is available.
10. Forward & Reverse Proxy
- Forward proxy: sits in front of clients (VPN, ad blockers, corporate firewalls)
- Reverse proxy: sits in front of servers (load balancer, API gateway, Nginx)
Both hide the real origin. Reverse proxies are a fundamental building block of scalable systems.
11. Consistent Hashing
Solves the "what happens when we add/remove servers" problem.
- Maps both servers and keys to a hash ring
- When a server is added/removed, only about K/N keys need to be remapped on average (K keys, N servers), not all of them
- Used in distributed caches, database sharding, CDNs
Virtual nodes spread each server across many points on the ring, which evens out the distribution.
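The ring can be sketched in a few lines. This is a minimal illustration rather than a production implementation. The `HashRing` class, the `vnodes` parameter, and the choice of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Any stable, well-mixed hash works; MD5 is used here only for placement.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node is placed at `vnodes` points on the ring.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.lookup("user:42")
```

Adding a fourth node only takes over the keys that now fall just before its virtual nodes. Every other key keeps its previous owner, which is the property that makes rebalancing cheap.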
12. Database Types
A quick taxonomy:
- Relational (SQL): MySQL, PostgreSQL. Structured data, ACID transactions
- Document: MongoDB. Flexible schemas, JSON-like storage
- Key-Value: Redis, DynamoDB. Blazing fast lookups
- Column-Family: Cassandra, HBase. Wide-column, high write throughput
- Graph: Neo4j. Relationships are first-class citizens
- Time-Series: InfluxDB. Metrics, IoT data
Pick the right tool for the job. There's no "best" database.
13. SQL vs. NoSQL
| | SQL | NoSQL |
|---|---|---|
| Schema | Fixed | Flexible |
| Scaling | Vertical (mostly) | Horizontal |
| Transactions | Strong ACID | Eventual consistency (usually) |
| Joins | Native | Application-level |
| Best for | Complex queries, relationships | Scale, flexibility, speed |
Modern apps often use both: SQL for transactional data, NoSQL for caching/analytics.
14. Database Scaling
Two main strategies:
Read Replicas
- Copy data to multiple follower nodes
- Reads spread across replicas
- Writes go to the leader only
Sharding
- Split data across multiple databases
- Each shard holds a subset of the data
- Hard problems: cross-shard queries, rebalancing
15. Indexes
An index is an auxiliary data structure that lets the database find rows without a full table scan: O(log n) lookups with a B-tree, or O(1) on average with a hash index (equality lookups only).
- Single-column vs. composite indexes
- Covering index: query answered entirely from the index
- Trade-off: faster reads, slower writes (index maintenance overhead)
Rule of thumb: index columns used in `WHERE`, `JOIN`, and `ORDER BY`.
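One way to see the effect of an index is to compare query plans before and after creating it. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `users` table, the `idx_users_email` index, and the `plan` helper are made up for the example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users (email, country) VALUES (?, ?)",
    [(f"user{i}@example.com", "US") for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT email FROM users WHERE email = 'user42@example.com'"

plan_before = plan(query)   # expected to describe a full scan of users
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)    # expected to describe a search using idx_users_email

print(plan_before)
print(plan_after)
```

Because the query only selects `email`, the index alone can answer it, so this also happens to be an instance of a covering index.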
16. Leader Election
In distributed systems, you often need a single coordinator:
- Raft: consensus algorithm designed to be understandable (used by etcd, Consul)
- Paxos: the classic (harder to implement)
- ZooKeeper: battle-tested coordination service
Used in database replication, distributed locks, and task schedulers.
17. Message Queues
Decouple producers from consumers:
- Kafka: high throughput, durable, great for event streaming
- RabbitMQ: traditional broker, flexible routing
- SQS: managed, serverless-friendly
Benefits: buffering, async processing, retry logic, fan-out.
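The decoupling idea can be sketched in-process with Python's standard `queue.Queue` standing in for a real broker such as Kafka, RabbitMQ, or SQS. The `run_pipeline` function and the "double the job" work are placeholders invented for this example.

```python
import queue
import threading


def run_pipeline(jobs, worker_count=2):
    """Producer puts jobs on a bounded queue; workers consume them concurrently."""
    q = queue.Queue(maxsize=10)  # bounded: the producer blocks when it is full
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            if job is None:          # sentinel: no more work for this worker
                q.task_done()
                return
            with lock:
                results.append(job * 2)  # placeholder for real processing
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:
        q.put(job)                   # the producer never calls a worker directly
    for _ in threads:
        q.put(None)                  # one sentinel per worker
    q.join()                         # wait until every item is processed
    for t in threads:
        t.join()
    return results


processed = run_pipeline(range(20))
```

The producer only knows about the queue, so workers can be added, removed, or slowed down without changing producer code. A real broker adds durability and redelivery on top of this.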
18. Event-Driven Architecture
Systems communicate through events rather than direct calls:
- Event producer → Event bus → Event consumer
- Enables loose coupling and independent scaling
- Patterns: Event sourcing, CQRS, Saga
Think: "When X happens, trigger Y" at scale.
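A tiny in-process event bus shows the producer, bus, and consumer roles. The `EventBus` class and the `order_placed` event name are illustrative assumptions. A distributed system would put a broker or streaming platform in the middle instead.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish/subscribe bus."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event type -> list of callbacks

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The producer does not know who, if anyone, is listening.
        for handler in list(self._handlers[event_type]):
            handler(payload)


bus = EventBus()
shipping_queue = []
email_queue = []

# Two independent consumers react to the same event.
bus.subscribe("order_placed", shipping_queue.append)
bus.subscribe("order_placed", email_queue.append)

bus.publish("order_placed", {"order_id": 1})
```

Adding a third consumer, for example analytics, needs only another `subscribe` call and no change to the code that publishes the event.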
19. Microservices
Break a monolith into small, independently deployable services:
- Each service owns its data and logic
- Communicate via APIs or message queues
- Trade simplicity for scalability and team autonomy
When to use: large teams, independent scaling needs, polyglot tech stacks.
When not to: small teams, early-stage products.
20. Communication Patterns
- Synchronous: REST, gRPC, GraphQL (request/response)
- Asynchronous: Message queues, event streams (fire and forget)
- gRPC: binary, fast, great for inter-service communication
- GraphQL: client specifies exactly what data it needs
21. Rate Limiting
Protect your system from abuse and overload:
- Token bucket: tokens refill at a fixed rate
- Sliding window: counts requests in a rolling time window
- Leaky bucket: processes at a constant rate
Implement at the API gateway level. Return `429 Too Many Requests` with a `Retry-After` header.
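Here is a minimal token bucket sketch, assuming a single process and one bucket per client. `TokenBucket`, `rate`, `capacity`, and the injectable `clock` are names chosen for the example. A gateway would typically keep this state in a shared store.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self._clock = clock           # injectable so refills can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True if the request may proceed, consuming `cost` tokens."""
        now = self._clock()
        # Refill based on elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller would respond with 429 Too Many Requests


bucket = TokenBucket(rate=5, capacity=10)
first_request_allowed = bucket.allow()
```

The `capacity` controls how large a burst is tolerated, while `rate` sets the sustained throughput. That split is the main reason token buckets are popular for public APIs.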
22. Idempotency
The same request applied multiple times has the same effect as once.
Why it matters: network retries, message queue redelivery, double-clicks.
How: use idempotency keys. The client sends a unique key, and the server deduplicates.
Critical for payment systems and any write operation.
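The idempotency-key approach might look roughly like this. `PaymentService` and its in-memory dictionary are hypothetical. A real service would store keys in a database with a unique constraint, so that the check and the write happen atomically.

```python
class PaymentService:
    """Deduplicates charge requests by client-supplied idempotency key."""

    def __init__(self):
        self._responses = {}  # idempotency key -> stored response
        self.charges = []     # side effects that must not be duplicated

    def charge(self, idempotency_key, account, amount_cents):
        # A retry with the same key returns the original response unchanged.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]

        self.charges.append((account, amount_cents))
        response = {"status": "charged", "charge_id": len(self.charges)}
        self._responses[idempotency_key] = response
        return response


service = PaymentService()
first = service.charge("key-123", "acct-1", 500)
retry = service.charge("key-123", "acct-1", 500)  # e.g. a network retry
```

The retry returns the stored response and the account is charged once. Keys are usually expired after some retention window, which this sketch leaves out.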
23. Bloom & Cuckoo Filters
Probabilistic data structures for "is this element in the set?"
- Bloom filter: space-efficient, no false negatives, possible false positives
- Cuckoo filter: supports deletion, and is often more space-efficient at low target false positive rates
Use cases: cache hit prediction, spam filtering, preventing duplicate writes.
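A compact Bloom filter sketch follows. The sizing defaults and the double-hashing scheme (deriving all positions from two halves of one SHA-256 digest) are choices made for this example, not the only way to build one.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two base hashes.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("alice@example.com")
maybe_present = seen.might_contain("alice@example.com")
```

Bits are only ever set, never cleared, which is why there are no false negatives and also why a plain Bloom filter cannot support deletion.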
24. Single Point of Failure (SPOF)
Any component whose failure brings down the entire system.
Eliminate SPOFs with:
- Redundancy (multiple instances)
- Failover mechanisms
- Health checks + automatic recovery
- Geographic distribution
Interview mantra: "What happens when this component dies?"
25. Heartbeat
Periodic "I'm alive" signals between components.
- Server sends heartbeat to a monitor at regular intervals
- If heartbeat is missed → mark as unhealthy → trigger failover
- Used in: leader election, cluster management, load balancer health checks
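The monitor side of a heartbeat scheme can be sketched as a table of last-seen timestamps. `HeartbeatMonitor`, `beat`, and `unhealthy` are illustrative names, and the failover action itself is left out.

```python
import time


class HeartbeatMonitor:
    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock            # injectable so timeouts can be tested
        self._last_seen = {}           # node -> time of last heartbeat

    def beat(self, node):
        """Record an "I'm alive" signal from a node."""
        self._last_seen[node] = self._clock()

    def unhealthy(self):
        """Return nodes whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen > self._timeout
        )


monitor = HeartbeatMonitor(timeout_seconds=5)
monitor.beat("db-1")
currently_unhealthy = monitor.unhealthy()  # db-1 just reported, so it is healthy
```

The timeout is usually set to a few missed intervals rather than one, so that a single delayed packet does not trigger an unnecessary failover.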
26. Checksum
Detects data corruption during transfer or storage.
- MD5: fast but not cryptographically secure
- SHA-256: secure, widely used
- CRC32: fast, good for error detection
Applied at: file transfers, network packets, distributed storage verification.
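Computing and verifying checksums takes only a few lines with Python's standard `hashlib` and `zlib` modules. The `checksums` and `verify` helpers are made up for the example.

```python
import hashlib
import zlib


def checksums(data: bytes) -> dict:
    """Compute a fast error-detection code and a cryptographic digest."""
    return {
        "crc32": zlib.crc32(data),                    # catches accidental corruption
        "sha256": hashlib.sha256(data).hexdigest(),   # also resists tampering
    }


def verify(data: bytes, expected: dict) -> bool:
    """Recompute the checksums and compare them with the stored values."""
    return checksums(data) == expected


original = b"hello world"
stored = checksums(original)          # sent or stored alongside the data

intact = verify(original, stored)
corrupted = verify(b"hello worle", stored)  # one byte changed in transit
```

CRC32 is enough to catch accidental bit flips. When an attacker could modify the data on purpose, a cryptographic hash such as SHA-256 is the safer choice.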
27. Database Replication
Copy data across multiple nodes:
- Synchronous: a write is confirmed only after the replicas acknowledge it (strong consistency, higher latency)
- Asynchronous: a write is confirmed as soon as the leader commits it, and replicas catch up later (eventual consistency, lower latency)
Leader-follower is the most common pattern. Multi-leader and leaderless setups serve more advanced use cases.
28. Database Sharding & Partitioning
- Sharding: horizontal split across databases/servers
- Partitioning: split within a single database
Sharding strategies:
- Range-based: by date, ID range
- Hash-based: hash the shard key
- Directory-based: lookup table
Hard parts: rebalancing, cross-shard joins, hotspot avoidance.
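Hash-based and range-based routing can each be expressed as a small function. The function names, the `user:42` key format, and the range boundaries below are assumptions for illustration only.

```python
import bisect
import hashlib


def hash_shard_for(key: str, shard_count: int) -> int:
    """Hash-based: spread keys evenly, at the cost of efficient range scans."""
    # A stable hash is used because Python's built-in hash() for strings
    # is randomized per process.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def range_shard_for(user_id: int, upper_bounds: list) -> int:
    """Range-based: shard i holds ids below upper_bounds[i] (exclusive)."""
    idx = bisect.bisect_right(upper_bounds, user_id)
    if idx == len(upper_bounds):
        raise ValueError(f"no shard covers id {user_id}")
    return idx


UPPER_BOUNDS = [1000, 2000, 3000]  # hypothetical boundaries for 3 shards

hash_shard = hash_shard_for("user:42", shard_count=4)
range_shard = range_shard_for(1500, UPPER_BOUNDS)
```

With plain modulo hashing, changing `shard_count` remaps most keys, which is one reason rebalancing is listed as a hard part. Range-based routing keeps neighboring ids together but can create hotspots when recent ids receive most of the traffic.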
Final Thoughts
This cheatsheet covers the 28 core concepts that come up again and again in system design interviews. You don't need to memorize everything. Focus on understanding when and why to use each one.
The real skill in system design isn't knowing the tools. It's knowing which tools to reach for, and being able to explain your tradeoffs clearly.
Good luck on your next interview.
What system design topic do you find trickiest? Drop a comment below!
