Gabrielle Niamat

Posted on Jul 3 • Edited on Jul 5

Navigating the New Grad SWE Job Hunt: System Design Interviews - Part 2

#systemdesign #architecture #interview #career

Part 4: The System Design Interview - Continued
1. Caching 🗂️
    1.1 What is Caching?
    1.2 Content Delivery Networks (CDNs)
2. Proxies & Load Balancers 🔀
3. Storage 🗄️
    3.1 SQL vs. NoSQL
    3.2 Object Storage
    3.3 Replication
    3.4 CAP Theorem & Consistency
    3.5 Sharding
4. Wrap Up

Part 4: The System Design Interview - Continued

Hi there, welcome back! 👋🏼 I recognize it's been over a year since my last post; life got in the way, and honestly between work and some personal stuff, writing just wasn't in the cards for a while. But I'm back for 2026, and I'm excited to finally close out this new grad guide before shifting my writing toward more intermediate-level content.

If you're new to this series, I'd recommend starting from the very beginning here, or at least checking out part 1, which covers computer and application architecture, networking (TCP/UDP, DNS), and APIs (REST, GraphQL, WebSockets, gRPC). This article picks up right where that one left off - covering caching, proxies, and storage to round out the fundamentals.

Alright, enough preamble. Let's get into it!

1. Caching 🗂️

1.1 What is Caching?

Caching is the practice of storing copies of data in a fast, temporary location so you don't have to fetch it from the original (slower) source every time. The two main benefits are improving performance and reducing load on your databases and downstream services.

Cache: A high-speed data storage layer that holds a subset of data so that future requests are served faster than by accessing the primary storage location.

Cache levels - Caching can happen at multiple layers:

Client-side: Browsers cache assets like images and scripts locally.
CDN: Edge servers cache content close to the user geographically (more on this below).
Server-side: The application caches frequently accessed data in memory using tools like Redis or Memcached.

Key terms:

Cache Hit: The requested data is found in the cache ✅
Cache Miss: The data isn't found — the system falls back to the primary source ❌
Cache Hit Ratio: The percentage of requests served from cache. Higher = better.
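
To make those terms concrete, here is a minimal Python sketch of a cache that counts hits and misses. The class name CountingCache and the plain dict standing in for the primary source are made up for illustration; a real setup would put something like Redis in front of a database.

```python
class CountingCache:
    """Toy in-memory cache that tracks hits and misses."""

    def __init__(self, source):
        self.source = source  # stand-in for the slower primary store
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1             # cache hit
            return self.cache[key]
        self.misses += 1               # cache miss
        value = self.source[key]       # fall back to the primary source
        self.cache[key] = value        # keep a copy for next time
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = CountingCache({"user:1": "Ada"})
cache.get("user:1")        # miss: loaded from the source
cache.get("user:1")        # hit: served from the cache
print(cache.hit_ratio())   # 0.5
```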

Write strategies — when new data is written, you need a way to keep the cache and database in sync:

Write-through: Written to cache and database simultaneously. Consistent, but slower writes.
Write-back (Write-behind): Written to cache first, database updated asynchronously. Faster writes, but risk of data loss if the cache goes down.
Write-around: Written directly to the database, bypassing the cache. Prevents caching data that likely won't be re-read soon.
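
The three write strategies can be sketched side by side. This is a toy model, not how any particular cache implements them: the Store class, its dict-based cache and database, and the flush method are all assumptions made for illustration.

```python
from collections import deque


class Store:
    """Toy cache + database pair; both are plain dicts."""

    def __init__(self):
        self.cache = {}
        self.db = {}
        self.pending = deque()  # writes not yet applied to the database

    def write_through(self, key, value):
        # Update cache and database before acknowledging the write.
        self.cache[key] = value
        self.db[key] = value

    def write_back(self, key, value):
        # Update the cache now and queue the database write for later.
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self):
        # In a real system this would run asynchronously in the background.
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value

    def write_around(self, key, value):
        # Write to the database only and drop any stale cached copy.
        self.db[key] = value
        self.cache.pop(key, None)
```

With write_back, anything still sitting in pending is lost if the cache process dies before flush runs, which is the data-loss risk mentioned above.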

Eviction policies — when a cache is full, something has to go:

LRU (Least Recently Used): Removes the item accessed longest ago.
LFU (Least Frequently Used): Removes the item accessed fewest times.
FIFO (First In, First Out): Removes the oldest item regardless of access frequency.
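
LRU is the policy you're most likely to be asked to sketch. One common way to do it in Python is with collections.OrderedDict; the LRUCache class below is an illustrative sketch under that assumption, not the eviction code of any real cache.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # over capacity, so "b" is evicted
print(cache.get("b"))  # None
```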

Popular server-side caching tools are Redis and Memcached. Redis is the more common recommendation since it supports richer data structures, optional persistence, and replication — making it useful beyond just caching (e.g., session management, pub/sub, leaderboards).

Cache invalidation (deciding when to remove or update stale data) is genuinely one of the trickier problems in distributed systems, and it's worth knowing about even if you never have to implement it yourself. The two most common approaches are setting a TTL (Time-to-Live) on cached entries so they expire automatically, or explicitly invalidating cache entries whenever the underlying data changes.
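
Both invalidation approaches fit in a few lines. The sketch below is a hypothetical TTLCache class (the name and the injectable clock parameter are my own, added so expiry can be tested without sleeping); tools like Redis provide TTLs out of the box.

```python
import time


class TTLCache:
    """Cache whose entries expire ttl_seconds after they are set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self.items = {}     # key -> (value, expires_at)

    def set(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]  # TTL passed: treat as a miss
            return None
        return value

    def invalidate(self, key):
        # Explicit invalidation: call this when the underlying data changes.
        self.items.pop(key, None)
```

A short TTL keeps data fresher but lowers the hit ratio; explicit invalidation is more precise but requires every write path to remember to call it.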

1.2 Content Delivery Networks (CDNs)

A CDN is a geographically distributed network of servers that caches static content — images, videos, CSS, JavaScript — and delivers it to users from the server closest to them, reducing latency.

Two main caching models:

Push CDN: Content is proactively pushed to edge nodes before it's requested. Best for content that doesn't change often.
Pull CDN: Content is fetched from the origin on the first request, then cached on the CDN going forward. Better for large sites with lots of assets.

2. Proxies & Load Balancers 🔀

A proxy is an intermediary server that sits between a client and a backend. There are two types:

Forward Proxy: Sits in front of the client — the server doesn't know who the original client is. Common use cases: VPNs, corporate firewalls.
Reverse Proxy: Sits in front of the server — the client doesn't know which backend is handling its request. Nginx is a popular example. In system design, when someone says "proxy," they typically mean this.

A load balancer is a type of reverse proxy that distributes incoming traffic across multiple servers. Common strategies:

Round Robin: Requests are sent to servers in rotation.
Least Connections: Traffic goes to whichever server has the fewest active connections.
IP Hashing: A client's IP is hashed to consistently route them to the same server — useful for maintaining session state.
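
Each strategy is only a few lines when sketched in Python. The server names and function names below are made up for illustration, and real load balancers such as Nginx implement these internally with health checks and weighting on top.

```python
import hashlib
from itertools import cycle

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical backend names


def round_robin(servers):
    # Returns an iterator that yields servers in rotation, forever.
    return cycle(servers)


def least_connections(active):
    # active maps each server to its current number of open connections.
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    # The same IP always hashes to the same index, so the same server.
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rotation = round_robin(SERVERS)
print([next(rotation) for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']
print(least_connections({"app-1": 5, "app-2": 2, "app-3": 7}))  # app-2
```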

Regular vs. Consistent Hashing

With regular (modular) hashing (hash(key) % num_servers), adding or removing a server causes almost every key to be remapped, which is essentially a full reshuffle every time. Consistent hashing avoids this by mapping both servers and keys onto a virtual ring, with each key assigned to the next server clockwise from its position. When a server is removed, only the keys it owned move to a neighbor; when a server is added, it takes over only a slice of its neighbors' keys. This minimizes disruption during scaling and is widely used in distributed caches and databases.
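
Here is a minimal hash ring sketch in Python to show the idea. The HashRing class, the choice of SHA-256, and the 100 virtual nodes per server are all assumptions for illustration; production implementations differ in the details.

```python
import bisect
import hashlib


def _hash(value):
    # Map any string to a position on the ring (a 64-bit integer).
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring; assumes at least one server is present."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes  # virtual nodes per server, for a more even spread
        self.ring = []        # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        self.ring = [(pos, s) for pos, s in self.ring if s != server]

    def lookup(self, key):
        # First ring entry clockwise from the key, wrapping at the end.
        idx = bisect.bisect_right(self.ring, (_hash(key), ""))
        return self.ring[idx % len(self.ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("cache-d")
moved = [k for k in keys if ring.lookup(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to cache-d")
```

Every key that moves lands on the new server, and the rest stay put. With hash(key) % num_servers, going from 3 to 4 servers would remap roughly three quarters of the keys.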

Beyond routing traffic, reverse proxies can also handle SSL/TLS termination, security (hiding internal server structure), caching, logging, and response compression, all in one place.

3. Storage 🗄️

3.1 SQL vs. NoSQL

One of the most common system design questions is: "What kind of database would you use, and why?"

SQL (Relational Databases) store data in structured tables with predefined schemas. Popular examples: PostgreSQL, MySQL. Most use B+ trees as their primary index structure, which keeps data sorted and supports efficient range queries.

SQL databases follow ACID properties:

Atomic: Every operation in a transaction succeeds completely, or none of it does. No partial writes.
Consistent: A transaction can only bring the database from one valid state to another — no rule-breaking in between.
Isolated: Concurrent transactions don't interfere with each other; at the strictest isolation level, the result is the same as if they had run one at a time.
Durable: Once a transaction is committed, it stays committed — even if the system crashes immediately after.
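
Atomicity is the easiest of the four to see in code. The sketch below uses Python's built-in sqlite3 module; the accounts table, the make_db and transfer helpers, and the names alice and bob are made up for illustration.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed, so nothing was applied


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_db()
print(transfer(conn, "alice", "bob", 500))  # False
print(balances(conn))  # {'alice': 100, 'bob': 50}
```

The failed transfer credits bob first and only then hits the constraint on alice's balance, yet bob's balance is unchanged afterwards: the whole transaction was rolled back, not just the failing statement.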

NoSQL (Non-Relational Databases) trade some structure and strict consistency for flexibility and horizontal scalability. Instead of ACID, most follow BASE:

Basically Available: The system guarantees availability — it will always return a response, even if it's stale or partial.
Soft State: The state of the system can change over time, even without new input, as replicas sync up.
Eventual Consistency: Given enough time, all replicas will eventually get the update/correct value.

3.2 Object Storage

Object storage is designed for large, unstructured files — images, videos, backups — rather than queryable data. Amazon S3 is the most widely used example. Unlike traditional file systems, there's no folder hierarchy — just a flat structure of objects, each with a unique key and metadata.

Key properties: highly durable, cost-effective at scale, accessible via HTTP API. Not suitable for frequently updated records or low-latency lookups.

3.3 Replication

Replication is the practice of copying data across multiple database instances to improve availability, read performance, and fault tolerance.

Synchronous replication: The primary waits for replicas to confirm the write before returning success. Always consistent, but slower.
Asynchronous replication: The primary confirms immediately; replicas catch up later. Faster, but there's a brief window of inconsistency.
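
The difference shows up in what a replica can see right after a write is acknowledged. This is a toy in-process simulation with hypothetical Primary and Replica classes; real databases ship a replication log over the network instead of calling methods directly.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.backlog = []  # writes not yet shipped to replicas (async only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Wait for every replica to apply the write before acknowledging.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Acknowledge right away; replicas catch up later.
            self.backlog.append((key, value))
        return "ok"

    def replicate_backlog(self):
        for key, value in self.backlog:
            for replica in self.replicas:
                replica.apply(key, value)
        self.backlog.clear()


replica = Replica()
primary = Primary([replica], synchronous=False)
primary.write("x", 1)
print(replica.data.get("x"))  # None (replica is lagging)
primary.replicate_backlog()
print(replica.data.get("x"))  # 1
```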

Architectures:

Leader-Follower: One primary handles writes; replicas serve reads. Simple, but the primary is a single point of failure for writes.
Leader-Leader (Multi-Primary): Multiple nodes accept writes. More resilient, but introduces conflict resolution complexity.

3.4 CAP Theorem & Consistency

The CAP Theorem states that a distributed data store can only guarantee two out of three of the following properties simultaneously:

Consistency: Every read returns the most recent write, or an error.
Availability: Every request gets a response (not necessarily the freshest data).
Partition Tolerance: The system keeps running even if some nodes can't communicate.

Since network partitions are inevitable in real distributed systems, P is always required — so the real trade-off is between C and A:

CP systems (e.g. HBase): Refuse requests during a partition rather than return stale data.
AP systems (e.g. Cassandra, DynamoDB): Stay available during partitions but may return slightly outdated data.

This connects to the strong vs. eventual consistency trade-off: strong consistency guarantees every read reflects the latest write (higher latency); eventual consistency allows brief inconsistencies in exchange for better performance and availability.
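
As a rough illustration of the C vs. A choice, here is a toy replica node that behaves differently during a partition depending on its mode. The ReplicaNode class, the Unavailable exception, and the mode flag are invented for this sketch; real systems make this trade-off through replication and quorum settings rather than a single switch.

```python
class Unavailable(Exception):
    pass


class ReplicaNode:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = "v1"         # last value this node received
        self.partitioned = False  # True when cut off from the other nodes

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Can't confirm this is the latest write, so refuse to answer.
            raise Unavailable("partitioned: cannot guarantee latest value")
        # AP nodes (and healthy CP nodes) answer with what they have.
        return self.value


ap_node = ReplicaNode("AP")
ap_node.partitioned = True
print(ap_node.read())  # v1 (possibly stale, but the request succeeds)

cp_node = ReplicaNode("CP")
cp_node.partitioned = True
try:
    cp_node.read()
except Unavailable as err:
    print("CP node refused:", err)
```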

3.5 Sharding

Sharding is a horizontal partitioning strategy where data is split across multiple databases (shards), each holding a subset of the total dataset. It's how you scale a database beyond what a single machine can handle.

Common sharding strategies:

Range-based: Partition by a value range (e.g. user IDs 0–25M on shard A, 25M–50M on shard B). Simple, but can lead to uneven load if some ranges are more active ("hot shards").
Hash-based: Hash the shard key to assign records. Distributes data evenly, but makes range queries harder.
Directory-based: A lookup table maps records to their shard. Very flexible, but the lookup table itself can become a bottleneck and a single point of failure.
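
Each strategy boils down to a different routing function. The sketch below is illustrative only: the shard names, the tenant directory, and the assumption that range upper bounds are exclusive (so ID 25M lands on shard B) are my own choices.

```python
import bisect
import hashlib

# Range-based: exclusive upper bounds for shard A and shard B.
RANGE_BOUNDS = [25_000_000, 50_000_000]
RANGE_SHARDS = ["shard-A", "shard-B", "shard-C"]


def range_shard(user_id):
    # Find which range the ID falls into.
    return RANGE_SHARDS[bisect.bisect_right(RANGE_BOUNDS, user_id)]


def hash_shard(user_id, num_shards=4):
    # Hash-based: neighboring IDs scatter across shards.
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Directory-based: an explicit lookup table (hypothetical tenants).
DIRECTORY = {"tenant-acme": "shard-B", "tenant-globex": "shard-A"}


def directory_shard(tenant):
    return DIRECTORY[tenant]


print(range_shard(10))                 # shard-A
print(range_shard(25_000_000))         # shard-B
print(directory_shard("tenant-acme"))  # shard-B
```

Note that hash_shard uses plain modulo, so changing num_shards remaps most keys.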

The main downside of sharding is that cross-shard queries are expensive and complex. Re-balancing data when adding or removing shards is also non-trivial — another place where consistent hashing helps.

4. Wrap Up

That wraps up the system design series! 🎉

Across both parts, we covered computer architecture, scalability fundamentals, networking, APIs, caching, proxies, and storage. As I mentioned in Part 1 — formal system design rounds are rare for new grad roles in Canada, but having a solid handle on these fundamentals will make a real difference when verbal technical questions come up, and it'll set you up well for your first few months on the job.

Start with the core concepts, practice explaining them out loud, and don't stress about memorizing every detail. The more you read, design, and discuss these systems, the more naturally it all clicks.

And honestly, I wouldn't worry too much about knowing these topics in detail. I was personally asked about them at a higher level, and managers were often impressed that I knew some fundamentals at all!

This post was written with the help of AI but all thoughts and opinions are my own.
