System design has become one of the most important rounds in a software developer job interview, especially for senior-level positions. Preparing for this round is necessary to clear interviews at popular companies like Amazon, Netflix, Google, and Twitter.
In this round, you are expected to discuss the design of a large-scale distributed system like Twitter, Uber, Facebook, or Dropbox.
There is no single right solution when it comes to designing a system. A problem can be solved in multiple ways, so this round is open-ended, and the focus is on a working design and your thought process.
Because they lack experience in building scalable systems in their everyday work, a lot of developers struggle with this round. Even good software developers fail to get into good companies just because they cannot clear it.
So it's important to focus on this round and prepare well to get into your dream company.
Great engineers spend years building a robust and scalable system. You cannot possibly come up with a similar solution in the 1–1.5 hours of a System Design interview.
So generally only a part of the entire system is discussed in this round.
As mentioned earlier, there is no single right solution for designing a good system. This round is generally open-ended, and the discussion can go in any direction depending on what the interviewer is interested in and how you lead it.
- Design a chat service like WhatsApp
- Design a parking lot
- Design a URL shortener service like TinyURL
- Design a video streaming service like YouTube/Netflix
- Design a file sharing service like Google Drive
- Design a social media platform like Instagram, Twitter, or Facebook
These are some of the popular questions that are asked in this round. As you can see the questions are vague and the problem statement doesn't provide specific details on what part of the system should be designed.
So, it's important to know how to approach this round to come up with a good design at the end of this discussion.
System Design is generally split into two separate rounds:
- High-level Design
- Low-level Design
But again, it depends on the company and its process. So be prepared to design both high-level and low-level components of a system in this round.
The high-level design (HLD) round mainly checks your ability to architect and design high-level components for the given requirements.
For example, given a problem statement like "Design a social media platform like Instagram", you need to come up with the different microservices you will need, the pub-sub mechanism (if needed), queues, databases, caching, etc.
The interviewer will ask questions on how data would flow through different microservices in the design, fault tolerance, retry mechanism, etc. You can also expect questions around non-functional requirements like scalability, data consistency, concurrency, etc.
Building a distributed, scalable system is hard. You need to think about different scenarios while coming up with an architecture. The following concepts are worth knowing for this round:
- Key Characteristics of Distributed Systems
- Load Balancing
- Data Partitioning
- Redundancy and Replication
- SQL vs. NoSQL
- CAP Theorem
- Consistent Hashing
- Long-Polling vs WebSockets vs Server-Sent Events
The low-level design (LLD) round mainly focuses on your ability to design the low-level components of your HLD. Given a problem statement, you should come up with a design covering the different entities, classes and attributes, inheritance, composition, design patterns, databases, tables and schema, etc.
You cannot possibly cover all of these in one interview round. The interviewer may be interested in any one of them, depending on how the discussion goes.
- Use case diagram
- Class diagram
- Database design
- Sequence diagram
- Activity diagram
- Separation of concerns
- OOP principles
- SOLID principles
Here is my 8-step guide to approaching System Design rounds effectively.
System Design problem statements are vague, and there's no one right answer. So it's important to ask a lot of questions and clarify the scope of the discussion.
Candidates who ask a lot of questions have a better chance of success. For example, here is a list of questions you can ask for the Design Instagram question:
Q: What are the different types of accounts possible?
A: Only users. We will not consider business accounts for now.
Q: What type of posts can a user post?
A: Let's consider users can post only images for this discussion. Videos are out of scope.
Q: Can users follow other users?
Q: Should we support tags for each post?
Q: Can posts be private?
A: Let's assume posts can only be public for this discussion.
Q: Can users search for posts?
Q: Should we focus on the client-side architecture or the server-side architecture?
A: I would like to understand the client-server interaction on a high level. But the focus should be more on building a scalable backend system.
Q: How many users are expected to use the system every day?
A: Let's assume we have around 100M users logging in every day.
Q: Should we focus on generating a user's home feed?
A: Let's not focus on the feed generation algorithm.
Q: Should we focus on the high-level or low-level design of the system?
A: Let's discuss the high-level design of the system in this discussion.
These are some example questions you can ask to set the scope of the discussion. We now know that we need to focus on the high-level design of Instagram, that we don't need to consider users posting videos, and that we don't need to generate users' home feeds in an intelligent way.
Once we have the requirements finalized, the next step is to come up with a list of APIs we will need to build the system.
You should define the different REST APIs and their contracts to support the given requirements.
In our Instagram example, here is one API we will need:
storePhoto(user_id, tags, image_url, user_location,…)
Of course, this is only an example API. The APIs used in a real-world application are much more complicated.
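To make the idea of an API contract a little more concrete, here is a minimal Python sketch of what the request and response for `storePhoto` might look like. Everything beyond the parameter names shown above is an assumption for illustration: the `POST /v1/photos` path, the `StorePhotoRequest` and `StorePhotoResponse` types, the status codes, and the `store_photo` handler are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid


@dataclass
class StorePhotoRequest:
    """Body of a hypothetical POST /v1/photos call."""
    user_id: str
    image_url: str
    tags: list[str] = field(default_factory=list)
    user_location: Optional[str] = None


@dataclass
class StorePhotoResponse:
    """What the caller gets back: an HTTP-style status and the new photo's id."""
    status: int
    photo_id: Optional[str] = None
    error: Optional[str] = None


def store_photo(request: StorePhotoRequest) -> StorePhotoResponse:
    """Validate the request and return a response that matches the contract."""
    if not request.user_id or not request.image_url:
        return StorePhotoResponse(status=400, error="user_id and image_url are required")
    # A real service would persist the photo here; this sketch only mints an id.
    return StorePhotoResponse(status=201, photo_id=uuid.uuid4().hex)


response = store_photo(
    StorePhotoRequest(user_id="user-1", image_url="https://example.com/a.jpg", tags=["beach"])
)
print(response.status, response.photo_id)
```

Writing the contract down this way forces you to decide which fields are required, which are optional, and what the caller sees on failure, which is usually what the interviewer wants to hear.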
Your system should have enough storage and resources to handle the expected load and number of users 1 year from now, 5 years from now, and so on.
So it's important to estimate how many resources you need to allocate when designing a system, to avoid running into problems in the future.
If your design includes databases and caches, capacity estimation generally involves calculating how much storage and memory is consumed each day across all the database and cache clusters.
This gives us a good estimate of how much capacity we will need years from now.
Capacity estimation also shapes the design, so that the system can scale horizontally and vertically to meet higher capacity requirements in the future.
Here's a calculation of how much storage is needed to store photos in our Instagram example:
Based on our assumption above, 100M users log in every day
Let's assume 1M of them upload an average of 2 photos each per day
So 2M photos are uploaded every day
This works out to about 23 photos added every second (2M / 86,400 seconds)
Let's assume an average photo size of 400 KB
Storage needed each day: 2M * 400 KB = 800 GB
Storage needed for 5 years:
800 GB * 365 (days a year) * 5 (years) ~= 1,460 TB
Storage needed for 10 years:
800 GB * 365 (days a year) * 10 (years) ~= 2,920 TB
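The same back-of-the-envelope estimate can be written as a short Python script, which makes it easy to change an assumption and see the effect. The variable and function names are mine. Note that the totals depend on the unit convention: with 1,000 GB per TB the five-year figure is 1,460 TB, while with 1,024 GB per TB it is roughly 1,426 TB. The sketch prints both and ignores growth, replication, and backups.

```python
SECONDS_PER_DAY = 24 * 60 * 60

uploading_users = 1_000_000   # users who upload on a given day (assumption from the text)
photos_per_user = 2           # average uploads per uploading user
photo_size_kb = 400           # average photo size in KB

photos_per_day = uploading_users * photos_per_user
photos_per_second = photos_per_day / SECONDS_PER_DAY
storage_per_day_gb = photos_per_day * photo_size_kb / 1_000_000  # KB -> GB, decimal units


def storage_for_years(years: int, gb_per_tb: int = 1000) -> float:
    """Total photo storage in TB after `years`, ignoring growth and replication."""
    return storage_per_day_gb * 365 * years / gb_per_tb


print(f"{photos_per_day:,} photos/day, about {photos_per_second:.0f} per second")
print(f"{storage_per_day_gb:.0f} GB/day")
for years in (5, 10):
    print(
        f"{years} years: {storage_for_years(years):,.0f} TB "
        f"(or {storage_for_years(years, 1024):,.0f} TB at 1,024 GB per TB)"
    )
```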
At this point, it helps to list down all the high-level components that are involved in the system and how they interact with each other.
It can be microservices, databases, caches, messaging queues, low-cost storage services, etc.
For our Instagram example, we can create two separate microservices: one responsible for uploading photos and another for retrieving them.
You may ask what the point of two separate microservices is.
The reason is that photo uploads are generally far less frequent than photo reads, so we can scale the two services separately based on traffic demands.
We can also use low-cost storage for the photos themselves and a separate metadata database for details about each photo, such as uploaded_by, location, image_name, size, and created_at.
You should be able to explain how the data would flow through these high-level components once you have this ready.
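Here is a rough Python sketch of how data might flow through these components. It is only an illustration: `ObjectStore`, `MetadataStore`, `UploadService`, and `ReadService` are hypothetical names, and in-memory dictionaries stand in for the real low-cost storage and metadata database.

```python
import time
import uuid


class ObjectStore:
    """Stand-in for low-cost blob storage (in-memory for this sketch)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class MetadataStore:
    """Stand-in for the metadata database."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def insert(self, photo_id: str, row: dict) -> None:
        self._rows[photo_id] = row

    def find(self, photo_id: str) -> dict:
        return self._rows[photo_id]


class UploadService:
    """Write path: store the image bytes, then record the metadata."""

    def __init__(self, blobs: ObjectStore, meta: MetadataStore) -> None:
        self.blobs, self.meta = blobs, meta

    def upload(self, user_id: str, image_name: str, data: bytes) -> str:
        photo_id = uuid.uuid4().hex
        self.blobs.put(photo_id, data)
        self.meta.insert(photo_id, {
            "uploaded_by": user_id,
            "image_name": image_name,
            "size": len(data),
            "created_at": time.time(),
        })
        return photo_id


class ReadService:
    """Read path: look up the metadata, then fetch the bytes."""

    def __init__(self, blobs: ObjectStore, meta: MetadataStore) -> None:
        self.blobs, self.meta = blobs, meta

    def get_photo(self, photo_id: str) -> tuple[dict, bytes]:
        return self.meta.find(photo_id), self.blobs.get(photo_id)


blobs, meta = ObjectStore(), MetadataStore()
photo_id = UploadService(blobs, meta).upload("user-1", "beach.jpg", b"fake image bytes")
details, data = ReadService(blobs, meta).get_photo(photo_id)
print(details["uploaded_by"], details["image_name"], details["size"])
```

Because the two services share only the storage layers, each one can be deployed and scaled on its own, which is the point of splitting them.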
In DB design, we decide what kind of database suits our use case (SQL or NoSQL), along with the entities and the relationships between them.
If we're building a banking application that moves money, an eventually consistent NoSQL database is usually a poor fit, because transactions need strong consistency guarantees.
Similarly, a SQL database with a rigid schema can be a poor fit where the shape of the data changes very frequently. In our Instagram example, we can use a NoSQL database to store metadata details for a post.
A post may have properties like likes, image_url, created_at, and created_by. If in the future we want to introduce other features like dislikes or tags, a schema-less NoSQL DB lets us add them without any schema changes.
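The schema flexibility described above can be illustrated with a small Python sketch. A plain dictionary stands in for a document collection here; the sample values and the `summarize` helper are made up for the example and do not represent any particular database's API.

```python
# An in-memory dict stands in for a document collection keyed by post id.
posts = {}

# A post written before dislikes and tags existed.
posts["p1"] = {
    "image_url": "https://example.com/p1.jpg",
    "created_by": "user-1",
    "created_at": "2024-01-01T10:00:00Z",
    "likes": 10,
}

# A later post carries the new fields; "p1" does not need to be migrated.
posts["p2"] = {
    "image_url": "https://example.com/p2.jpg",
    "created_by": "user-2",
    "created_at": "2024-06-01T10:00:00Z",
    "likes": 3,
    "dislikes": 1,
    "tags": ["sunset", "beach"],
}


def summarize(post: dict) -> dict:
    """Read a post, supplying defaults for fields that older documents lack."""
    return {
        "created_by": post["created_by"],
        "likes": post.get("likes", 0),
        "dislikes": post.get("dislikes", 0),
        "tags": post.get("tags", []),
    }


for post_id, post in posts.items():
    print(post_id, summarize(post))
```

The trade-off is that the application code, rather than the database, now has to cope with documents that are missing newer fields.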
In class design, we come up with the low-level class entities, grouping objects that share the same responsibilities, relationships, operations, attributes, and semantics.
We list the classes, their attributes, their methods, and their relationships with other classes.
You are evaluated on whether you use the right object-oriented concepts, principles, and design patterns. A good system is designed with abstraction, encapsulation, inheritance, and polymorphism in mind.
It's also important to learn about the SOLID principles (The First 5 Principles of Object-Oriented Design). These principles will help you design software that is easier to maintain and extend as it grows.
- S - Single-responsibility Principle
- O - Open-closed Principle
- L - Liskov Substitution Principle
- I - Interface Segregation Principle
- D - Dependency Inversion Principle
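As a small illustration of two of these principles, here is a Python sketch in which a photo uploader depends on an abstraction rather than on a concrete storage class (Dependency Inversion), and each class has one job (Single Responsibility). The class names and the 5 MB size limit are assumptions made up for this example.

```python
from typing import Protocol


class PhotoStorage(Protocol):
    """Abstraction that the high-level code depends on."""

    def save(self, name: str, data: bytes) -> str: ...


class InMemoryStorage:
    """One concrete storage backend; its only job is keeping bytes."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}

    def save(self, name: str, data: bytes) -> str:
        self.files[name] = data
        return f"memory://{name}"


class PhotoUploader:
    """Validates uploads and delegates storage to whichever backend is injected."""

    MAX_BYTES = 5 * 1024 * 1024  # assumed limit for this sketch

    def __init__(self, storage: PhotoStorage) -> None:
        self.storage = storage

    def upload(self, name: str, data: bytes) -> str:
        if not data or len(data) > self.MAX_BYTES:
            raise ValueError("photo is empty or too large")
        return self.storage.save(name, data)


uploader = PhotoUploader(InMemoryStorage())
print(uploader.upload("cat.jpg", b"image-bytes"))  # prints memory://cat.jpg
```

Swapping in a different backend later means writing another class with a `save` method; `PhotoUploader` itself does not change, which is also the spirit of the Open-closed Principle.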
A cache is a high-speed, in-memory data storage layer to store data for faster access in the future. It's a means to improve the overall system performance. Caching is important in any system that needs to scale.
We cache the data that is frequently used. As a result, the application takes less time than it would if it read the data from a database every time.
In our Instagram example, we can cache photo metadata like created_by, location, and created_at, since these values never change for a given photo. Values like comments and likes change frequently, so they are poor candidates for caching unless we are prepared to invalidate or expire the cached copies.
The moment we start using cache in our design we also need to think about the eviction policies. Since caches are costlier than database storage we have to keep a check on the amount of data we store in a cache. So cache eviction becomes important.
There are many eviction policies like LRU (Least Recently Used), FIFO (First In First Out), LFU (Least Frequently Used), etc.
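To show what an eviction policy looks like in practice, here is a minimal LRU cache sketch in Python built on `collections.OrderedDict`. The `LRUCache` class and the `photo:N` keys are illustrative only; production systems typically rely on the eviction settings of a cache server rather than hand-written code like this.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items and evicts the least recently used one."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        # Reading a key makes it the most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # The first item in the ordering is the least recently used.
            self._items.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("photo:1", {"created_by": "user-1"})
cache.put("photo:2", {"created_by": "user-2"})
cache.get("photo:1")                            # photo:1 is now the most recently used
cache.put("photo:3", {"created_by": "user-3"})  # evicts photo:2
print(cache.get("photo:2"))                     # prints None
```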
Often, all the data in our application cannot be stored on a single database server. In such cases, we need strategies to store the data on multiple database servers.
Here we split the large set of data into multiple chunks (logical partitions) and store these chunks on multiple nodes.
This technique, known as sharding, is a good way to scale a growing application.
There are many techniques to partition data:
- Range-based partitioning
- Hash-based partitioning
- Directory-based partitioning
- Vertical partitioning
You should read about these in order to understand which partitioning schemes are suitable in which scenarios.
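As a starting point, here is a rough Python sketch of two of the schemes listed above, hash-based and range-based partitioning. The function names, the shard boundaries, and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
from bisect import bisect_right


def hash_partition(key: str, num_shards: int) -> int:
    """Hash-based: a stable hash of the key, modulo the number of shards."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Range-based: shard 0 holds keys starting before "h", shard 1 holds keys
# from "h" up to (but not including) "p", and shard 2 holds everything else.
RANGE_BOUNDARIES = ["h", "p"]


def range_partition(key: str) -> int:
    """Range-based: pick the shard whose range contains the key's first letter."""
    return bisect_right(RANGE_BOUNDARIES, key[:1].lower())


for username in ("alice", "henry", "zoe"):
    print(username, "hash shard:", hash_partition(username, 4),
          "range shard:", range_partition(username))
```

Hash-based partitioning tends to spread keys evenly but makes range queries expensive, while range-based partitioning keeps neighbouring keys together at the risk of hot shards. Also note that plain modulo hashing moves most keys when the number of shards changes, which is the problem consistent hashing is meant to reduce.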
This 8-step approach gives you a framework for any system design question. But remember that the interviewer may be interested in only one of these steps, in which case you should be able to focus on that step and work out the details.
Many candidates try to build a scalable system before they even have a design for a working system. Focus on designing a working system first and then think about scaling it.
Start with writing down the high-level components we need for the system we are building. Then drill down on any of the lower-level components the interviewer suggests.
This keeps your mind clear and helps you design a better system during the interview.
You should mention all the assumptions you're making during the interview. It's important that you and the interviewer are on the same page.
Keep both functional and non-functional requirements in mind when designing the system.
Sometimes it becomes difficult to support non-functional requirements like consistency, availability, data durability, etc.
Some systems need a core algorithm to be implemented for the system to work. For example, if you want to design a TinyURL-like system, you need to come up with an algorithm to generate the tiny URL for a given long URL.
Not many system design questions need a complex algorithm to be implemented.
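For the TinyURL case, one common approach (an assumption here, not the only option) is to give each long URL a numeric ID and encode that ID in base 62. The Python sketch below shows the idea; `UrlShortener` and its in-memory dictionary are hypothetical stand-ins for a real service and its database.

```python
import string
from typing import Optional

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols


def encode_base62(number: int) -> str:
    """Turn a non-negative integer ID into a short base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    number = 0
    for char in code:
        number = number * 62 + ALPHABET.index(char)
    return number


class UrlShortener:
    """Assigns each long URL the next counter value and stores the mapping."""

    def __init__(self) -> None:
        self._next_id = 1
        self._urls: dict[str, str] = {}

    def shorten(self, long_url: str) -> str:
        code = encode_base62(self._next_id)
        self._urls[code] = long_url
        self._next_id += 1
        return code

    def resolve(self, code: str) -> Optional[str]:
        return self._urls.get(code)


shortener = UrlShortener()
code = shortener.shorten("https://example.com/some/very/long/path")
print(code, "->", shortener.resolve(code))
```

A seven-character base-62 code can represent 62^7 (about 3.5 trillion) IDs. In a distributed setup the hard part becomes generating unique IDs across servers, which is a good topic to raise with the interviewer.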
The same system can be designed in completely different ways based on the scale needed. So, clarify the scale you're designing the system for at the beginning of the discussion.
Here are the best courses that I recommend for preparing for System Design interviews:
- Grokking the System Design Interview
- Grokking the Object Oriented Design Interview
- Tech Dummies Narendra L
- System Design by Gaurav Sen
- Low Level Design | The Code Mate
- Think of the System Design interview as an open-ended discussion with your colleague.
- You cannot possibly come up with the best possible solution for a system in the given 1–1.5 hours.
- Clarify all the requirements and set the scope for the discussion.
- Make sure you and the interviewer are on the same page throughout the discussion.
- Learn how to estimate the data capacity required in the next 5–10 years and design the system for the capacity expected.
- Mention all the assumptions you are making during the interview.
- Given a problem statement, try to think of all the high-level components (services, databases, caches, message queues, etc) first. Then figure out the lower-level details of the system.
- It is important to learn how to design classes, databases, and APIs.
- It is important to learn Object-Oriented and SOLID principles for low-level design interviews.
- The interviewer wants to know how you approach a given problem and the thought process behind it. As long as you are able to justify why a component is designed in a certain way, you're good.
- Practice solving system design problems. That's the only way to do well in this round.
The article was originally published on my blog.