Nrapesh Khamesra

System Design Interview Demystified

What distinguishes SDIs from other interviews?

Approaching systems design interviews strategically is crucial, just as it is with any other interview. Unlike coding interviews, systems design interviews (SDIs) typically do not require any actual coding. SDIs operate at a higher level of abstraction, involving the identification of requirements and the mapping of these onto computational components and high-level communication protocols that connect subsystems. The focus is not on the final answer, but rather on the process and the journey that a strong candidate takes the interviewer through.

What is the optimal approach for addressing a design question?

In a system design interview, design questions are often intentionally vague to reflect the open-ended nature of modern-day business. For example, an interviewer may ask how to design an application like WhatsApp, which has numerous features. However, it is not practical to include all features in the design due to limited interview time and the need to focus on core functionalities to demonstrate problem-solving skills.

Therefore, it is acceptable to inform the interviewer that not all features will be included in the design. If necessary, the plan of action can be adjusted accordingly. Best practices during a system design interview include:

  • Asking the right questions to clarify requirements.

  • Scoping the problem to attempt a solution within the limited time frame (usually 35 to 40 minutes).

  • Communicating with the interviewer to ensure they understand the thought process rather than silently working on the design.

Modern System Design: Bottom-Up Approach

In system design problems, there are often similarities in the underlying components, even if the specific details of each problem are unique. These similarities can be extracted and treated as basic building blocks. A few examples of such building blocks are databases, load balancers, and caches, which are commonly used across many design problems.

By separating these building blocks, we can discuss their design in depth and then reuse them across different design problems without repeating ourselves. Think of these building blocks as bricks that we can use to construct more effective and capable systems.

Moreover, many of these building blocks are available for actual use in public cloud platforms like Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP). We can leverage such constructs to build a system and further cement our understanding. Although we won't be constructing a system in these posts, interested learners can use this as an exercise.

Several of the fundamental building blocks are described below.

Domain Name System (DNS)

DNS is a hierarchical, decentralized naming system that maps domain names to IP addresses, allowing users to access websites using human-readable names instead of numeric addresses. It is essential to the functioning of the internet, since the resolved IP addresses are what clients use to locate and connect to web servers. The DNS system consists of several components, including DNS servers, zones, resolvers, and registrars. DNS servers store and manage domain name records, resolvers translate domain names into IP addresses on behalf of clients, zones are administrative boundaries that divide the DNS namespace, and registrars are the entities that manage domain name registration.
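
As a minimal sketch of the resolution step, the snippet below uses Python's standard library to resolve a hostname to its IP addresses through the operating system's configured resolver; the hostname example.com is purely illustrative.

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Resolve a hostname to its IP addresses via the OS resolver,
    which in turn queries the configured DNS servers."""
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address itself is the first element of sockaddr.
    return sorted({info[4][0] for info in results})

if __name__ == "__main__":
    print(resolve("example.com"))
```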

Load Balancers

Load balancers are devices or software systems that distribute incoming network traffic across multiple servers or resources. Their purpose is to improve the availability and scalability of applications, as well as enhance the performance of systems. Load balancers act as a mediator between the client and the servers, deciding which server should receive the next request based on preconfigured rules and algorithms.

Load balancers help prevent a single server from becoming overwhelmed with traffic and potentially failing, which can lead to downtime and disruption for users. By distributing traffic evenly across multiple servers, load balancers can improve the responsiveness and reliability of applications. Some load balancers can also perform advanced functions such as SSL termination and content caching.
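
To make the idea concrete, here is a toy round-robin balancer in Python. It only illustrates the selection logic; real load balancers also track server health, weights, and connection counts, and the server addresses shown are placeholders.

```python
import itertools

class RoundRobinBalancer:
    """Hands out backend servers in rotation so traffic is spread evenly."""

    def __init__(self, servers: list[str]):
        self._cycle = itertools.cycle(servers)

    def next_server(self) -> str:
        # Each call returns the next server in the rotation.
        return next(self._cycle)

balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
for _ in range(5):
    print(balancer.next_server())  # cycles through servers 1, 2, 3, 1, 2
```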

Content Delivery Network

A Content Delivery Network (CDN) is a distributed network of servers located across the globe, designed to efficiently deliver web content, such as images, videos, and HTML pages, to users. The primary function of a CDN is to reduce latency by serving content from the server that is geographically closest to the user, thereby minimizing the time it takes for the content to reach the user's device. By caching content on multiple servers, a CDN also helps reduce the load on the origin server and hence improves the overall performance and availability of the website. CDNs are widely used by content providers, e-commerce websites, and social media platforms to enhance the user experience and improve the speed and reliability of content delivery.
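
The sketch below models the two CDN ideas from this section, routing to the nearest edge and caching on a miss, using hypothetical regions and hard-coded "distances" in place of real latency measurements.

```python
# Hypothetical edge locations, each with its own cache; the distance table
# stands in for real network latency measurements.
EDGES = {"us-east": {}, "eu-west": {}, "ap-south": {}}
DISTANCE = {
    "new-york": {"us-east": 1, "eu-west": 6, "ap-south": 12},
    "mumbai":   {"us-east": 12, "eu-west": 7, "ap-south": 1},
}

def fetch_from_origin(path: str) -> str:
    return f"<content of {path} from origin>"

def serve(user_region: str, path: str) -> str:
    # Route to the closest edge, serve from its cache, and fall back to
    # the origin server only on a cache miss.
    edge = min(DISTANCE[user_region], key=DISTANCE[user_region].get)
    cache = EDGES[edge]
    if path not in cache:
        cache[path] = fetch_from_origin(path)
    return cache[path]

print(serve("mumbai", "/img/logo.png"))  # miss: filled from the origin
print(serve("mumbai", "/img/logo.png"))  # hit: served from the ap-south edge
```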

Databases

A database is an organized collection of data that can be accessed, managed, and updated quickly and easily. It provides a way to store and organize data so that it can be retrieved and manipulated as needed.
There are several types of databases, including:

  • Relational databases
    This type of database uses a structured format with tables, columns, and rows to organize data. Examples of relational databases include MySQL, PostgreSQL, and Microsoft SQL Server.

  • NoSQL databases
    This type of database is designed to handle unstructured and semi-structured data, such as JSON or XML documents. Examples of NoSQL databases include MongoDB, Cassandra, and Couchbase.

  • In-memory databases
    This type of database stores data in RAM instead of on disk, which allows for extremely fast access times. Examples of in-memory databases include Redis, Memcached, and Apache Ignite.

  • Graph databases
    This type of database is designed to handle highly connected data, such as social networks and recommendation engines. Examples of graph databases include Neo4j, Amazon Neptune, and Microsoft Azure Cosmos DB.
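
As a small, self-contained illustration of the relational model, the example below uses Python's built-in sqlite3 module as a stand-in for a server-based database such as MySQL or PostgreSQL; the users table and its rows are made up for demonstration.

```python
import sqlite3

# In-memory SQLite database: tables, columns, and rows, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users (name, email) VALUES (?, ?)",
             ("Alice", "alice@example.com"))
conn.commit()

for row in conn.execute("SELECT id, name, email FROM users WHERE name = ?", ("Alice",)):
    print(row)  # (1, 'Alice', 'alice@example.com')

conn.close()
```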

Blob Store

A blob store is used for storing large binary objects, also known as "blobs." Blobs are typically used to store unstructured data such as images, videos, audio files, and other multimedia data. A blob store allows for the efficient storage and retrieval of blobs, with the ability to scale horizontally to accommodate growing amounts of data. Blobs can be accessed and manipulated using APIs or other data access methods provided by the blob store. Popular blob store services include Amazon S3, Microsoft Azure Blob Storage, and Google Cloud Storage.
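
The class below is a toy, in-memory stand-in that mimics the typical put/get/delete shape of a blob store API; it is not the actual interface of S3, Azure Blob Storage, or GCS, which are accessed over HTTP through their own SDKs.

```python
class InMemoryBlobStore:
    """Toy blob store keyed by path-like names, holding raw bytes."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def delete(self, key: str) -> None:
        self._blobs.pop(key, None)

store = InMemoryBlobStore()
store.put("videos/intro.mp4", b"\x00\x01\x02\x03")  # bytes stand in for a real file
print(len(store.get("videos/intro.mp4")))           # 4
```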

Caching

In distributed systems, caching refers to the process of storing frequently accessed data in a location that is closer to the user, such as in memory or on a local disk, in order to reduce the time and resources required to retrieve the data from a remote source. Caching can significantly improve the performance and scalability of distributed systems by reducing the load on the network and the remote servers. Caching can be implemented at different layers of a distributed system, including the application layer, the middleware layer, and the data storage layer. However, caching also introduces additional complexity and challenges, such as consistency, eviction, and cache invalidation. Therefore, caching strategies should be carefully designed and tested to balance the benefits and the costs of caching.
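
One common pattern is cache-aside with a time-to-live (TTL) so stale entries are eventually refreshed. Here is a minimal sketch, with a slow lookup function standing in for a remote database call:

```python
import time

class TTLCache:
    """Minimal cache-aside helper: entries expire after ttl_seconds so stale
    data is eventually refetched from the slower backing store."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._entries: dict[str, tuple[float, object]] = {}

    def get_or_load(self, key: str, loader):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry and now - entry[0] < self.ttl:
            return entry[1]                      # cache hit
        value = loader(key)                      # cache miss: hit the remote source
        self._entries[key] = (now, value)
        return value

def slow_database_lookup(key: str) -> str:
    time.sleep(0.1)                              # stand-in for a network round trip
    return f"value-for-{key}"

cache = TTLCache(ttl_seconds=30)
print(cache.get_or_load("user:42", slow_database_lookup))  # slow: loads from the "database"
print(cache.get_or_load("user:42", slow_database_lookup))  # fast: served from memory
```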

Logging

Logging is a process of recording events, transactions, or messages from various components of a distributed system. In a distributed system, logging is crucial for identifying and diagnosing issues, analyzing performance, and auditing. It helps to keep track of what is happening within the system, especially when something goes wrong. Logging can include various types of information such as error messages, exceptions, warnings, debug information, and application-specific data. The log data can be stored locally or remotely and can be analyzed using various tools to gain insights into the system's behavior.
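
A minimal example using Python's standard logging module, with severity levels, timestamps, and a captured stack trace; the service name and order ID are illustrative. In a distributed system these records would typically be shipped to a central aggregator rather than printed locally.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("payment-service")

logger.info("charge started order_id=%s amount=%s", "ord-123", "19.99")
try:
    raise ConnectionError("payment gateway unreachable")
except ConnectionError:
    # logger.exception records the message at ERROR level along with the traceback
    logger.exception("charge failed order_id=%s", "ord-123")
```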

Monitoring

In distributed systems, monitoring systems play a vital role as they aid in analyzing the system and notifying stakeholders of any issues. These systems can provide early warning signals, enabling system administrators to address problems before they escalate into major issues.

Server-side monitoring involves monitoring the health and performance of the server or servers running the application or service. This includes metrics such as CPU usage, memory usage, disk I/O, network usage, and more. Server-side monitoring helps identify bottlenecks, performance issues, and potential failures before they impact end-users.
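
A rough sketch of server-side metric collection, assuming the third-party psutil package; the thresholds are arbitrary, and production systems would export these metrics to a monitoring stack (for example, Prometheus with an alert manager) instead of checking them in a script.

```python
import psutil  # third-party package: pip install psutil

# Illustrative alert thresholds, expressed as percentages.
THRESHOLDS = {"cpu_percent": 85.0, "memory_percent": 90.0, "disk_percent": 90.0}

def collect_metrics() -> dict[str, float]:
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),
        "memory_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
    }

def check(metrics: dict[str, float]) -> list[str]:
    # Return a human-readable warning for every metric over its threshold.
    return [
        f"{name} at {value:.1f}% exceeds {THRESHOLDS[name]:.1f}%"
        for name, value in metrics.items()
        if value > THRESHOLDS[name]
    ]

if __name__ == "__main__":
    for warning in check(collect_metrics()):
        print("ALERT:", warning)
```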

Client-side monitoring involves monitoring the behavior and performance of the application or service on the client side, typically in a web browser or mobile app. This includes metrics such as page load time, network latency, errors, and more. Client-side monitoring helps identify issues that may be specific to certain devices, browsers, or user locations, and provides insights into the end-user experience.

Additional building blocks in distributed systems include messaging queues, pub-sub systems, task scheduling systems, rate limiters, and more. Further posts will delve into these building blocks in greater detail.
