Technical product managers (TPMs) at big tech companies (FAANG) and startups are required to have a fundamental knowledge of System Design. Historically, System Design fundamentals were usually a requirement for software engineers during interviews, and TPMs were exempt from that expectation. However, the trend is now changing. As a TPM, you need to have a solid understanding of System Design in interviews and on the job as you lead a product team.
In this article, we'll discuss five of the most common System Design fundamentals that you need to know to succeed in technical product management, lead an engineering team, and roll out great products.
- What is System Design, and why should you care as a TPM?
- Where to go from here on System Design
System Design is the process of architecting a system so that all functional and non-functional requirements are met, including APIs, use cases, and integrations. Even if you're not directly responsible for the intricate details of this architecture as a TPM, you should understand the big picture: how the different system components support the goals of your organization and meet the requirements of your products.
TPMs should have a fundamental knowledge of System Design to do their jobs well.
Fahim ul Haq, the CEO of Educative, has worked on distributed systems at FAANG companies for eight years. Having interviewed TPMs extensively, he agrees.
“A TPM should understand all the basic concepts of how scalable systems work and how different parts of a distributed system interact with each other at an abstract level to properly guide development. They need to know the core concepts and building blocks of the systems they are building.” – Fahim ul Haq (former systems engineer, CEO of Educative)
A TPM needs to know System Design fundamentals to make informed design decisions for their products. For instance, if you were to design Facebook's photo storage system, you'd need to assign a unique ID to each image so that every uploaded photo can be identified unambiguously. If you know System Design, you'd know that you need a sequence generator to do this.
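To make the idea of a sequence generator concrete, here is a minimal Python sketch. It assumes a single process, and the `SequenceGenerator` class and `next_id` method are hypothetical names chosen for illustration. It is not how Facebook generates photo IDs; a real distributed system would also need to coordinate ID generation across many machines.

```python
import itertools
import threading


class SequenceGenerator:
    """Hands out unique, increasing integer IDs within one process (illustrative only)."""

    def __init__(self, start: int = 1) -> None:
        self._counter = itertools.count(start)
        self._lock = threading.Lock()  # keeps IDs unique when several threads ask at once

    def next_id(self) -> int:
        with self._lock:
            return next(self._counter)


photo_ids = SequenceGenerator()
first = photo_ids.next_id()   # ID for the first uploaded photo
second = photo_ids.next_id()  # ID for the second uploaded photo
```

The point for a TPM is the guarantee, not the code: no two photos ever receive the same ID, even under concurrent uploads.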
Therefore, as an effective TPM, you should aim to architect agile, scalable, reliable, maintainable, and robust systems that meet user requirements at any given time.
Next, we'll cover five fundamental concepts of System Design that are an absolute necessity for you, a TPM, to succeed in your job.
Load balancing is an integral part of System Design. It refers to distributing incoming requests across multiple servers to improve system performance and reliability.
With millions of requests per second, load balancers will evenly spread tasks between available resources to ensure traffic flows smoothly.
- Improving efficiency: Load balancers distribute traffic evenly among servers, which improves efficiency and cuts costs because no single server sits idle or overloaded.
- Availability of servers: If one or more servers break down, load balancers bypass them and keep the system available by distributing traffic among the servers that are still working.
- Scalability: When you add more servers, the load balancer starts routing traffic to them, so application capacity grows without downtime.
As a TPM, you will regularly face situations where your servers need to scale to meet user demand, or where a traffic surge causes failures. In these scenarios, a load balancer will come in handy.
Additionally, you must have the decision-making capacity to select a suitable load balancer algorithm for your development team depending on pricing, stakeholder commitment, and other variables.
A load balancer helps your system improve scalability, performance, and availability: server capacity can be changed while the system keeps running, failed servers are bypassed in favor of working ones, and load is distributed evenly across servers.
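The behavior described in this section can be sketched in a few lines of Python. This is a minimal illustration of one common algorithm, round robin, under simplifying assumptions: server health is set manually rather than detected by health checks, and the `RoundRobinBalancer` class and server names such as `app-1` are hypothetical.

```python
from typing import Dict, List


class RoundRobinBalancer:
    """Rotates through servers in order, skipping any marked as down (illustrative only)."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy: Dict[str, bool] = {s: True for s in self._servers}
        self._next = 0  # index of the server to try next

    def mark_down(self, server: str) -> None:
        if server in self._healthy:
            self._healthy[server] = False

    def add_server(self, server: str) -> None:
        # Scaling out: new servers join the rotation immediately.
        if server not in self._healthy:
            self._servers.append(server)
        self._healthy[server] = True

    def pick(self) -> str:
        # Try each server at most once, starting from the rotation pointer.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if self._healthy[server]:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")                  # simulate a failed server
picks = [balancer.pick() for _ in range(4)]  # app-2 is bypassed in every round
```

Round robin is only one option. Other algorithms weigh servers by capacity or by current connections, which is the kind of trade-off you may be asked to help decide.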
A key-value store is a storage system built on an associative array data model, such as a hash table or dictionary, in which every unique key maps to a value. Values can be anything, from unique IDs to blobs to server names.
It can be challenging to scale traditional storage systems in distributed environments while still maintaining strong consistency and high availability. Several top tech companies, including Facebook, Netflix, and Amazon, rely more on primary-key access data stores than on traditional online transaction processing (OLTP) databases. OLTP refers to the rapid, real-time execution of a large number of database transactions, typically over the internet.
- Scalability: They can continuously process increasingly large amounts of data without a significant drop in performance.
- Speed: Simple retrieval and usage commands like
- Flexibility: Scaling any large business model is easier because of the combined scalability and speed key-value stores offer.
When designing a system as a TPM, you should consider when and where you need to use a key-value store and why it may be the best choice at that given time. This model is beneficial for storing customer personalization data because of the scalability, speed, and flexibility that come with it.
For instance, spreading a dataset across multiple machines gives you more combined memory and processing power, which increases performance, and it also improves fault tolerance because a single machine failure does not take down the whole dataset. Companies like LinkedIn, Amazon, and MongoDB have used key-value stores to scale significantly over the last couple of years.
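As a rough illustration of how a key-value store spreads data across machines, the Python sketch below hashes each key to pick a shard. Everything here is an assumption made for the example: the `ShardedKeyValueStore` class, its `put`, `get`, and `delete` method names, and the use of in-memory dictionaries to stand in for separate servers. Production systems often use more sophisticated schemes, such as consistent hashing and replication.

```python
import hashlib
from typing import Any, Dict, List, Optional


class ShardedKeyValueStore:
    """In-memory key-value store split across shards by key hash (illustrative only)."""

    def __init__(self, num_shards: int = 4) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        # Each dictionary stands in for a separate machine.
        self._shards: List[Dict[str, Any]] = [{} for _ in range(num_shards)]

    def _shard_for(self, key: str) -> Dict[str, Any]:
        # A stable hash sends the same key to the same shard every time.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]

    def put(self, key: str, value: Any) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        return self._shard_for(key).get(key, default)

    def delete(self, key: str) -> bool:
        shard = self._shard_for(key)
        if key in shard:
            del shard[key]
            return True
        return False


store = ShardedKeyValueStore(num_shards=4)
store.put("user:42:theme", "dark")   # customer personalization data
theme = store.get("user:42:theme")
removed = store.delete("user:42:theme")
```

Because every operation touches exactly one shard, adding shards adds capacity, which is where the scalability and speed discussed above come from.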
A rate limiter ensures that a service responds only to a set number of requests. Anything beyond the predefined limits is throttled. For example, if an API for a service has been configured to handle only 200 requests per minute, any requests over that will be blocked.
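The 200-requests-per-minute example can be sketched with a fixed-window counter, one of several rate limiting algorithms. This is a minimal single-process illustration; the `FixedWindowRateLimiter` class and its `allow` method are hypothetical names, and the injectable `clock` parameter exists only so the example can simulate time passing.

```python
import time
from typing import Callable


class FixedWindowRateLimiter:
    """Allows up to `limit` requests per window and rejects the rest (illustrative only)."""

    def __init__(
        self,
        limit: int = 200,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._window_start = clock()
        self._count = 0

    def allow(self) -> bool:
        now = self._clock()
        # Start a fresh window once the current one has elapsed.
        if now - self._window_start >= self._window:
            self._window_start = now
            self._count = 0
        if self._count < self._limit:
            self._count += 1
            return True
        return False  # over the limit: the request is throttled


fake_now = [0.0]  # simulated clock so the example does not have to wait a minute
limiter = FixedWindowRateLimiter(limit=200, window_seconds=60, clock=lambda: fake_now[0])
results = [limiter.allow() for _ in range(201)]
accepted = sum(results)               # the 201st request in the window is rejected
fake_now[0] = 60.0                    # one minute later, a new window begins
allowed_after_reset = limiter.allow()
```

Fixed windows are simple but can admit bursts at window boundaries. Alternatives such as token bucket or sliding window smooth this out at the cost of more bookkeeping.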
- Cost efficiency: They help control operational costs, for instance, by preventing operational experiments from exceeding the set quota of server requests.
- Averting resource deprivation: Rate limiting prevents resource starvation, including many denial-of-service (DoS) incidents that are caused by software or configuration errors rather than deliberate attacks.
- Distributing data flow: Like load balancers, rate limiters ensure that systems are not overburdened with a large amount of data and help evenly spread the load among different servers when required.
In your role as a TPM, you want to ensure that your servers run optimally and that your databases are not slowed down by excessive requests. This is where an appropriate rate limiting algorithm can be applied.
Companies like Lyft make use of rate limiters to run their processes efficiently.
Content delivery networks (CDNs) are groups of geographically distributed servers that work together to deliver content quickly and efficiently over the internet. CDNs use caching to speed up the delivery of content across the web.
Content serviced by CDNs can be of several types, including website data, social media content, downloadable media, and so on.
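The caching mechanism at the heart of a CDN can be sketched as an edge server that keeps a copy of each item for a limited time and only contacts the origin server on a miss. The Python below is a simplified illustration, not a real CDN: the `EdgeCache` class, the `origin` function, and the five-minute time-to-live are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List, Tuple


class EdgeCache:
    """Caches origin responses for a fixed time-to-live (illustrative only)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}  # path -> (expiry time, body)

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]           # cache hit: served from the edge
        body = self._fetch(path)      # miss or expired: go back to the origin
        self._entries[path] = (now + self._ttl, body)
        return body


origin_requests: List[str] = []  # records every trip to the origin server


def origin(path: str) -> bytes:
    origin_requests.append(path)
    return f"content of {path}".encode("utf-8")


edge = EdgeCache(origin, ttl_seconds=300)
first_body = edge.get("/logo.png")   # fetched from the origin
second_body = edge.get("/logo.png")  # served from the edge cache
```

Only the first request reaches the origin. Every cached response after that saves origin bandwidth and, in a real CDN, is served from a location closer to the user.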
Several organizations use CDNs to accelerate the delivery of content via the internet. A bank, for instance, might use a CDN to transfer sensitive data securely.
- Improving efficiency: CDNs enhance web page load times while simultaneously cutting down bounce rates. This keeps a user on the page and prevents them from abandoning it.
- Enhancing security: By mitigating distributed denial-of-service (DDoS) attacks, CDNs play a massive role in boosting security.
- Cutting down on bandwidth costs: Because CDNs primarily rely on caching and other optimizations, they can significantly reduce server bandwidth, keeping hosting costs down for website administrators and owners.
If your organization is content-heavy, you, as a TPM, may find a CDN helpful in some instances. You will be able to reduce load times and latency, boost security, and cut bandwidth expenses, saving both time and money for the organization.
Traditional file systems come with many disadvantages, so databases are often preferred. A database is a collection of data organized in a way that is easily accessible, maintainable, manageable, and structured so that it can be updated and processed efficiently.
Databases come in two main types:
Relational databases are collections of datasets organized into tables made up of columns and records. Tables are related to one another through shared keys. Structured query language (SQL) is used to manipulate and retrieve information from these databases with commands like
Non-relational databases (NoSQL) typically store unstructured data in a different format from relational databases. NoSQL databases have several types, including graph, key-value, document, and wide-column.
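To see the difference between the two types, the Python sketch below stores the same information twice: once in related tables using the standard library's `sqlite3` module, and once as a single nested document of the kind a document-oriented NoSQL store might hold. The table names, column names, and sample data are hypothetical, and the plain dictionary merely stands in for a document database.

```python
import json
import sqlite3

# Relational: data lives in separate tables linked by a key (photos.user_id -> users.id).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE photos ("
    "id INTEGER PRIMARY KEY, "
    "user_id INTEGER NOT NULL REFERENCES users(id), "
    "caption TEXT)"
)
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Ada"))
conn.executemany(
    "INSERT INTO photos (user_id, caption) VALUES (?, ?)",
    [(1, "Sunset"), (1, "Harbor")],
)
# A join recombines the tables at query time.
rows = conn.execute(
    "SELECT users.name, photos.caption "
    "FROM users JOIN photos ON photos.user_id = users.id "
    "ORDER BY photos.id"
).fetchall()
conn.close()

# Document style: the same data is kept together as one nested record.
user_document = {
    "id": 1,
    "name": "Ada",
    "photos": [{"caption": "Sunset"}, {"caption": "Harbor"}],
}
serialized = json.dumps(user_document)  # documents are commonly stored as JSON-like text
```

The relational version enforces structure and lets you query across tables, while the document version keeps everything about one user in a single flexible record. Which trade-off fits depends on how your product reads and writes its data.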
- Data consistency: Databases reduce data redundancy and reflect changes as soon as they are made, which prevents inconsistent data.
- Data integrity: By ensuring all users are presented with correct and accurate information, data integrity is maintained.
- Data security: Several security features, including password protection and user authentication, help keep the data in a database secure.
Every organization works with databases in this digital era to scale their business and improve workflows and efficiency. Your role as a TPM may often require you to wear the hat of a data product manager too. Here, you may be required to oversee the entire lifecycle of how data is distributed and used within an organization. This is where a strong background in data science and working with databases will help you shine.
Databases have several advantages depending on the type you choose, including data integrity, data consistency, data security, data persistence, and ease of access, among others.
You can learn more about databases as a building block in System Design from this database design tutorial.
Did you have fun learning about the fundamentals of System Design? Whether you are just thinking about starting a career in technical product management or are already in the field, we hope this article helped you navigate the complexities of System Design as you build scalable software products.
But we have just scratched the surface of this topic. More System Design fundamentals not discussed in this article but absolutely necessary for you as a TPM to learn and master include:
- Domain Name System (DNS)
- Distributed Caching
- Publish-Subscribe Systems
- Sharded Counters
- Distributed Messaging Queues
- Distributed Task Scheduling
- Distributed Logging
To help you master System Design fundamentals in-depth with building blocks, don’t forget to check out Educative’s fully hands-on Grokking Modern System Design for Software Engineers & Managers course, which has a comprehensive overview of System Design.
What other System Design fundamentals does a TPM need to know? Was this article helpful? Let us know in the comments below!