I am currently reading one of the best books on the topic of System Design. I have always wanted to read this book, but for some reason I kept putting it aside. I recently picked it up again and have been committed to reading it for over a month now. The book is Designing Data-Intensive Applications by Martin Kleppmann.
I will take what I have learned about system scalability, from the book and elsewhere, and share it in plain English so that anyone can understand it.
One of the core aspects of building a system people will use and trust is designing our system with its scalability in mind. I will be using System and Application interchangeably throughout this article.
Before we go further, what is System/Application Scalability?
In simple terms, scalability describes a system’s ability to keep working, without crashing or slowing down noticeably, as the load on it increases.
For example, suppose we have an e-commerce website with an average of 10,000 concurrent users. What happens to our application during the festive period, when more people are shopping and up to 100,000 of them are on it simultaneously? This is what we mean by increased load.
How do you manage this load, so your system can still provide the needed services as normal, even though more people are using it now?
Will people have to wait longer for the system to respond? Will the server go down (crash) because the load placed on it is beyond what it can currently hold?
The author suggests we ask ourselves 2 important questions when thinking about system scalability.
“1. If the system grows in a particular way, what are our options for coping with the growth?” and...
“2. How can we add computing resources to handle the additional load?”
Based on the questions above, the author suggests that we don’t jump straight to solutions but first do what he calls describing the load and describing the performance.
Describing Load:
To understand how to scale a system, you first have to know the load it currently handles. Only then can you answer growth questions such as: what happens if our load doubles?
The load on a system can be described by how often its functions are used or how heavily its server resources are utilised. The author has a suggestion for choosing the measurements that fit our specific system.
“The best choice of parameters depends on the architecture of your system:
it may be requests per second to a web server, the ratio of reads to writes in a database, the number of simultaneously active users in a chat room application, the hit rate on a cache, or something else. Perhaps the average case is what matters for you.” — Martin Kleppmann
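To make the quote concrete, here is a minimal sketch of how a few of these numbers could be computed from a sample of traffic. The `Request` record, the `describe_load` function, and the sample data are all made up for illustration; a real system would pull these figures from its own logs or monitoring tools.

```python
from dataclasses import dataclass


@dataclass
class Request:
    timestamp: float  # seconds since the start of the sample window
    kind: str         # "read" or "write"
    cache_hit: bool   # True if the request was served from the cache


def describe_load(requests: list[Request], window_seconds: float) -> dict[str, float]:
    """Summarise a sample of traffic as a few load numbers."""
    total = len(requests)
    reads = sum(1 for r in requests if r.kind == "read")
    writes = sum(1 for r in requests if r.kind == "write")
    hits = sum(1 for r in requests if r.cache_hit)
    return {
        "requests_per_second": total / window_seconds,
        # No writes at all means the ratio is unbounded.
        "read_write_ratio": reads / writes if writes else float("inf"),
        "cache_hit_rate": hits / total if total else 0.0,
    }


# Hypothetical sample: 8 requests observed over a 2-second window.
sample = [
    Request(0.1, "read", True),
    Request(0.4, "read", False),
    Request(0.9, "write", False),
    Request(1.3, "read", True),
    Request(1.8, "read", True),
    Request(1.9, "write", False),
    Request(2.0, "read", False),
    Request(2.0, "read", True),
]

load = describe_load(sample, window_seconds=2.0)
print(load)
```

For this sample the system handles 4 requests per second, reads outnumber writes 3 to 1, and half of the requests are served from the cache. Which of these numbers matters most depends on the system.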
What he is explaining here is that we should find the measurements that matter for our own system. Some systems are read-heavy and others are write-heavy, and the scaling strategy will differ accordingly.
If your system is a heavy read system, adding a cache could reduce the number of times that a call is made to your database and in turn, reduce the time it takes to get a response.
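The caching idea can be sketched in a few lines. This is only an illustration: the `CachedStore` class is hypothetical, a plain dictionary stands in for the database, and a real cache would also need expiry and invalidation when data changes.

```python
from typing import Optional


class CachedStore:
    """Checks an in-memory cache first and only falls back to the database on a miss."""

    def __init__(self, database: dict[str, str]) -> None:
        self.database = database          # stand-in for a real database
        self.cache: dict[str, str] = {}   # stand-in for a cache such as Redis
        self.database_calls = 0           # counts how often we hit the database

    def get(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]        # cache hit: no database call
        self.database_calls += 1          # cache miss: go to the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value       # remember it for next time
        return value


store = CachedStore({"product:1": "Laptop", "product:2": "Phone"})

# 1,000 reads of the same product...
for _ in range(1000):
    store.get("product:1")

# ...but only the first one reached the database.
print(store.database_calls)  # prints 1
```

The more often the same data is read, the more database calls the cache saves, which is why it helps read-heavy systems in particular.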
With that in mind, let’s quickly look at the different ways we can scale an application. These cut across many aspects of system scaling.
Vertical Scaling: Also called scaling up, this means adding more computing power (CPU, RAM, etc.) to the machine we already have. It has real limitations: costs keep rising, and there is a physical limit to how powerful a single server can be.
Horizontal Scaling: Also known as scaling out, it involves adding more machines or computing resources to a system to handle increased demand. Unlike vertical scaling, where you upgrade the resources (CPU, RAM, etc.) of a single machine, horizontal scaling distributes the load across multiple machines. A load balancer sits in front of these machines and spreads incoming requests across them, which reduces the load on each machine and improves scalability and fault tolerance.
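One of the simplest ways a load balancer can spread requests is round robin, where each new request goes to the next machine in the list. The sketch below is a toy version of that idea; the `RoundRobinBalancer` class and the server names are assumptions for illustration, and real load balancers also handle health checks, failures, and uneven request costs.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in turn."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)  # endlessly repeats the server list

    def route(self, request_id: int) -> str:
        # The request content is ignored here; only the rotation matters.
        return next(self._servers)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])

# Send 9,000 requests and count how many each server received.
assignments = Counter(balancer.route(i) for i in range(9000))
print(assignments)
```

With three machines, each one ends up with 3,000 of the 9,000 requests instead of a single machine carrying all of them.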
Database Scaling: This is one of the most important aspects of scaling our system. I won’t be able to cover every aspect of it here; I will write a full article about it, because there are about five or six ways to scale a database alone.
Software Scaling: A system can also be scaled by refactoring inefficient code, for example by changing the data structures or algorithms used in the previous implementation.
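As a small illustration of how a data structure change can matter, the sketch below checks which orders come from known customers in two ways. The function names and sample data are made up. Both versions return the same result, but the first scans the whole customer list for every order, while the second looks each customer up in a set, which takes roughly constant time on average.

```python
def find_returning_customers_slow(orders: list[str], known_customers: list[str]) -> list[str]:
    # "in" on a list scans it item by item, so this does
    # roughly len(orders) * len(known_customers) comparisons.
    return [customer for customer in orders if customer in known_customers]


def find_returning_customers_fast(orders: list[str], known_customers: list[str]) -> list[str]:
    # Building the set once makes each "in" check a hash lookup,
    # so the total work grows with len(orders) + len(known_customers).
    known = set(known_customers)
    return [customer for customer in orders if customer in known]


orders = ["ada", "bob", "cy", "ada", "dee"]
known_customers = ["ada", "dee", "eve"]

slow = find_returning_customers_slow(orders, known_customers)
fast = find_returning_customers_fast(orders, known_customers)
print(slow == fast)  # prints True
```

On five orders the difference is invisible, but with millions of orders and customers the same change can be the difference between a request that responds quickly and one that times out.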
We can also change the design pattern used in developing the application.
Elastic Scaling: This type of scaling is handled by our cloud service provider. Most providers offer infrastructure (AWS Auto Scaling, in the case of AWS) where your system’s usage determines the resources assigned to it.
A quick example would be a live-streaming application. Let’s use a football streaming application as a case study. If the platform averages 1,000,000 concurrent users and the World Cup final is being played with Messi and CR7 on the pitch, the platform should expect more concurrent users than it has ever seen.
If its cloud infrastructure is designed to scale elastically, users will likely have a smooth experience because the cloud provider assigns the needed resources based on demand. After the event has ended, usage will go back to the normal 1,000,000 concurrent users and resources will be scaled back down to their usual level.
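The decision an elastic setup makes can be sketched as simple arithmetic. This is not how any particular cloud provider implements it, and it does not use a real cloud API: the `desired_instances` function, the 50,000 users per instance, and the minimum and maximum limits are all assumptions chosen to match the football example.

```python
import math


def desired_instances(
    concurrent_users: int,
    users_per_instance: int,
    min_instances: int,
    max_instances: int,
) -> int:
    """Return how many instances are needed, kept within the configured limits."""
    needed = math.ceil(concurrent_users / users_per_instance)
    return max(min_instances, min(needed, max_instances))


# Assumed capacity: one instance comfortably serves 50,000 concurrent users.
USERS_PER_INSTANCE = 50_000

normal_day = desired_instances(1_000_000, USERS_PER_INSTANCE, min_instances=2, max_instances=200)
final_day = desired_instances(5_000_000, USERS_PER_INSTANCE, min_instances=2, max_instances=200)
quiet_night = desired_instances(0, USERS_PER_INSTANCE, min_instances=2, max_instances=200)

print(normal_day)   # prints 20
print(final_day)    # prints 100
print(quiet_night)  # prints 2
```

On a normal day the platform runs 20 instances, during the final it grows to 100, and afterwards it shrinks again. The maximum limit protects the bill, and the minimum keeps the service up even when almost nobody is watching.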
A recent example would be the Mike Tyson vs. Jake Paul fight; you can look up the challenges Netflix faced on the day of the fight.
What I covered here is just an introduction to the topic. I suggest getting the book I mentioned or taking a complete course on System Design, because as you grow in your career these are the things you will be expected to know.
If you have any comments, suggestions, or questions kindly leave them in the comment section and I will attend to them.