Scalability is a term you have probably come across while designing and developing an application. Let me explain what it means with a simple example.
Suppose you run a small restaurant with only 4 staff members, able to handle up to 10 orders at a time. Over time, one particular dish becomes an overwhelming favorite and draws larger crowds than usual to your restaurant every evening. This results in long queues in the waiting area as well as outside the restaurant gate, and the staff are finding it difficult to cope.
As the owner, what should you do to tackle this situation?
Most likely, one or more of the following:
- hire more staff,
- increase the seating capacity,
- expand the kitchen capacity,
- possibly hire a manager to oversee all of this, and so on.
Basically, you will need to think about all the possible ways to scale your business to cater to this sudden surge of orders.
Similarly, if you maintain a website or an application, you will need to ensure that it scales to handle the incoming requests, depending on the traffic it receives.
Technical Definition
Scalability denotes the ability of a system, application, or process to handle a growing amount of work, or its potential to be expanded to handle it.
In software design, scalability means how well an application can adapt to an increasing workload, such as a surge in user count, higher transaction volumes, or larger data sets, without compromising its performance, reliability, or user experience.
Scaling Variations
There are two types of scaling:
- Vertical Scaling - Scaling Up/Down
- Horizontal Scaling - Scaling Out/In
Additionally, there is a concept called Diagonal Scaling, which is a hybrid of Vertical Scaling and Horizontal Scaling. I also discuss another variation called Auto Scaling, which cloud platforms use to scale applications.
1. Vertical Scaling (Scaling Up)
It means giving a single server additional resources by increasing its CPU power, memory (RAM), or storage capacity.
Once the traffic surge has been handled, reducing the added capacity to bring the server back to its normal size is termed Scale Down.
Pros:
- Easy to implement and manage, since there is only a single server.
Cons:
- Limited by the capacity of a single machine.
- May lead to a single point of failure.
- Expensive due to the cost of high-end hardware.
Use Case:
Ideal for applications with a smaller user base, or where scaling requirements are moderate.
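To make the "limited by the capacity of a single machine" point concrete, here is a minimal sketch in Python. The instance sizes and their request-per-second capacities are invented for illustration and do not correspond to any real cloud provider's catalog. The function picks the smallest machine that fits the load and returns `None` when no single machine is large enough.

```python
from typing import Optional

# Hypothetical instance sizes: (name, requests per second it can handle).
# These numbers are made up for illustration only.
INSTANCE_SIZES = [
    ("small", 100),
    ("medium", 250),
    ("large", 600),
    ("xlarge", 1200),
]


def pick_instance_size(required_rps: int) -> Optional[str]:
    """Return the smallest size that can handle the load, or None if none can."""
    for name, capacity_rps in INSTANCE_SIZES:
        if capacity_rps >= required_rps:
            return name
    # No single machine is big enough: vertical scaling has hit its ceiling.
    return None


print(pick_instance_size(80))    # small
print(pick_instance_size(500))   # large
print(pick_instance_size(5000))  # None
```

The last call is the interesting one: once the load exceeds the biggest machine available, scaling up is no longer an option.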
2. Horizontal Scaling (Scaling Out)
It involves adding more servers to your fleet of resources. Instead of upgrading your existing machines, you add more identical machines to handle the increased load. Once the traffic drops, reducing the number of active servers in your pool back to the previous count is termed Scale In.
Pros:
- Wider scalability scope, since you can keep adding servers as the load grows.
- Provides redundancy and high availability, as the failure of a single server doesn’t impact the entire system.
- Cost-effective for large-scale applications.
Cons:
- Implementation is complex, as it requires load balancing and distribution of data across servers.
- May add latency due to network communication between servers.
Use Case:
Ideal for applications with a large and growing user base, or those requiring high availability and redundancy.
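The load balancing mentioned above can be sketched with a simple round-robin rotation. This is only an illustration: the `RoundRobinBalancer` class and the server names are hypothetical, and real load balancers also handle health checks, weights, and connection state. The sketch shows how requests are spread across identical servers and how scaling out or in changes the pool.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in rotation so each gets a similar share of requests."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)
        self._rotation = cycle(self.servers)

    def add_server(self, server):
        """Scale out: add a server and restart the rotation."""
        self.servers.append(server)
        self._rotation = cycle(self.servers)

    def remove_server(self, server):
        """Scale in: remove a server and restart the rotation."""
        if len(self.servers) == 1:
            raise ValueError("cannot remove the last server")
        self.servers.remove(server)
        self._rotation = cycle(self.servers)

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2"])
print([balancer.next_server() for _ in range(4)])
# ['app-1', 'app-2', 'app-1', 'app-2']

balancer.add_server("app-3")  # scale out during a traffic surge
print([balancer.next_server() for _ in range(3)])
# ['app-1', 'app-2', 'app-3']
```

Round robin is the simplest choice here; it assumes every server is identical and every request costs about the same.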
3. Diagonal Scaling
Diagonal scaling is a hybrid approach that combines both vertical and horizontal scaling. Initially, you scale vertically by adding resources to a single server. When that server reaches its capacity, you scale horizontally by adding more servers.
Pros:
- A flexible and cost-effective strategy.
- Allows you to maximize the capacity of individual servers in your pool before moving to a more distributed architecture.
Cons:
- Switching from vertical to horizontal scaling requires careful planning.
Use Case:
Ideal for applications that start with moderate resource needs but are expected to grow significantly.
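The "scale up first, then scale out" order can be sketched as a small decision function. The `Cluster` type, the 16-CPU ceiling, and the doubling step are all assumptions made for this example, not a rule from any particular platform.

```python
from dataclasses import dataclass

MAX_CPUS_PER_SERVER = 16  # hypothetical ceiling for a single machine


@dataclass
class Cluster:
    servers: int
    cpus_per_server: int


def scale_diagonally(cluster: Cluster) -> Cluster:
    """Scale up while a single server can still grow, then scale out."""
    if cluster.cpus_per_server < MAX_CPUS_PER_SERVER:
        # Vertical step: double the CPUs, capped at the ceiling.
        new_cpus = min(cluster.cpus_per_server * 2, MAX_CPUS_PER_SERVER)
        return Cluster(cluster.servers, new_cpus)
    # Horizontal step: servers are maxed out, so add another one.
    return Cluster(cluster.servers + 1, cluster.cpus_per_server)


cluster = Cluster(servers=1, cpus_per_server=4)
for _ in range(4):
    cluster = scale_diagonally(cluster)
    print(cluster)
# Cluster(servers=1, cpus_per_server=8)
# Cluster(servers=1, cpus_per_server=16)
# Cluster(servers=2, cpus_per_server=16)
# Cluster(servers=3, cpus_per_server=16)
```

The first two steps are vertical; once the per-server ceiling is reached, every further step is horizontal.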
4. Auto Scaling
This strategy is mainly used in cloud environments, where resources are automatically scaled up or down based on real-time incoming traffic. For example, AWS Auto Scaling helps you scale applications hosted on the AWS platform.
Image Source: https://docs.aws.amazon.com/autoscaling/
Pros:
- Efficient resource management.
- Reduced costs by scaling down resources during low demand.
- Seamless user experience by handling traffic spikes automatically.
Cons:
- Requires monitoring and configuration to ensure it triggers at the right thresholds.
- Complex to set up and manage compared to manual scaling.
However, the managed services provided by cloud platforms can ease much of this setup and management effort.
Use Case:
Suitable for applications with highly variable traffic patterns, where demand can spike suddenly.
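The idea of triggering at the right thresholds can be sketched as a simple rule. This is not how AWS Auto Scaling is implemented and does not use its API. The function name, the 70% and 30% CPU thresholds, and the server limits are all hypothetical values chosen for this example.

```python
def desired_server_count(
    current_servers: int,
    avg_cpu_percent: float,
    scale_out_above: float = 70.0,  # hypothetical upper threshold
    scale_in_below: float = 30.0,   # hypothetical lower threshold
    min_servers: int = 1,
    max_servers: int = 10,
) -> int:
    """Decide how many servers to run based on average CPU usage."""
    if avg_cpu_percent > scale_out_above:
        target = current_servers + 1  # busy: add a server
    elif avg_cpu_percent < scale_in_below:
        target = current_servers - 1  # idle: remove a server
    else:
        target = current_servers      # within range: do nothing
    # Never go below the minimum or above the maximum pool size.
    return max(min_servers, min(target, max_servers))


print(desired_server_count(2, 85.0))   # 3
print(desired_server_count(3, 20.0))   # 2
print(desired_server_count(4, 50.0))   # 4
print(desired_server_count(1, 10.0))   # 1
print(desired_server_count(10, 95.0))  # 10
```

The gap between the two thresholds matters: if they sit too close together, the pool can flip between scaling out and scaling in on every check.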
Summary
In this blog, we learned about scalability and its importance. We also looked at the different types of scaling, their advantages and disadvantages, and their use cases.
This post is not endorsed by any organization; all opinions are my own.