Scaling and Load Balancers

Scaling

The process of increasing the capacity of a system or application so that it can handle more traffic or data.
This is done either by adding more resources, such as servers or storage, or by optimizing the existing resources.

Why do we need scaling?

As traffic or data grows, a system can become overloaded, leading to slow performance, downtime, or even crashes.
Scaling can improve performance by distributing the workload across multiple servers.
Faster response times and reduced downtime result in a better user experience.
The system becomes more future-proof: as traffic or workload increases, a well-scaled system can continue to meet the demands of customers and users efficiently.

Types of Scaling

Horizontal Scaling:

Increasing or decreasing the number of nodes in a system.

In other words, adding or removing machines to handle the workload.

Also known as scaling out (adding nodes) or scaling in (removing nodes).

Vertical Scaling:

Increasing or decreasing the power of the existing system.

For example, adding or removing CPU, memory, or storage on the existing machine.

Also known as scaling up (adding capacity) or scaling down (removing capacity).

Horizontal vs Vertical

	Horizontal Scaling	Vertical Scaling
Workload Distribution	Workload is distributed across multiple machines. Every machine has some part of data.	A single machine handles the entire workload.
Concurrency	Distributes multiple jobs across multiple machines. Reduces the workload on each machine.	Relies on multi-threading on the existing machine to handle multiple requests.
Complexity and maintenance	Higher, since one would need to maintain multiple machines.	Lower
Load Balancing	Necessary to actively distribute workload across the multiple nodes	Not required in the single node
Failure resilience	High, because other machines can take over when one fails	Low, since the single machine is a single point of failure
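Communication	Network calls (remote procedure calls), which are slow	Inter-process communication, which is fast
Data consistency	Harder to maintain; copies on different machines can become inconsistent	Easier to maintain; all data lives on one machine
Limitation	Few hard limits; machines can be added as demand grows	Bounded by the hardware limits of a single machine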

Which should you choose?

Both horizontal and vertical scaling have their own benefits and limitations. Since there isn’t a one-size-fits-all solution for organizations, you need to scale according to your needs and resources.

Cost - The initial hardware costs of horizontal scaling are higher. If you are working on a tight budget and need to add resources to your infrastructure quickly and cheaply, vertical scaling may be the better option.
Future-proofing - Adding newer machines through horizontal scaling raises the overall performance ceiling of your system. There is a limit to how far a single node can be scaled vertically, and it may not be able to handle future demand.
Reliability - Horizontal scaling may give you a more reliable system. It increases redundancy and ensures that you are not relying on a single machine. If one machine fails, another may be able to pick up the slack temporarily.
Performance and complexity - Performance depends on how your services work and how they are interconnected. Simple, straightforward applications won't benefit much from being run on multiple machines; distributing them may even degrade performance. Sometimes it is better to leave the application as is and upgrade the hardware to meet demand.
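The reliability point above can be made concrete with a small sketch. The function below is purely illustrative: the name capacity_after_failures and the capacity numbers are assumptions, not measurements from a real system. It computes what fraction of total capacity survives when some machines go down.

```python
def capacity_after_failures(node_capacities, failed_nodes):
    """Return the fraction of total capacity left after some nodes fail.

    node_capacities: capacity of each machine (any consistent unit,
                     e.g. requests per second); hypothetical values.
    failed_nodes:    set of indexes into node_capacities that are down.
    """
    total = sum(node_capacities)
    if total <= 0:
        raise ValueError("total capacity must be positive")
    surviving = sum(
        cap for index, cap in enumerate(node_capacities)
        if index not in failed_nodes
    )
    return surviving / total


# Horizontally scaled: four small machines, one fails.
horizontal = capacity_after_failures([100, 100, 100, 100], {0})

# Vertically scaled: one large machine, and it fails.
vertical = capacity_after_failures([400], {0})
```

With these assumed numbers, the four-machine setup keeps three quarters of its capacity after one failure, while the single large machine keeps none.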

How do I investigate whether scaling is needed?

Analyse the current system: Understand its limitations and its potential for scaling.
Review the system architecture, monitor system performance, and identify any bottlenecks or other issues present in the system.
Determine scaling requirements: Analysing expected traffic volume, peak usage times, and user behaviour helps determine how many additional resources or servers the system needs.
Evaluate scaling solutions: Consider both vertical and horizontal scaling.
Plan and implement a scaling solution: Once a solution has been chosen, either add more resources or servers to the existing infrastructure, or optimize the system for better performance.
Test and monitor the system: Verify that it performs as expected, that there are no bottlenecks, and that the system remains operable under high traffic.
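The "determine scaling requirements" step can be sketched as a back-of-the-envelope calculation. This is a minimal sketch under stated assumptions: servers_needed, peak_rps, per_server_rps, and target_utilization are hypothetical names, and real numbers would have to come from your own monitoring and load tests.

```python
import math


def servers_needed(peak_rps, per_server_rps, target_utilization=0.5):
    """Estimate how many servers are needed to serve peak traffic.

    peak_rps:           expected peak requests per second (assumed input).
    per_server_rps:     requests per second one server can sustain
                        (assumed, e.g. taken from a load test).
    target_utilization: fraction of each server's capacity we are
                        willing to use, leaving the rest as headroom.
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_rps = per_server_rps * target_utilization
    # Always keep at least one server, even for very low traffic.
    return max(1, math.ceil(peak_rps / usable_rps))


estimate = servers_needed(peak_rps=1000, per_server_rps=300)
```

Here each server is planned to run at half of its measured capacity, so a peak of 1000 requests per second against 150 usable requests per second per server gives an estimate of 7 servers. If the estimate is 1 or 2, vertical scaling may be enough; a large estimate points towards horizontal scaling.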

Load Balancers

A load balancer is an essential component of modern web architecture that helps distribute traffic evenly across multiple servers. Load balancers are often used to improve the availability and performance of applications by spreading out the workload and reducing the risk of server overload.
They help improve the efficiency of servers by optimizing traffic distribution, monitoring server health, and reducing downtime.
A load balancer can be a physical device or a virtualized instance running as a software process.
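To illustrate traffic distribution and health monitoring, here is a minimal sketch of a round-robin balancer. Round-robin is only one possible strategy and is chosen here as an assumption for illustration; the class name RoundRobinBalancer and its methods are hypothetical and do not correspond to any particular product's API.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation, skipping ones marked unhealthy."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._healthy = set(self._servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        """Record that a health check for this server failed."""
        self._healthy.discard(server)

    def mark_up(self, server):
        """Record that a known server is healthy again."""
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        # Try each server at most once per call.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [balancer.pick() for _ in range(4)]

balancer.mark_down("app-2")  # simulate a failed health check
after_failure = [balancer.pick() for _ in range(3)]
```

In this sketch, requests cycle through all three servers until app-2 is marked down, after which only app-1 and app-3 receive traffic. A real load balancer would run the health checks itself instead of relying on mark_down being called by hand.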

Investigating whether a load balancer is required or not

Analyse traffic patterns: Monitor web server logs or use a traffic monitoring tool to identify patterns in traffic volume or usage for the project.
Identify bottlenecks: Bottlenecks can limit the performance of the system. Review system logs or use performance monitoring tools to identify any issues or inefficiencies.
Consider high availability: If a project must be available to meet client demand at all times, a load balancer is a must. Distributing traffic across multiple servers keeps the service reachable even when one server fails.
Determine scalability requirements: If the project is expected to grow in traffic volume or usage over time, a load balancer can ensure that the system handles the increased load by distributing traffic across multiple servers.
Evaluate load balancing solutions: Consider both hardware and software load balancing solutions, according to the requirements.
Plan and implement a load balancing solution: Configure the load balancer and optimize the system for better performance.
Test and monitor the system: Test and monitor the system to ensure that it's performing as expected. This may involve load testing to simulate high traffic or usage times and monitoring the system for any issues or bottlenecks.
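The "analyse traffic patterns" step can be sketched as a count of requests per hour. The sketch assumes the request timestamps have already been extracted from the web server logs as ISO 8601 strings; the function names and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour(timestamps):
    """Count requests in each clock hour.

    timestamps: iterable of ISO 8601 strings such as
                "2024-01-01T10:15:00" (assumed log format).
    """
    counts = Counter()
    for ts in timestamps:
        moment = datetime.fromisoformat(ts)
        # Truncate to the start of the hour so requests group together.
        hour = moment.replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return counts


def peak_hour(timestamps):
    """Return (hour, request_count) for the busiest hour."""
    counts = requests_per_hour(timestamps)
    if not counts:
        raise ValueError("no timestamps given")
    return max(counts.items(), key=lambda item: item[1])


# Hypothetical sample extracted from a log file.
sample = [
    "2024-01-01T10:15:00",
    "2024-01-01T10:45:00",
    "2024-01-01T11:05:00",
]
busiest = peak_hour(sample)
```

For the sample above, the busiest hour is the one starting at 10:00 with 2 requests. On real logs, a large gap between the peak hour and the average hour is one sign that traffic is bursty enough for a load balancer to be worth investigating.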