There is a common myth amongst businesses that performance testing, and eventually scaling up systems according to need, is all about enhancing the physical parameters of the infrastructure, without being cognisant of many other factors. Trust me, that's not the case. Many factors come into play when applying an architectural scale-up strategy to any system in today's day and age. Merely scaling up physical infrastructure may give you a sense of security for a while, but in the long run it is not sustainable unless we consider a lot of other factors that are easily missed otherwise.
For the past few years I have been closely involved in performance-enhancement work on different projects with different scopes and varied complexity and scale. Here is a quick wrap-up of some key considerations that I feel might be helpful when planning to scale up systems.
Anticipation of the expected load
Something that businesses normally tend to overlook is that there are a lot of limitations in terms of resources, cost, etc., due to which one cannot optimise a system to bear the load of the whole universe arriving at the same point in time. There are a lot of factors we need to consider before carrying out the activity, and anticipating the approximate load is of prime importance. This doesn't mean the business has to give an accurate figure, but the tech team does need an approximate load that is expected for system usage.
This helps in scaling up the system while keeping in mind the cost that will be incurred and the ROI, which can be crucial for the business.
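To make this concrete, an approximate load figure can be turned into a peak requests-per-second target with simple arithmetic. The sketch below uses entirely hypothetical numbers (users, requests per user, peak-hour share); substitute your own business figures.

```python
# Back-of-envelope peak-load estimate. All figures below are
# hypothetical placeholders, not real business numbers.

def peak_rps(daily_active_users: int, requests_per_user: int,
             peak_hour_share: float) -> float:
    """Rough requests-per-second during the busiest hour.

    peak_hour_share is the fraction of daily traffic that lands
    in the single busiest hour (e.g. 0.2 means 20%).
    """
    daily_requests = daily_active_users * requests_per_user
    peak_hour_requests = daily_requests * peak_hour_share
    return peak_hour_requests / 3600  # seconds in an hour

# Example: 50,000 users, 40 requests each, 20% of traffic at peak
print(round(peak_rps(50_000, 40, 0.2), 1))  # 111.1 rps
```

Even a rough target like this lets the team size the test load and argue about infrastructure cost with actual numbers instead of gut feel.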
Tool to fire load organically on the system
After anticipating the approximate load expected on the system, one needs to get hold of a tool to mimic organic load on the system. There are a lot of tools available to carry out load testing; my personal favourite is JMeter (which, of course, is open source).
Gradually increasing test load on the system
I know jumping to conclusions is an impulsive human trait, but this is where we should hold our horses. We should follow a gradual process: start with an initial load and then increase it while monitoring how the system responds under different load conditions. This is where the patience of the team is tested, because as we start increasing the load things will start to fall apart and the system will begin behaving in unexpected ways, and this is where the next points will come in handy.
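The stepped ramp-up described above can be sketched in a few lines. In a real test a tool like JMeter would drive this via its thread-group ramp-up settings; the sketch below stubs out the request itself (the `fire_request` function is a stand-in, not a real HTTP call) just to show the gradual-increase loop.

```python
# Minimal sketch of a stepped load ramp-up. `fire_request` is a
# stub; replace it with a real call against the system under test.
import time
from concurrent.futures import ThreadPoolExecutor

def fire_request() -> float:
    """Stand-in for one request; returns latency in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder for network + server time
    return time.perf_counter() - start

def ramp_up(start_users: int, step: int, max_users: int,
            requests_per_step: int):
    """Raise concurrency level by level, recording average latency."""
    results = []
    users = start_users
    while users <= max_users:
        with ThreadPoolExecutor(max_workers=users) as pool:
            latencies = list(pool.map(lambda _: fire_request(),
                                      range(requests_per_step)))
        avg = sum(latencies) / len(latencies)
        results.append((users, avg))
        print(f"{users:>4} users: avg latency {avg * 1000:.1f} ms")
        users += step
    return results

levels = ramp_up(start_users=5, step=5, max_users=20,
                 requests_per_step=50)
```

The point of keeping results per level is that the interesting signal is the shape of the curve: the load at which average latency stops growing linearly is where the system starts to buckle.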
Performance monitoring tool
One important tool that we require here is something that gives insight into the current system resources and how the system is managing parameters like (just to name a few):
a. CPU utilisation
b. Memory utilisation
c. Number of concurrent requests being processed in the system
d. DB connections
A lot of COTS solutions are available for this purpose that can be utilised for monitoring; again, my personal favourite is New Relic.
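Even before an APM like New Relic is wired in, a coarse version of the first two metrics above can be sampled from the standard library. This is a Unix-only sketch (it relies on `os.getloadavg` and the `resource` module, which are not available on Windows) and is no substitute for a proper monitoring tool.

```python
# Unix-only snapshot of CPU load and this process's memory use,
# using only the standard library. A real setup would use an APM.
import os
import resource

def snapshot() -> dict:
    """Return a coarse view of system load and peak memory usage."""
    load1, load5, load15 = os.getloadavg()        # run-queue averages
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "load_1min": load1,
        "load_5min": load5,
        "peak_rss_kb": usage.ru_maxrss,  # peak resident set (KB on Linux)
    }

print(snapshot())
```

Sampling this in a loop while the load test runs gives a crude but honest picture of whether the box is CPU-bound or memory-bound at each load level.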
Error monitoring tool
Of course, no one wants to strain their eyes staring at logs. For this reason it is of prime importance to invest in a tool that monitors the exceptions and error conditions arising in the system. This is also something that will come in handy not just during load testing but during normal business operations as well, since it allows issues to be tracked and regressed.
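Under the hood, error monitors do something like the following: intercept exceptions, tag them by type, and keep a tally so spikes are visible at a glance. This is an illustrative stdlib sketch, not how any particular product works.

```python
# Illustrative sketch of what an error-monitoring tool does:
# capture exceptions, log them with context, and keep a tally.
import logging
import traceback
from collections import Counter

logging.basicConfig(level=logging.ERROR)
error_counts: Counter = Counter()

def track_errors(func):
    """Decorator that records any exception before re-raising it."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error_counts[type(exc).__name__] += 1
            logging.error("error in %s: %s\n%s", func.__name__, exc,
                          traceback.format_exc())
            raise
    return wrapper

@track_errors
def flaky(n: int) -> int:
    return 10 // n  # raises ZeroDivisionError when n == 0

try:
    flaky(0)
except ZeroDivisionError:
    pass
print(error_counts)  # Counter({'ZeroDivisionError': 1})
```

During a load test, watching `error_counts` climb per exception type tells you which failure mode appears first as the pressure rises.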
Two-level optimisation
Optimisation is a two-step process, and this is normally where the tech team tends to make a mistake and jump straight to step 2, maybe to accelerate the process or sometimes because of stringent timelines. This is where we need to calm ourselves and make the sage decision to invest some time and divide the optimisation process into two levels.
Code Level Optimisation & refactoring
Code-level optimisation should be the first level we start with, as this is where most of the gains are achieved. At this step the goal should be to identify the most heavily used parts of the system, i.e. any piece of code or API that is executed frequently during operations, and optimise it so that the output is an efficient piece of code. This may require code refactoring, or perhaps alterations in the usage of external plugins if there are any.
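One common shape of such a fix is caching a hot, expensive call instead of recomputing it on every request. The example below is hypothetical (`price_lookup_slow` is a stand-in for a DB query or external API call), but the pattern is the same one applied to real hot paths.

```python
# Hypothetical code-level optimisation: memoise a hot, expensive
# call. price_lookup_slow stands in for a DB or external API call.
import time
from functools import lru_cache

def price_lookup_slow(sku: str) -> float:
    """Pretend-expensive lookup: sleeps to simulate I/O."""
    time.sleep(0.05)
    return len(sku) * 1.5

@lru_cache(maxsize=1024)
def price_lookup_cached(sku: str) -> float:
    return price_lookup_slow(sku)

start = time.perf_counter()
for _ in range(100):
    price_lookup_cached("SKU-123")  # only the first call pays the cost
elapsed = time.perf_counter() - start
print(f"100 lookups in {elapsed:.3f}s")
```

Note the trade-off: caching is only safe where the value is stable for the cache's lifetime, which is exactly the kind of judgement call this step is about.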
Infrastructure Level Optimisations
After completely optimising and building a robust, efficient code base, now is the time to move on to tweaking infrastructure-level parameters. This involves adjusting the number of clusters, the number of CPUs, or the amount of memory required. It also includes tuning the DB connections, the threads per CPU, or the number of concurrent requests the system should support.
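These knobs usually start from rules of thumb rather than exact science. The helpers below encode two common heuristics (roughly 2 × cores + 1 workers for process-per-request servers, and a DB pool capped at what the database allows); they are starting points to validate against your own load tests, not universal truths.

```python
# Back-of-envelope infrastructure sizing. The heuristics here are
# common rules of thumb (assumptions), not guarantees; validate
# the results against real load tests.
import os

def suggested_workers(cores: int) -> int:
    """Rough worker count for a process-per-request server."""
    return cores * 2 + 1

def suggested_db_pool(workers: int, conns_per_worker: int,
                      db_max_connections: int) -> int:
    """Cap the total pool so it never exceeds the DB's own limit."""
    return min(workers * conns_per_worker, db_max_connections)

cores = os.cpu_count() or 1
workers = suggested_workers(cores)
print(workers, suggested_db_pool(workers, conns_per_worker=5,
                                 db_max_connections=100))
```

The cap in `suggested_db_pool` matters in practice: scaling out workers without it is a classic way to exhaust the database's connection limit and take the whole system down under load.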
On-demand Cloud Infrastructure (Optional)
The last thing, which I personally recommend highly in these times when business is all about managing cost efficiently and effectively, is to use on-demand cloud infrastructure. This really helps in building an efficient infrastructure that is cost effective at the same time.
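The cost saving comes from scaling capacity with demand instead of paying for the peak all the time. A toy version of the decision an autoscaler makes is sketched below; the thresholds and limits are illustrative assumptions, and real cloud autoscalers add cooldowns and smoothing on top of this.

```python
# Toy sketch of an autoscaler's scale-out/scale-in decision.
# Thresholds and replica limits are illustrative assumptions.
def desired_replicas(current: int, cpu_percent: float,
                     scale_out_at: float = 70.0,
                     scale_in_at: float = 30.0,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Add a replica under pressure, remove one when idle."""
    if cpu_percent > scale_out_at:
        return min(current + 1, max_replicas)
    if cpu_percent < scale_in_at:
        return max(current - 1, min_replicas)
    return current

print(desired_replicas(3, 85.0))  # 4: busy, scale out
print(desired_replicas(3, 20.0))  # 2: idle, scale in
print(desired_replicas(3, 50.0))  # 3: steady state
```

The dead band between the two thresholds is deliberate: without it the system would flap between scaling out and scaling in around a single cut-off.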