In a previous article about EC2 auto scaling, I failed to talk about instance lifecycle hooks and how AWS practitioners can use them to optimize their infrastructure. This article is my way of showing you that I have learned from that mistake.
A little recap of what auto scaling is: It's a mechanism that automatically (as the "auto" in auto scaling suggests) increases or decreases the amount of IT resources you run, based on predefined thresholds and metrics. In the context of AWS, there is EC2 auto scaling and a service called AWS Auto Scaling, which is used for scaling ECS, DynamoDB, and Aurora resources. However, the focus of this article is on EC2 auto scaling and how to effectively leverage lifecycle hooks during scaling.
Before I move on, let me give you a real-world example of why auto scaling is important, so that you continue reading this article with an increased level of attention.
Imagine a popular social media app. Every Sunday evening, after a weekend filled with adventures, users rush to the app to upload and share their photos. Without auto scaling, the app's servers would be overwhelmed during this rush, causing slow loading times or even crashes. However, with auto scaling in place, the app can automatically scale up by launching additional EC2 instances to handle the increased traffic. This ensures a smooth user experience even during peak times, leading to greater customer satisfaction and retention. But auto scaling doesn't stop there. Once the Sunday rush subsides, auto scaling can intelligently scale back in, terminating unused instances. This frees up valuable resources and reduces costs. This automatic provisioning and de-provisioning not only saves money, but also frees up the IT professionals who would otherwise be manually managing server capacity (a very tedious task).
Now that you are sold on the importance of auto scaling, let's move on to the other parts of this article.
Lifecycle Hooks
Any frontend developer who has used a library like React.js already has an understanding of what a lifecycle hook is. The concept is similar in the context of EC2 instances on AWS. Lifecycle hooks give you the ability to perform custom actions on instances in an Auto Scaling group from the time they are launched through to their termination. They provide a specified amount of time (one hour by default) to wait for the action to complete before the instance transitions to the next state. Let's talk about the different stages in the lifecycle of an EC2 instance during scaling.
When an EC2 instance is launched during a scale-out event, it enters the pending state, which allows time for the instance to run any bootstrapping scripts specified in the user data section of the Auto Scaling group's launch configuration or launch template. Once all of this is complete, the instance goes straight into service, that is, the running state. On the flip side, when an instance is being removed from an Auto Scaling group during a scale-in event, or because it has failed health checks, it moves to the terminating (shutting-down) state until it finally enters the terminated state.

Even though this looks like a pretty robust setup, it can cause problems. For example, when an instance is launched, its user data script has finished running, and it enters the in-service (running) state, that doesn't necessarily mean the application served by the instance is ready to receive and process requests; it might still need more time to perform tasks such as processing configuration files, loading custom resources, or connecting to backend databases, among others. While all of this is still completing, the instance might already be receiving health check requests from a load balancer. What do you think the result of those health checks will be? You are right if your answer is that they will likely fail because the application is still loading. How, then, do we inform an Auto Scaling group that a newly launched instance is not ready to receive requests yet and needs more time? We will come back to this question in a minute.
There is another pertinent problem. During a scale-in event, an instance scheduled for termination may still be in the middle of processing requests and may hold important logs needed for troubleshooting issues later. If the instance is terminated abruptly, both the in-progress requests and the logs are lost. How do you tell your Auto Scaling group to delay the termination until the instance has finished processing pending requests and its important log files have been copied to a permanent storage service like Amazon S3? The answer to this question, and to the one asked a few sentences ago, is, as you might have guessed, lifecycle hooks.
Using an instance launching lifecycle hook, you can prevent an instance from moving from the pending state straight into service: the hook first moves it into the pending:wait state so the application on the instance can finish loading and become ready to process requests. When the wait period ends, either because you complete the lifecycle action or the timeout expires, the instance moves to the pending:proceed state, and the Auto Scaling group then puts it in service (the running state).
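To make this concrete, here is a minimal sketch using boto3, the AWS SDK for Python. The group name web-asg, the hook name wait-for-app-ready, and the instance ID are hypothetical placeholders, not values from this article. The first call registers a launch lifecycle hook on the Auto Scaling group; the second is what the instance (or an automation workflow watching it) would run once the application has actually finished loading, which releases the instance from pending:wait.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Register a launch lifecycle hook: newly launched instances pause in the
# pending:wait state until the hook is completed or the timeout expires.
autoscaling.put_lifecycle_hook(
    LifecycleHookName="wait-for-app-ready",      # hypothetical hook name
    AutoScalingGroupName="web-asg",              # hypothetical Auto Scaling group
    LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
    HeartbeatTimeout=600,                        # wait up to 10 minutes
    DefaultResult="ABANDON",                     # never put an unready instance in service
)

# Later, once the application reports that it is ready to serve traffic,
# complete the lifecycle action so the instance moves to pending:proceed.
autoscaling.complete_lifecycle_action(
    LifecycleHookName="wait-for-app-ready",
    AutoScalingGroupName="web-asg",
    InstanceId="i-0123456789abcdef0",            # hypothetical instance ID
    LifecycleActionResult="CONTINUE",
)
```

With DefaultResult set to ABANDON, an instance whose application never signals readiness is terminated (and replaced by the group) rather than being put in service; choose CONTINUE if you would rather have it go into service anyway when the timeout expires.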
In a similar manner, you can use a lifecycle hook on the flip side of things, that is, when an instance is targeted for termination. An instance terminating lifecycle hook puts your instance in the terminating:wait state, during which you can perform your final cleanup tasks, such as preserving copies of logs by moving them to S3. Once you're done, or once a preset timer (one hour by default) expires, the instance moves to the terminating:proceed state, and the Auto Scaling group takes over and proceeds to terminate the instance.
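The termination hook itself is registered the same way as the launch example, just with the autoscaling:EC2_INSTANCE_TERMINATING transition. Below is a hedged sketch of what might run on (or on behalf of) the instance during terminating:wait: it copies an application log to S3 and then completes the lifecycle action so termination can proceed. The bucket, log path, hook name, and instance ID are assumptions for illustration.

```python
import boto3

s3 = boto3.client("s3")
autoscaling = boto3.client("autoscaling")

INSTANCE_ID = "i-0123456789abcdef0"    # in practice, read this from instance metadata
LOG_FILE = "/var/log/myapp/app.log"    # hypothetical application log

# 1. Preserve the logs before the instance disappears for good.
s3.upload_file(
    Filename=LOG_FILE,
    Bucket="my-asg-log-archive",       # hypothetical S3 bucket
    Key=f"terminated-instances/{INSTANCE_ID}/app.log",
)

# 2. Tell the Auto Scaling group that cleanup is done, moving the instance
#    from terminating:wait to terminating:proceed so termination can finish.
autoscaling.complete_lifecycle_action(
    LifecycleHookName="drain-and-archive",   # hypothetical termination hook
    AutoScalingGroupName="web-asg",
    InstanceId=INSTANCE_ID,
    LifecycleActionResult="CONTINUE",
)
```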
There are many other use cases for lifecycle hooks, such as managing configurations with tools like Chef or Puppet, among others. We won't go into the details of these to avoid making this article too long. Before I conclude this article, let's look at some implementation considerations for lifecycle hooks.
Implementation Considerations for Lifecycle Hooks
Before making use of lifecycle hooks, you should always consider factors such as:
Timeout — The default timeout for a lifecycle hook, as I have already mentioned, is one hour (3,600 seconds). This may be sufficient for most initialization or cleanup tasks, but you can set a custom timeout based on your specific needs, and a task that runs long can extend its window by recording heartbeats (see the sketch after these considerations). The timeout should be long enough to complete the necessary actions, but not so long that it delays scaling operations unnecessarily.
Action Success/Failure — You have to clearly define what constitutes successful completion of the lifecycle hook action. This might include a successful software installation, configuration setup, or data backup. You also need to identify the conditions that would count as a failure, such as timeout expiration, script errors, or failed installations. In a similar fashion, you should configure your hooks to send notifications (e.g., via SNS or CloudWatch/EventBridge) upon completion of lifecycle hook actions; this helps with tracking and auditing, as shown in the sketch below.
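To make these considerations concrete, here is a minimal boto3 sketch, reusing the hypothetical names from earlier. It registers a termination hook with a custom timeout, an explicit DefaultResult (what happens if the hook expires without being completed), and an SNS notification target so that hook events can be tracked; it also shows how an in-flight task can buy itself more time by recording a heartbeat, which restarts the timeout clock. The ARNs and account ID are placeholders.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# A termination hook with a 15-minute timeout that publishes to an SNS topic
# when it fires, so an operator or automated workflow can run the cleanup.
autoscaling.put_lifecycle_hook(
    LifecycleHookName="drain-and-archive",
    AutoScalingGroupName="web-asg",
    LifecycleTransition="autoscaling:EC2_INSTANCE_TERMINATING",
    NotificationTargetARN="arn:aws:sns:us-east-1:123456789012:asg-lifecycle-events",  # hypothetical topic
    RoleARN="arn:aws:iam::123456789012:role/asg-lifecycle-notify",                    # hypothetical role
    HeartbeatTimeout=900,        # custom timeout instead of the 3,600-second default
    DefaultResult="CONTINUE",    # if the hook expires, proceed with termination anyway
)

# If cleanup is taking longer than expected, a heartbeat restarts the timeout
# clock instead of forcing you to configure an unnecessarily long timeout.
autoscaling.record_lifecycle_action_heartbeat(
    LifecycleHookName="drain-and-archive",
    AutoScalingGroupName="web-asg",
    InstanceId="i-0123456789abcdef0",   # hypothetical instance ID
)
```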
Always keep in mind that lifecycle hooks can add latency to scaling events, so you should optimize all hook actions for efficiency.
Final Thoughts
In this article, we explored the concept of EC2 auto scaling and then looked at lifecycle hooks, illustrating how they enhance the efficiency of Auto Scaling groups. We also discussed key implementation considerations to ensure the effective use of lifecycle hooks in your scaling strategy. By combining auto scaling with lifecycle hooks, you gain a powerful and automated approach to managing your cloud infrastructure. Auto scaling ensures your application has the resources it needs to handle fluctuating demands, while lifecycle hooks provide the control to tailor instance behavior during launch and termination. This gives you the ability to optimize resource utilization, streamline deployments, and ultimately deliver a highly available and scalable application experience. Thank you for taking the time to read this and learn more about EC2 Auto Scaling with me.