A Step-by-Step Guide to AWS EC2 Auto Scaling

Amazon Elastic Compute Cloud (EC2) is an Amazon Web Services (AWS) service for running virtual machines in the cloud. Managing the capacity of your EC2 instances by hand, however, is time-consuming and error-prone. This is where AWS EC2 Auto Scaling comes into play. In this step-by-step guide, we will explore the concept of auto scaling and show you how to set up and configure AWS EC2 Auto Scaling so that your application can handle varying workloads efficiently.

What is AWS EC2 Auto Scaling?

AWS EC2 Auto Scaling is a feature that automatically adjusts the number of EC2 instances in your fleet to match demand, within the minimum and maximum sizes you set. It helps you maintain high availability and cost efficiency by adding or removing instances according to the scaling policies and health checks you define. This ensures that your application can handle varying levels of traffic without manual intervention.

Prerequisites

Before you begin, make sure you have an AWS account and are logged in to the AWS Management Console.

Sign in to AWS Console

To get started with AWS EC2 Auto Scaling, sign in to the AWS Management Console. If you don't have an AWS account, you can sign up for one on the AWS website.

Step 1: Create a Virtual Private Cloud (VPC)

Go to the AWS Management Console.
Navigate to the VPC service.
Click on "Create VPC."
Provide a name for your VPC and specify its IPv4 CIDR block (e.g., 10.0.0.0/16).
Click "Create VPC."
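If you prefer scripting to clicking, the same step can be approximated with boto3, the AWS SDK for Python. This is only a minimal sketch: the function name, the 10.0.0.0/16 CIDR block, and the demo-vpc name are placeholder assumptions, and `ec2` is assumed to be a client you created yourself with `boto3.client("ec2")`.

```python
def create_vpc(ec2, cidr_block="10.0.0.0/16", name="demo-vpc"):
    """Create a VPC with a Name tag and return its ID.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The CIDR block and name are placeholder values, not requirements.
    """
    response = ec2.create_vpc(
        CidrBlock=cidr_block,
        TagSpecifications=[
            {"ResourceType": "vpc", "Tags": [{"Key": "Name", "Value": name}]}
        ],
    )
    return response["Vpc"]["VpcId"]
```

Passing the client in as an argument keeps the function free of credentials and region settings, which stay wherever you construct the client.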

Step 2: Create Two Subnets

In the VPC service, go to "Subnets."
Click "Create Subnet."
Choose your VPC.
Specify a name for the subnet.
Select the availability zone (e.g., us-east-1a).
Set the IPv4 CIDR block for the subnet. It must fall inside the VPC's range and must not overlap the other subnet.
Click "Create Subnet."
Repeat the process for the second subnet (e.g., us-east-1b).
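Picking two non-overlapping ranges by hand is easy to get wrong, so here is a small sketch that derives them with Python's standard `ipaddress` module and then creates the subnets. The 10.0.0.0/16 VPC range, the /24 subnet size, and the function names are illustrative assumptions; `ec2` is assumed to be a boto3 EC2 client.

```python
import ipaddress


def plan_subnets(vpc_cidr="10.0.0.0/16",
                 zones=("us-east-1a", "us-east-1b"),
                 new_prefix=24):
    """Pair each availability zone with a non-overlapping CIDR inside the VPC."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # yields consecutive blocks
    return [(zone, str(next(blocks))) for zone in zones]


def create_subnets(ec2, vpc_id, plan):
    """Create one subnet per (zone, cidr) pair and return the subnet IDs."""
    subnet_ids = []
    for zone, cidr in plan:
        response = ec2.create_subnet(
            VpcId=vpc_id, CidrBlock=cidr, AvailabilityZone=zone
        )
        subnet_ids.append(response["Subnet"]["SubnetId"])
    return subnet_ids


print(plan_subnets())
# [('us-east-1a', '10.0.0.0/24'), ('us-east-1b', '10.0.1.0/24')]
```

Placing the subnets in two different availability zones is what later lets the load balancer and the Auto Scaling group spread instances across zones.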

Step 3: Create an Internet Gateway

In the VPC service, go to "Internet Gateways."
Click "Create Internet Gateway."
Provide a name for the internet gateway.
Click "Create Internet Gateway."
Select the newly created internet gateway and attach it to your VPC.
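The create-then-attach sequence above could be scripted roughly as follows. This is a sketch under assumptions: `ec2` is a boto3 EC2 client you supply, and the function name and the demo-igw tag are made up for illustration.

```python
def create_and_attach_igw(ec2, vpc_id, name="demo-igw"):
    """Create an internet gateway, attach it to the VPC, and return its ID."""
    response = ec2.create_internet_gateway(
        TagSpecifications=[
            {
                "ResourceType": "internet-gateway",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ]
    )
    igw_id = response["InternetGateway"]["InternetGatewayId"]
    # A gateway does nothing until it is attached to a VPC.
    ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)
    return igw_id
```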

Step 3.1: Create a Route Table

In the VPC service, go to "Route Tables."
Click "Create Route Table."
Provide a name for the route table.
Choose your VPC.
Click "Create Route Table."

Step 3.2: Associate Subnets with Route Table

Select the newly created route table.
In the "Subnet Associations" tab, click "Edit subnet associations."
Associate both of your subnets with the route table.

Step 3.3: Add a Default Route to the Internet Gateway

In the "Routes" tab of the route table, click "Edit routes."
Add a route with a destination of "0.0.0.0/0" and select the internet gateway you created.
Click "Save routes."
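Steps 3.1 to 3.3 together make the two subnets public. As a hedged sketch, the same three actions can be expressed with boto3 calls on an EC2 client passed in as `ec2`; the function name is hypothetical and the IDs are whatever the earlier steps produced.

```python
def create_public_route_table(ec2, vpc_id, subnet_ids, igw_id):
    """Create a route table, associate the subnets, and add a default route."""
    response = ec2.create_route_table(VpcId=vpc_id)
    route_table_id = response["RouteTable"]["RouteTableId"]

    # Step 3.2: every subnet that should be public uses this route table.
    for subnet_id in subnet_ids:
        ec2.associate_route_table(RouteTableId=route_table_id, SubnetId=subnet_id)

    # Step 3.3: send all non-local traffic (0.0.0.0/0) to the internet gateway.
    ec2.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        GatewayId=igw_id,
    )
    return route_table_id
```

Without the 0.0.0.0/0 route, instances in these subnets can have public IP addresses and still be unreachable from the internet.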

Step 4: Create a Target Group

Go to the EC2 service.
Under the "Load Balancing" section, select "Target Groups."
Click "Create target group."
Provide a name for the target group.
Specify the protocol and port your application uses.
Choose your VPC.
Configure health checks as needed.
Click "Create."
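For reference, a target group like the one above might be created in code as sketched below. The assumptions here are an HTTP application on port 80, a health check against the path "/", and an `elbv2` argument that is a boto3 client from `boto3.client("elbv2")`; the function and group names are placeholders.

```python
def create_target_group(elbv2, vpc_id, name="demo-tg", port=80, health_path="/"):
    """Create an HTTP target group for EC2 instances and return its ARN."""
    response = elbv2.create_target_group(
        Name=name,
        Protocol="HTTP",
        Port=port,
        VpcId=vpc_id,
        TargetType="instance",          # targets are registered by instance ID
        HealthCheckProtocol="HTTP",
        HealthCheckPath=health_path,    # instances must answer this path
    )
    return response["TargetGroups"][0]["TargetGroupArn"]
```

If your application serves its health endpoint somewhere other than "/", pass that path as `health_path`, otherwise the load balancer will mark healthy instances as unhealthy.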

Step 5: Create an Application Load Balancer

In the EC2 service, under the "Load Balancing" section, select "Load Balancers."
Click "Create Load Balancer."
Choose "Application Load Balancer."
Configure listeners and routing as necessary.
Select the two subnets you created.
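Select a security group that allows inbound HTTP (port 80) from anywhere (0.0.0.0/0). This is the load balancer security group described in Step 6, so create it first if it does not exist yet.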
Attach the existing target group.
Click "Create."
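The load balancer and its listener can also be sketched in code. The example assumes an internet-facing Application Load Balancer with a single HTTP listener on port 80 that forwards everything to one target group; `elbv2` is assumed to be a boto3 `elbv2` client, and the function name and demo-alb are illustrative.

```python
def create_alb(elbv2, subnet_ids, security_group_id, target_group_arn,
               name="demo-alb"):
    """Create an internet-facing ALB with an HTTP listener; return its DNS name."""
    response = elbv2.create_load_balancer(
        Name=name,
        Subnets=list(subnet_ids),            # one subnet per availability zone
        SecurityGroups=[security_group_id],
        Scheme="internet-facing",
        Type="application",
    )
    load_balancer = response["LoadBalancers"][0]

    # The listener is what ties the load balancer to the target group.
    elbv2.create_listener(
        LoadBalancerArn=load_balancer["LoadBalancerArn"],
        Protocol="HTTP",
        Port=80,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": target_group_arn}],
    )
    return load_balancer["DNSName"]
```

The returned DNS name is the address you would open in a browser once instances are registered and healthy.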

Step 6: Create Security Groups

In the EC2 service, navigate to "Security Groups."
Create two security groups: one for the Application Load Balancer and another for the EC2 instances launched by Auto Scaling.
For the Application Load Balancer security group, allow inbound HTTP (port 80) from anywhere (0.0.0.0/0).
For the EC2 instances, allow inbound HTTP (port 80) and SSH (port 22). This demo opens both ports to everyone (0.0.0.0/0); for anything beyond a demo, restrict HTTP to the load balancer's security group and SSH to your own IP address.
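As a rough scripted equivalent, the sketch below creates a security group and opens a list of TCP ports. It mirrors the wide-open demo rules by defaulting to 0.0.0.0/0, which is an assumption you should tighten outside a sandbox. The helper names are hypothetical and `ec2` is assumed to be a boto3 EC2 client.

```python
def ingress_rule(port, cidr="0.0.0.0/0"):
    """Build one inbound TCP rule for a single port and source CIDR."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr}],
    }


def create_security_group(ec2, vpc_id, name, description, ports,
                          cidr="0.0.0.0/0"):
    """Create a security group that allows the given TCP ports from `cidr`."""
    response = ec2.create_security_group(
        GroupName=name, Description=description, VpcId=vpc_id
    )
    group_id = response["GroupId"]
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ingress_rule(port, cidr) for port in ports],
    )
    return group_id
```

Under these assumptions, the load balancer group would be created with `ports=[80]` and the instance group with `ports=[80, 22]`.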

Step 7: Create a Launch Template

In the EC2 service, go to "Launch Templates."
Click "Create launch template."
Select the Amazon Machine Image (AMI) you want to use (e.g., RHEL 9).
Configure the instance type (e.g., t2.micro).
Ensure that you enable "Auto-assign Public IP."
Attach the security group you created for EC2 instances.
Add your SSH key pair.
In the user data section, add any necessary startup scripts or configurations.
Click "Create launch template."
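To make the launch template settings concrete, here is a sketch of the same configuration as boto3 request data. Everything specific in it is an assumption: the startup script (which installs Apache on a RHEL-style image) is a hypothetical example, and the AMI ID, key pair name, and security group ID must come from your own account. Note that user data in a launch template has to be base64-encoded by the caller.

```python
import base64

# Hypothetical startup script: installs a web server so health checks pass.
USER_DATA = """#!/bin/bash
dnf install -y httpd
systemctl enable --now httpd
echo "Hello from $(hostname)" > /var/www/html/index.html
"""


def build_launch_template_data(ami_id, security_group_id, key_name,
                               instance_type="t2.micro", user_data=USER_DATA):
    """Assemble the LaunchTemplateData dictionary for create_launch_template."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        # Public IP and security groups are both set on the network interface.
        "NetworkInterfaces": [
            {
                "DeviceIndex": 0,
                "AssociatePublicIpAddress": True,
                "Groups": [security_group_id],
            }
        ],
        "UserData": base64.b64encode(user_data.encode("utf-8")).decode("ascii"),
    }


def create_launch_template(ec2, name, data):
    """Create the launch template with a boto3 EC2 client; return its ID."""
    response = ec2.create_launch_template(
        LaunchTemplateName=name, LaunchTemplateData=data
    )
    return response["LaunchTemplate"]["LaunchTemplateId"]
```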

Step 7.1: Create an Auto Scaling Group

In the EC2 service, under the "Auto Scaling" section, select "Auto Scaling Groups."
Click "Create Auto Scaling group."
Choose the launch template you created.
Configure network settings by selecting the two subnets you created.
Attach the existing load balancer and target group.
Turn on Elastic Load Balancing health checks and, optionally, CloudWatch group metrics collection.
Set the group size: desired capacity, minimum, and maximum (e.g., desired 2, minimum 1, maximum 3).
Leave scaling policies as "none" for this basic setup.
Click through the remaining options and create the auto scaling group.
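The group settings above map fairly directly onto a single API call. The sketch below assumes `autoscaling` is a boto3 client from `boto3.client("autoscaling")`, uses the example sizes from this step as defaults, and picks a 300-second health check grace period as an arbitrary starting point; the function name is hypothetical.

```python
def create_auto_scaling_group(autoscaling, name, launch_template_id,
                              subnet_ids, target_group_arn,
                              min_size=1, max_size=3, desired=2):
    """Create an Auto Scaling group attached to a target group."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max size")

    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName=name,
        LaunchTemplate={"LaunchTemplateId": launch_template_id,
                        "Version": "$Latest"},
        MinSize=min_size,
        MaxSize=max_size,
        DesiredCapacity=desired,
        VPCZoneIdentifier=",".join(subnet_ids),  # comma-separated subnet IDs
        TargetGroupARNs=[target_group_arn],
        HealthCheckType="ELB",        # replace instances the ALB marks unhealthy
        HealthCheckGracePeriod=300,   # seconds to wait before first health check
    )
    return name
```

With no scaling policy attached, the group simply holds the desired capacity and replaces instances that fail their health checks.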

Step 8: Test and Monitor

After configuring auto scaling, it's crucial to thoroughly test it to ensure that it behaves as expected. You can simulate traffic spikes or resource failures to see if the auto scaling group responds correctly.
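One way to simulate a resource failure is to terminate an instance on purpose and watch the group replace it. The following is a sketch only, assuming `autoscaling` is a boto3 Auto Scaling client and that you supply a real instance ID from the group; the function names are made up for illustration.

```python
def simulate_instance_failure(autoscaling, instance_id):
    """Terminate one instance while keeping the desired capacity unchanged,
    so the group has to launch a replacement."""
    autoscaling.terminate_instance_in_auto_scaling_group(
        InstanceId=instance_id,
        ShouldDecrementDesiredCapacity=False,
    )


def recent_activities(autoscaling, group_name, limit=5):
    """Return (status, description) pairs for the latest scaling activities."""
    response = autoscaling.describe_scaling_activities(
        AutoScalingGroupName=group_name, MaxRecords=limit
    )
    return [(a["StatusCode"], a["Description"]) for a in response["Activities"]]
```

Calling `recent_activities` a few minutes after the termination should show a launch activity for the replacement instance if the group is working as intended.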

Additionally, use Amazon CloudWatch to monitor your instances and scaling activities. Create custom CloudWatch alarms to trigger actions based on specific metrics, such as average CPU utilization across the group.
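As an example of such an alarm, the sketch below raises an alarm when the group's average CPU utilization stays above a threshold for two consecutive five-minute periods. The 80% threshold, the periods, and the alarm naming are assumptions chosen for illustration; `cloudwatch` is assumed to be a boto3 client from `boto3.client("cloudwatch")`.

```python
def create_high_cpu_alarm(cloudwatch, group_name, threshold=80.0):
    """Alarm when the group's average CPU exceeds `threshold` percent."""
    alarm_name = f"{group_name}-high-cpu"
    cloudwatch.put_metric_alarm(
        AlarmName=alarm_name,
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        # Aggregate the metric over every instance in the Auto Scaling group.
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": group_name}],
        Statistic="Average",
        Period=300,              # seconds per data point
        EvaluationPeriods=2,     # consecutive periods above the threshold
        Threshold=threshold,
        ComparisonOperator="GreaterThanThreshold",
    )
    return alarm_name
```

On its own this alarm only changes state; to have it notify someone or trigger scaling you would also pass actions to it, which is left out of this sketch.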

Fine-Tune and Optimize

Once your auto scaling group is up and running, periodically review your scaling policies and configurations to optimize cost and performance. Adjust the scaling thresholds, instance types, and other settings as needed based on your application's usage patterns.
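Because the basic setup leaves scaling policies as "none", a natural first tuning step is to add one. The sketch below attaches a target tracking policy that tries to keep average CPU utilization near a target value. The 50% target and the policy name are assumptions to adjust for your workload, and `autoscaling` is assumed to be a boto3 Auto Scaling client.

```python
def add_cpu_target_tracking(autoscaling, group_name, target_cpu=50.0):
    """Attach a target tracking policy on average CPU; return the policy ARN."""
    response = autoscaling.put_scaling_policy(
        AutoScalingGroupName=group_name,
        PolicyName=f"{group_name}-cpu-target",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,  # percent CPU the group should hover around
        },
    )
    return response["PolicyARN"]
```

Target tracking is usually the simplest policy type to reason about: the group adds instances when the metric runs above the target and removes them when it runs below, always staying within the minimum and maximum sizes.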

For your convenience, I have prepared a Terraform script that covers the setup described above. It provisions the AWS infrastructure end to end: a VPC, subnets, an internet gateway, security groups, a load balancer, a launch template, and an Auto Scaling group for EC2 instances.

Here is the Git repo: https://github.com/yadanaresh/aws-ec2-autoscale.git

Conclusion

AWS EC2 Auto Scaling is a powerful tool for managing the capacity of your EC2 instances automatically. By following this step-by-step guide, you can set up and configure auto scaling to ensure your application can handle varying workloads efficiently, improve availability, and optimize costs. Remember that the key to successful auto scaling is proper planning and monitoring to align your application's performance with your business needs.