DEV Community

Taiwo Akinbolaji

How to Create Auto Scaling Groups of EC2 Instances for High Availability

INTRODUCTION

If you’ve ever launched an EC2 instance on AWS, you were already working inside a Virtual Private Cloud (VPC). A VPC is essentially the virtual network that controls all your networking activities. Its setup determines how different parts of your infrastructure communicate with each other and how they access the public internet.

In this project, the goal is to build an environment where an auto-scaling group automatically handles the provisioning and termination of EC2 instances, while elastic load balancers evenly distribute incoming traffic across those instances.

DEFINITION OF TERMS

VPC:
A Virtual Private Cloud (VPC) is essentially a private section of the cloud that exists within a larger public cloud environment. It creates a dedicated, logically isolated space where your resources can operate securely, giving you greater control over networking and access.

Subnets:
A subnet, or subnetwork, is a smaller network carved out of a larger one. Subnetting helps improve network performance and organization. By dividing a network into subnets, data can move more efficiently since it doesn’t have to travel through unnecessary routers to reach its destination. Subnets are a fundamental part of setting up a VPC.

Internet Gateway:
An Internet Gateway is the component that enables communication between your AWS VPC and the public internet. It serves as the entry and exit point for internet-bound traffic.

Load Balancer:
An Application Load Balancer distributes incoming HTTP and HTTPS traffic across multiple targets—such as EC2 instances, containers, or microservices. It evaluates each incoming request using a set of prioritized listener rules and then directs the traffic to the appropriate target group based on those rules.

Route Table:
A Route Table contains routing rules that determine how network traffic flows within the VPC. These rules guide traffic coming from subnets or from the internet gateway to their intended destinations.

Launch Template:
A Launch Template stores predefined configuration settings used to create AWS resources like EC2 instances. It ensures consistency and simplifies the process of launching new instances.

High Availability:
High availability refers to a system’s capability to remain operational for as long as required, with minimal downtime. It focuses on eliminating single points of failure so that applications continue running even if one of the underlying components, such as a server, experiences a failure.

USE CASES

Being able to scale a web application and evenly distribute incoming traffic across multiple instances is crucial for maintaining high availability in modern applications. In this guide, we’ll demonstrate how AWS Application Load Balancers and Auto Scaling Groups can be used to achieve this.

An Auto Scaling Group automatically provisions and terminates EC2 instances based on demand, while an Elastic Load Balancer ensures that incoming requests are efficiently routed across all active instances.
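To make "evenly distribute" concrete, here is a minimal Python sketch of round-robin routing, one simple way to spread requests evenly. The instance IDs are hypothetical, and a real load balancer also takes health checks and connection state into account, which this sketch ignores.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, instances):
    """Assign each request to the next instance in turn (round robin)."""
    if not instances:
        raise ValueError("at least one instance is required")
    assignments = Counter()
    targets = cycle(instances)
    for _ in range(requests):
        assignments[next(targets)] += 1
    return assignments


# Hypothetical instance IDs; a real Auto Scaling group assigns its own.
fleet = ["i-aaa", "i-bbb"]
print(dict(distribute(10, fleet)))

# After a scale-out, the same kind of traffic spreads over three instances.
print(dict(distribute(9, fleet + ["i-ccc"])))
```

Each instance ends up with the same share of requests, and adding an instance lowers the share every other instance has to serve.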

Step 1: Setup VPC

In the AWS console, go to the VPC dashboard and select Create VPC to set up a new VPC named project7vpc.

Step 2: Create Three Public Subnets With 10.10.1.0/24, 10.10.2.0/24 and 10.10.3.0/24

i. Click Subnets in the left-hand panel of the VPC console

ii. Click Create subnet

iii. Under VPC ID, select the VPC we created earlier, i.e. project7vpc

iv. For the subnet name, enter project7subnet1

v. Enter 10.10.1.0/24 as the IPv4 CIDR block and click Create subnet to create our first subnet, project7subnet1

Click Add subnet to create subnets project7subnet2 and project7subnet3 with CIDR blocks 10.10.2.0/24 and 10.10.3.0/24 respectively.
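Before typing the CIDR blocks into the console, it can help to check that they fit inside the VPC and do not overlap. The sketch below does this with Python's standard ipaddress module. The VPC range 10.10.0.0/16 is an assumption, since the article does not state it; substitute whatever range you gave project7vpc.

```python
import ipaddress

# Assumption: the article does not state the VPC's CIDR block; 10.10.0.0/16 is
# used here only because it contains all three subnet ranges.
VPC_CIDR = ipaddress.ip_network("10.10.0.0/16")

SUBNETS = {
    "project7subnet1": ipaddress.ip_network("10.10.1.0/24"),
    "project7subnet2": ipaddress.ip_network("10.10.2.0/24"),
    "project7subnet3": ipaddress.ip_network("10.10.3.0/24"),
}


def validate_subnets(vpc, subnets):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    names = list(subnets)
    for name in names:
        if not subnets[name].subnet_of(vpc):
            problems.append(f"{name} is outside {vpc}")
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


print(validate_subnets(VPC_CIDR, SUBNETS))
for name, network in SUBNETS.items():
    print(name, network, network.num_addresses, "addresses")
```

Each /24 block spans 256 addresses. AWS reserves five addresses in every subnet, so slightly fewer are available for instances.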

Note: A VPC is completely private by default. Resources within the same VPC, such as instances in different subnets, can communicate with one another, but they cannot reach other VPCs or the public internet. To make our subnets public and allow them to reach the internet, we must attach an Internet Gateway to the VPC and route the subnets’ internet-bound traffic to it through a route table.

Step 3: Connect Subnets To Internet Gateway

(i) Create an internet gateway

(ii) Attach the gateway to the VPC

(iii) Create a route table

(iv) Create a route for the gateway

(v) Attach the route table to the public subnets

(a) Create An Internet Gateway

From the left navigation pane in the AWS console, select Internet Gateways and then choose Create internet gateway.

Enter a name in the Name tag field—let’s use project7igw—and proceed to create it.

(b) Attach The Gateway To The VPC

In the VPC console, click Internet gateways in the left navigation pane.

Select the internet gateway we just created, project7igw.

Click Actions and then select Attach to VPC.

Select the custom VPC we created earlier from the list of available VPCs and then click Attach internet gateway.

(c) Create A Route Table

  • Click VPC, then Route tables, and then Create route table.
  • Give the route table a name. Let’s call it project7rtb. Select the VPC we want to attach it to, and then click Create route table.

(d) Create A route for the Gateway

On the route table we just created:

  • Select Edit routes
  • Select Add route to send traffic to the public internet
  • Enter 0.0.0.0/0 as the destination, choose the gateway we created earlier as the target, and save the changes

(e) Attach The Route Table To The Public subnets

  • Go to Subnets and find the public subnets
  • Select each of the 3 subnets in turn
  • Go to the Route table tab, click Actions, and then Edit route table association
  • Change the route table ID to the route table we created (project7rtb)
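The effect of the route table can be sketched in a few lines of Python. When several routes match a destination, the most specific one (longest prefix) wins, so traffic inside the VPC stays local and everything else follows the 0.0.0.0/0 route to the gateway. The VPC range and the gateway identifier below are placeholders, not values from the console.

```python
import ipaddress

# Assumption: the VPC CIDR (10.10.0.0/16) and the gateway ID are placeholders;
# the article only fixes the 0.0.0.0/0 route that points at project7igw.
ROUTES = [
    ("10.10.0.0/16", "local"),
    ("0.0.0.0/0", "igw-project7igw"),
]


def resolve_target(destination_ip, routes=ROUTES):
    """Pick the matching route with the longest prefix, or None if none match."""
    address = ipaddress.ip_address(destination_ip)
    best = None
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if address in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, target)
    return best[1] if best else None


print(resolve_target("10.10.2.15"))     # an address inside the VPC
print(resolve_target("93.184.216.34"))  # an address on the public internet
```

Without the 0.0.0.0/0 entry, the second lookup finds no route at all, which is why the subnets stay private until the route is added.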

Step 4: Create An Autoscaling Group Using The t2.micro Instances

Next, we’ll proceed to create an Auto Scaling Group using t2.micro instances. Make sure you have your Apache installation script ready as well.

Go to the EC2 Dashboard, and in the left-hand menu under the Auto Scaling section, select Auto Scaling Groups to begin the setup.

Click on Create Auto Scaling group to begin. The Auto Scaling Group controls how many EC2 instances should be running at any given time and manages all the automatic scaling operations for the environment.

Next, we need to create a Launch Template. This template contains all the necessary configuration details for launching our instances, such as the AMI, security group, and other settings. We will use this Launch Template to create our Auto Scaling Group, which will then scale instances up or down as needed based on the template. Click Create Launch Template to proceed, as shown below.

Now, fill in the Launch Template details with the required names and configuration. Choose the Amazon Machine Image (AMI) for your instances—either an Amazon Linux AMI or an Ubuntu AMI works well for installing Apache. Make sure to set the instance type to t2.micro.

Staying consistent with past projects, let’s choose the Amazon Linux AMI.

We are going to use the free tier t2.micro instance.

To verify that our instances are public and accessible, we’ll need to SSH into them. Use your existing key pair, or create a new one if you prefer.

In the networking settings section, select the following:

Next, we need to configure the security group to allow incoming traffic. Add a rule to permit internet traffic on port 8080, and also include port 22 so we can SSH into the instances. Click Add Security Group Rule to set this up.

Now expand the ‘Advanced network configuration’ drop-down.

Enable Auto-assign public IP.

Note: When hosting a web application or service on an EC2 instance in the cloud, you generally want it to be reachable from the public internet. Enabling auto-assign public IP automatically gives the instance a public IP address, making it accessible online without any extra setup.

Now scroll all the way down and expand Advanced details. Locate the User data field and paste in the bash script.
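The console accepts the script as plain text. If you create the launch template through the API or CLI instead, the user data generally has to be base64-encoded first. The Python sketch below shows that encoding step. The script contents are an assumption: the article's own script is not reproduced here, and these are simply commonly used commands for installing Apache on Amazon Linux.

```python
import base64

# Assumption: the article's own script is not shown. These are commonly used
# commands for installing Apache (httpd) on Amazon Linux; adjust for Ubuntu.
USER_DATA = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
"""


def encode_user_data(script):
    """Base64-encode a user data script for use outside the console."""
    if not script.startswith("#!"):
        raise ValueError("user data scripts should start with a shebang line")
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


encoded = encode_user_data(USER_DATA)
print(len(encoded), "base64 characters")
```

The shebang check is a small guard: without that first line, the instance will not run the user data as a shell script at boot.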

Successful!

That completes the configuration of the EC2 instances that will be launched. Next, we will create our load balancer.

Step 5: Next, We Need to Create Our Load Balancer

Select Attach to a new load balancer.

We can now fill in the load balancer details; our Auto Scaling group is selected automatically.

Note: An Application Load Balancer (ALB) is ideal for managing HTTP/HTTPS traffic and offers application-level routing features, whereas a Network Load Balancer (NLB) is better for TCP/UDP traffic and provides high-performance load balancing.

In the Availability Zones and Subnets section, create a target group. You can leave most settings at their default values, but make sure to specify a name for the target group. Once the target group is created, return to the Load Balancer tab, refresh the list, and select the newly created group.

Step 6: Launch and configure application load balancer
Configure network mappings

Go to the EC2 Dashboard, and in the left-hand menu, scroll down and select Load Balancers, then click Create Load Balancer.

Choose to create an Application Load Balancer, give it a name, and keep the remaining basic settings at their default values. Proceed to the Network Mappings section, select your VPC, and then choose the three Availability Zones along with their corresponding subnets, as illustrated below.

We give a name to our Load balancer, let’s call it “project7loadbalancer”

In the network mapping, we select our VPC, i.e. “project7vpc”.
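An Application Load Balancer needs subnets in at least two Availability Zones, with one subnet per zone. The Python sketch below checks a planned mapping against those two rules. The zone names are placeholders, because the article does not say which zone each subnet lives in.

```python
# Assumption: the Availability Zone names are placeholders; the article does
# not say which zone each subnet was created in.
SUBNET_ZONES = {
    "project7subnet1": "us-east-1a",
    "project7subnet2": "us-east-1b",
    "project7subnet3": "us-east-1c",
}


def check_network_mapping(subnet_zones, minimum_zones=2):
    """Check one subnet per zone and at least `minimum_zones` distinct zones."""
    zones = list(subnet_zones.values())
    duplicates = sorted({zone for zone in zones if zones.count(zone) > 1})
    if duplicates:
        return False, "more than one subnet selected in: " + ", ".join(duplicates)
    if len(zones) < minimum_zones:
        return False, f"need subnets in at least {minimum_zones} zones"
    return True, f"{len(zones)} zones mapped"


print(check_network_mapping(SUBNET_ZONES))
```

Spreading the three subnets over three zones is what lets the application survive the loss of a single zone.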

Step 7: Create Web Server Security group

Next, move on to the Security Group settings and create a new security group. Give the security group a name and ensure it’s associated with the VPC you created earlier. Add an inbound rule to allow HTTP traffic from Anywhere (0.0.0.0/0), as shown below.

Also, add another rule of type SSH with the source set to Anywhere.

Note: This poses a security risk, but for demonstration purposes, we’ll allow it in this example.
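To see why the SSH rule is risky, the Python sketch below models the two inbound rules and checks who they let in. The tightened variant restricts SSH to a single network; the range 203.0.113.0/24 is a reserved documentation range standing in for your own admin network, not a value from the article.

```python
import ipaddress

# Inbound rules from this step: HTTP and SSH, both open to Anywhere.
INBOUND_RULES = [
    {"type": "HTTP", "port": 80, "source": "0.0.0.0/0"},
    {"type": "SSH", "port": 22, "source": "0.0.0.0/0"},
]


def is_allowed(rules, source_ip, port):
    """Return True if any inbound rule admits this source address and port."""
    address = ipaddress.ip_address(source_ip)
    return any(
        rule["port"] == port and address in ipaddress.ip_network(rule["source"])
        for rule in rules
    )


def world_open_ssh(rules):
    """List SSH rules whose source is the whole IPv4 internet."""
    return [r for r in rules if r["port"] == 22 and r["source"] == "0.0.0.0/0"]


# Assumption: 203.0.113.0/24 stands in for a trusted admin network.
tightened = [
    INBOUND_RULES[0],
    {"type": "SSH", "port": 22, "source": "203.0.113.0/24"},
]

print(is_allowed(INBOUND_RULES, "198.51.100.7", 22))  # any host can reach SSH
print(is_allowed(tightened, "198.51.100.7", 22))      # outsiders are refused
print(len(world_open_ssh(INBOUND_RULES)), len(world_open_ssh(tightened)))
```

HTTP stays open to everyone in both versions, which is what a public web server needs; only the SSH source changes.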

Afterward, click Create Security Group. Then, select the newly created security group from the list.

Scroll to the bottom, review the summary, then click “Create load balancer”.

Now our load balancer is up and running as shown below.

Step 8: Create An Autoscaling Group (ASG) Using The t2.micro Instances
Configure new ASG launch options

From the Launch Template, click Create Auto Scaling Group, as shown below. Then, in the left-hand menu, scroll down, select Auto Scaling Groups, and click Create Auto Scaling Group.

Click Next, and on the next page select the VPC and the subnets we created earlier. Click Next when finished.

Now we can set up the ASG along with the required load balancer.

Note: An Application Load Balancer (ALB) is ideal for managing HTTP/HTTPS traffic and offers routing at the application level, while a Network Load Balancer (NLB) is better suited for TCP/UDP traffic and provides high-performance load balancing.

In the Availability Zones and Subnets section, create a target group. You can leave most settings at their default values, but be sure to give the target group a name. After creating it, return to the Load Balancer tab, refresh the list, and select the newly created group.

Configure ASG Group size and CloudWatch Monitoring
For our use case, set the desired and minimum capacity to 2, and the maximum capacity to 5. Choose Target tracking scaling policy, ensure the metric type is Average CPU Utilization, and set the target value to 50. Then click Next to proceed.
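The same settings can be written down as data. The Python sketch below builds a dictionary in the shape that boto3's put_scaling_policy call accepts and shows how the size limits bound any scaling decision. Nothing is sent to AWS, and the group and policy names are hypothetical.

```python
# Capacity limits and target value from this step.
MIN_SIZE, DESIRED, MAX_SIZE = 2, 2, 5
TARGET_CPU = 50.0


def target_tracking_policy(group_name, policy_name, target_value):
    """Build keyword arguments in the shape of boto3's put_scaling_policy."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_value,
        },
    }


def clamp_capacity(requested, minimum=MIN_SIZE, maximum=MAX_SIZE):
    """Keep a requested instance count inside the group's size limits."""
    return max(minimum, min(requested, maximum))


# Assumption: the group and policy names are hypothetical.
policy = target_tracking_policy("project7asg", "cpu-50", TARGET_CPU)
print(policy["TargetTrackingConfiguration"])
print([clamp_capacity(n) for n in (0, 3, 9)])
```

However high the load climbs, the group never runs more than five instances, and it never drops below two even when traffic is idle.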

Proceed through the setup until you reach the Review page. Check all the final configurations, then click Create Auto Scaling Group. You should see your ASG displaying a status of “updating capacity…” as it launches EC2 instances based on your predefined settings.

Next, go to the EC2 Dashboard to confirm that two EC2 instances have been created and are running under your Auto Scaling Group.

Step 9: Connect to Servers running Apache Web Server

Get the public IP address of each EC2 instance from the Networking tab in the Amazon EC2 dashboard. Copy and paste the IP address into your browser’s address bar. You should see the default Apache Web Server webpage displayed, as shown below.

Congratulations!

We have done an excellent job! We’ve successfully set up an infrastructure where an Auto Scaling Group automatically handles the creation and termination of EC2 instances, while Elastic Load Balancers efficiently distribute network traffic across those instances.

ADVANCED

Stress testing our Auto Scaling group

In this section, we’ll demonstrate how to test and verify that our Auto Scaling Group can maintain high availability by automatically scaling EC2 instances under stress. We’ll use CPU utilization exceeding 50% as the trigger for scaling.

Step 1: SSH into EC2 Instance And run stress command

In this step, we’ll connect to the instance via the command line using SSH, run a stress command to put load on the system, and observe how it handles the increased demand.

Once connected, run a stress command to put load on the instance’s CPU.
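If no dedicated stress tool is installed, a short Python script can stand in for one. The sketch below is not the command used in this walkthrough; it is an assumed alternative that keeps one CPU core busy for a set time. A t2.micro has a single vCPU, so one busy loop is enough to push utilization up.

```python
import time


def burn_cpu(seconds):
    """Spin in a tight loop for `seconds`, keeping one CPU core busy."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


# Assumption: this short duration is only a quick local check. On the
# instance, raise it to several minutes so CloudWatch sees sustained load.
DEMO_SECONDS = 0.2
print(burn_cpu(DEMO_SECONDS) > 0)
```

The scaling policy tracks average CPU utilization across the group, so the load has to last long enough to show up in the CloudWatch metric.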

Step 2: Review CloudWatch Alarms
Navigate to CloudWatch alarms. Select “All alarms” in the left pane.

You should now see our ASG alarm in the “In alarm” state, as shown below.

For a closer look, click on the alarm to view the CPU utilization. As shown below, it increased from 0.234% up to approximately 83.9%, well above our 50% threshold.

Step 3: Review EC2 Instances

When we check our running EC2 instances, we can see that the Auto Scaling Group launched an additional instance to handle the increased load. This helped reduce the CPU utilization back below our 50% threshold, as shown below.

GREAT RESULT!

We have just stress-tested our infrastructure to demonstrate high availability. When the load exceeds our 50% threshold, the Auto Scaling Group automatically responds by launching additional EC2 instances to handle the increased demand.

If you have followed along this far, thank you! I hope you found it helpful.

Reminder: Don’t forget to clean up your environment by deleting all the resources you created and configured to avoid unnecessary charges.
