Michael Mekuleyi

Designing a Production-grade Elastic Load Balancer Service in AWS with Terraform

Introduction

In this article, we will focus on designing an Elastic Load Balancer backed by an Auto Scaling group, using Terraform on AWS. We will build this load balancer service with a modular architecture, dividing the components into two modules: a networking module and an Auto Scaling group module. We will also configure automatic scale-in and scale-out for the Auto Scaling group based on peak business hours and incoming traffic. Finally, we will set up a CloudWatch alarm that raises an alert when the CPU credits on a free-tier instance drop below a threshold.

In this article, I will assume you have a properly set up Terraform configuration, with an IAM user that has the AmazonEC2FullAccess policy attached. I advise that you use the local Terraform backend, but if you want to use a remote backend with AWS S3, I have set up an optional configuration that can help you get started quickly. I am also assuming you have a basic understanding of Terraform and AWS components; this is neither a beginner article nor an intro to Terraform. Finally, we will only go over the most important aspects of the configuration, as it is not possible to cover every component defined in each module. Feel free to check the codebase of this article here.
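If you do opt for the remote backend, here is a minimal sketch of the backend block you would add to the root module. The bucket, region, and table names below are hypothetical placeholders for whatever the optional /global/s3 configuration creates in your account:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket" # hypothetical bucket name
    key            = "services/app/terraform.tfstate"
    region         = "us-east-1" # use your own region
    dynamodb_table = "terraform-locks" # hypothetical lock table for state locking
    encrypt        = true
  }
}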

File Structure

The project is broken down into a very simple file structure that maps every step of the project to code.


First, there is an optional global service that deploys an S3 bucket and a DynamoDB table, in case you want to deploy a remote S3 backend; it is completely optional. Next, we have a modules folder where we store all our modular components, namely the Auto Scaling group with a rolling update and the networking module; we will dive deep into these in the next sections. Finally, we have the services folder, where we deploy our app, composed from both the Auto Scaling group module and the networking module plus a few additional Terraform objects that make deploying simpler and easier. The resulting layout is sketched below.
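Reconstructed from the paths referenced throughout this article, the layout looks like this:

.
├── global
│   └── s3                      # optional remote backend (S3 bucket + DynamoDB lock table)
├── modules
│   ├── asg-rolling-update
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── networking
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
└── services
    └── app
        ├── main.tf
        ├── variables.tf
        ├── outputs.tf
        └── user-data.sh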

Networking Module

The networking module is in charge of provisioning the load balancer, the load balancer listener, the security group, and the rules attached to that security group. The files are divided into the following components.

  • main.tf: This is where the main configuration for the deployment is defined; we pin the Terraform version and configure the load balancer network here.
  • variables.tf: This is where all the variables used in the configuration are stored.
  • outputs.tf: This defines the outputs of the configuration.

The first thing we do in modules/networking/main.tf is pin the Terraform version we intend to use and constrain the AWS provider to the 4.x series:

terraform {
  required_version = ">= 1.0.0, < 2.0.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

The next block of code defines the Elastic Load Balancer and its listener. Here we set the load balancer type to an Application Load Balancer, configure the load balancer's name, and then set the listener to accept traffic on port 80:

resource "aws_lb" "example" {
  name               = var.alb_name
  load_balancer_type = "application"
  subnets            = var.subnet_ids
  security_groups    = [aws_security_group.alb.id]
}
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.example.arn
  port              = local.http_port
  protocol          = "HTTP"
  # By default, return a simple 404 page
  default_action {
    type = "fixed-response"
    fixed_response {
      content_type = "text/plain"
      message_body = "404: page not found"
      status_code  = 404
    }
  }
}

At this juncture, I must introduce the locals block in Terraform, which is what defines the HTTP port above. We reference local values because they let us configure values that are static, preset, and do not change at runtime. You can find the definition below:

locals {
  http_port    = 80
  any_port     = 0
  any_protocol = "-1"
  tcp_protocol = "tcp"
  all_ips      = ["0.0.0.0/0"]
}

Finally, in modules/networking/main.tf, we define the security group, link it to the ALB, define the ingress and egress rules as separate objects, and attach them to the pre-defined security group:

resource "aws_security_group" "alb" {
  name = var.alb_name
}
resource "aws_security_group_rule" "allow_http_inbound" {
  type              = "ingress"
  security_group_id = aws_security_group.alb.id
  from_port   = local.http_port
  to_port     = local.http_port
  protocol    = local.tcp_protocol
  cidr_blocks = local.all_ips
}
resource "aws_security_group_rule" "allow_all_outbound" {
  type              = "egress"
  security_group_id = aws_security_group.alb.id
  from_port   = local.any_port
  to_port     = local.any_port
  protocol    = local.any_protocol
  cidr_blocks = local.all_ips
}

Here we allow ingress to port 80 from anywhere, while allowing the ALB to reach the internet on any port and protocol.

In modules/networking/outputs.tf we return the DNS name, the HTTP listener ARN, and the security group ID:

output "alb_dns_name" {
  value       = aws_lb.example.dns_name
  description = "The domain name of the load balancer"
}
output "alb_http_listener_arn" {
  value       = aws_lb_listener.http.arn
  description = "The ARN of the HTTP listener"
}
output "alb_security_group_id" {
  value       = aws_security_group.alb.id
  description = "The ALB Security Group ID"
}

In modules/networking/variables.tf, the only variables we need are the name of the load balancer and the subnet IDs, which we will provide later:

variable "alb_name" {
  description = "The name to use for this ALB"
  type        = string
}
variable "subnet_ids" {
  description = "The subnet IDs to deploy to"
  type        = list(string)
}

Let's move on to defining the Auto Scaling group in the ASG module.

AutoScaling Group Module

In this module, we define the compute side of the load balancer: the AMI to use, the instance type, the Auto Scaling group configuration, CloudWatch alarms, and the security group for the launch configuration.

In modules/asg-rolling-update/main.tf, we define the launch configuration, which is a template for how each EC2 instance is created. We attach a security group, add a user data script to run on launch, and define a lifecycle condition:

resource "aws_launch_configuration" "example" {
  image_id        = var.ami
  instance_type   = var.instance_type
  security_groups = [aws_security_group.instance.id]
  user_data       = var.user_data
  # Required when using a launch configuration with an auto scaling group.
  lifecycle {
    create_before_destroy = true
    precondition {
      condition     = data.aws_ec2_instance_type.instance.free_tier_eligible
      error_message = "${var.instance_type} is not part of the AWS Free Tier!"
    }
  }
}

Notice that the lifecycle is set to create_before_destroy: when a change requires recreating instances, Terraform creates the new instances first before destroying the old ones. We also add a precondition block requiring that the instance type be free-tier eligible; you can remove this precondition block if you want to run instances outside the free tier.
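The aws_security_group.instance referenced in the launch configuration is defined elsewhere in the module. A minimal sketch of what it could look like, assuming the instances serve HTTP on a server_port variable (the variable and rule names here are illustrative, not taken from the repository):

resource "aws_security_group" "instance" {
  name = "${var.cluster_name}-instance"
}

resource "aws_security_group_rule" "allow_server_http_inbound" {
  type              = "ingress"
  security_group_id = aws_security_group.instance.id
  from_port         = var.server_port # hypothetical variable
  to_port           = var.server_port
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
}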

Next, we go on to define our Auto Scaling group:

resource "aws_autoscaling_group" "example" {
  name                 = var.cluster_name
  launch_configuration = aws_launch_configuration.example.name
  vpc_zone_identifier  = var.subnet_ids
  # Configure integrations with a load balancer
  target_group_arns    = var.target_group_arns
  health_check_type    = var.health_check_type
  min_size = var.min_size
  max_size = var.max_size

  # Use instance refresh to roll out changes to the ASG
  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }

  tag {
    key                 = "Name"
    value               = var.cluster_name
    propagate_at_launch = true
  }
  dynamic "tag" {
    for_each = {
      for key, value in var.custom_tags:
      key => upper(value)
      if key != "Name"
    }
    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }
  lifecycle {
    postcondition {
      condition     = length(self.availability_zones) > 1
      error_message = "You must use more than one AZ for high availability!"
    }
  }
}

We define the cluster name and link the Auto Scaling group to the predefined launch configuration. We set the health check type, the target group ARNs, and the minimum and maximum size of the Auto Scaling group. Don't worry about where most of these variables come from; we will supply them, partly via data sources, later in the app service. We also define a dynamic tag block, which generates the Auto Scaling group's tags from a map of key/value pairs; this is particularly useful if you have a long list of tags to apply. Finally, the postcondition ensures the Auto Scaling group spans more than one Availability Zone, to guarantee high availability.

Next, we define a pair of auto-scaling schedules:

resource "aws_autoscaling_schedule" "scale_out_during_business_hours" {
  count = var.enable_autoscaling ? 1 : 0
  scheduled_action_name  = "${var.cluster_name}-scale-out-during-business-hours"
  min_size               = 2
  max_size               = 10
  desired_capacity       = 10
  recurrence             = "0 9 * * *"
  autoscaling_group_name = aws_autoscaling_group.example.name
}
resource "aws_autoscaling_schedule" "scale_in_at_night" {
  count = var.enable_autoscaling ? 1 : 0
  scheduled_action_name  = "${var.cluster_name}-scale-in-at-night"
  min_size               = 2
  max_size               = 10
  desired_capacity       = 2
  recurrence             = "0 17 * * *"
  autoscaling_group_name = aws_autoscaling_group.example.name
}

I am assuming this load balancer service has its peak hours from 9 am to 5 pm, so we instruct the Auto Scaling group to scale out at 9 am and scale back in at 5 pm. Note that the recurrence expressions are evaluated in UTC unless you set the schedule's time_zone argument. Feel free to remove this block, or to edit the desired capacity for peak and off-peak hours.

Finally, we define our CloudWatch alarms:

resource "aws_cloudwatch_metric_alarm" "high_cpu_utilization" {
  alarm_name  = "${var.cluster_name}-high-cpu-utilization"
  namespace   = "AWS/EC2"
  metric_name = "CPUUtilization"
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.example.name
  }
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  period              = 300
  statistic           = "Average"
  threshold           = 90
  unit                = "Percent"
}
resource "aws_cloudwatch_metric_alarm" "low_cpu_credit_balance" {
  count = format("%.1s", var.instance_type) == "t" ? 1 : 0
  alarm_name  = "${var.cluster_name}-low-cpu-credit-balance"
  namespace   = "AWS/EC2"
  metric_name = "CPUCreditBalance"
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.example.name
  }
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 1
  period              = 300
  statistic           = "Minimum"
  threshold           = 10
  unit                = "Count"
}

The first alarm detects high CPU utilization on the Auto Scaling group's EC2 instances; the second detects a low CPU credit balance for burstable (t-series) instances running on the free tier. As written, these alarms only raise an alert; to have a triggered alarm actually perform a scale-out, so the load balancer keeps serving traffic without downtime, you would attach alarm actions such as a scaling policy.
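As a hedged sketch, here is one way you could wire the high-CPU alarm to a simple scaling policy; the policy name, adjustment, and cooldown below are illustrative, not part of the article's repository:

resource "aws_autoscaling_policy" "scale_out_on_high_cpu" {
  name                   = "${var.cluster_name}-scale-out-on-high-cpu"
  autoscaling_group_name = aws_autoscaling_group.example.name
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = 1   # add one instance each time the alarm fires
  cooldown               = 300 # seconds to wait between scaling actions
}

You would then add alarm_actions = [aws_autoscaling_policy.scale_out_on_high_cpu.arn] to the aws_cloudwatch_metric_alarm resource. Finally, we use the aws_ec2_instance_type data source to look up details of the chosen instance type from AWS; this is what powers the free-tier precondition in the launch configuration: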

data "aws_ec2_instance_type" "instance" {
  instance_type = var.instance_type
}

Next, we define the outputs of the ASG module in modules/asg-rolling-update/outputs.tf:

output "asg_name" {
  value       = aws_autoscaling_group.example.name
  description = "The name of the Auto Scaling Group"
}
output "instance_security_group_id" {
  value       = aws_security_group.instance.id
  description = "The ID of the EC2 Instance Security Group"
}

Here we output the name of the Auto Scaling group and the instance security group ID. Next, we define the key variables for the ASG module; you can head over to modules/asg-rolling-update/variables.tf to see them all. I will point out some of the important ones.

variable "instance_type" {
  description = "The type of EC2 Instances to run (e.g. t2.micro)"
  type        = string
  default = "t2.micro"
  validation {
    condition     = contains(["t2.micro", "t3.micro"], var.instance_type)
    error_message = "Only free tier is allowed: t2.micro | t3.micro."
  }
}

Here the instance type defaults to 't2.micro', and while you are allowed to set another instance type, we add a validation block on the variable to ensure that only free-tier instances are allowed: t2.micro or t3.micro.

Second, we define the minimum number of instances (and, in the same way, the maximum) with additional validation:

variable "min_size" {
  description = "The minimum number of EC2 Instances in the ASG"
  type        = number
  default = 1
  validation {
    condition     = var.min_size > 0
    error_message = "ASGs can't be empty or we'll have an outage!"
  }
  validation {
    condition     = var.min_size <= 10
    error_message = "ASGs must have 10 or fewer instances to keep costs down."
  }
}

Here we set the minimum size to a default of 1 and add the rules that it must be greater than zero and must not exceed 10 (feel free to edit this as your use case requires). We do the same for the maximum size; you can check that out in the modules/asg-rolling-update/variables.tf file, and a sketch of it follows below.
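Based on the min_size validation above, the max_size variable most likely looks something like this (reconstructed for illustration, not copied from the repository):

variable "max_size" {
  description = "The maximum number of EC2 Instances in the ASG"
  type        = number
  default     = 10
  validation {
    condition     = var.max_size <= 10
    error_message = "ASGs must have 10 or fewer instances to keep costs down."
  }
}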

Hello World Service

In this section of the article, we combine the networking module and the Auto Scaling group module into a single service. The goal is to ensure the modules are reusable and can be combined to form different services, making them composable.

To get started, we define the load balancer target group in services/app/main.tf. This resource lives in the service rather than in the networking module, which is particularly useful if you want different services to target different load balancers:

resource "aws_lb_target_group" "asg" {
  name     = "hello-world-${var.environment}"
  port     = var.server_port
  protocol = "HTTP"
  vpc_id   = data.aws_vpc.default.id
  health_check {
    path                = "/"
    protocol            = "HTTP"
    matcher             = "200"
    interval            = 15
    timeout             = 3
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
} 

We also define a load balancer listener rule and point it at the target group defined above:

resource "aws_lb_listener_rule" "asg" {
  listener_arn = module.alb.alb_http_listener_arn
  priority     = 100
  condition {
    path_pattern {
      values = ["*"]
    }
  }
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.asg.arn
  }
}

It is time to import the different modules and use them as services. First, we instantiate the Auto Scaling group module and set its variables:

module "asg" {
  source = "../../modules/asg-rolling-update"
  cluster_name  = "hello-world-${var.environment}"
  ami           = var.ami
  instance_type = var.instance_type
  user_data     = templatefile("${path.module}/user-data.sh", {
    server_port = var.server_port
    server_text = var.server_text
  })
  min_size           = var.min_size
  max_size           = var.max_size
  enable_autoscaling = var.enable_autoscaling
  subnet_ids        = data.aws_subnets.default.ids
  target_group_arns = [aws_lb_target_group.asg.arn]
  health_check_type = "ELB"

  custom_tags = var.custom_tags
}

Note that when you use a module, you can override its default variables with new values you pass in, or leave the defaults as they are; this is entirely up to you. In the ASG module block, we also render user-data.sh via templatefile:

#!/bin/bash
cat > index.html <<EOF
<h1>Hello World</h1>
<p>${server_text}</p>
EOF
nohup busybox httpd -f -p ${server_port} &

This file defines start-up instructions for each EC2 instance spun up by the Auto Scaling group: it writes a small HTML page and serves it with busybox httpd on the templated server_port.

Next, we instantiate the networking module and pass in values for its declared variables:

module "alb" {
  source = "../../modules/networking/"
  alb_name   = "hello-world-${var.environment}"
  subnet_ids = data.aws_subnets.default.ids
}

In services/app/variables.tf, we re-declare variables to pass through to the Auto Scaling group module and the networking module, so that we can customize their behavior for this particular hello-world service. A representative subset is shown below.
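A sketch of a few of these variables, with names taken from the module blocks above and defaults that are purely illustrative:

variable "environment" {
  description = "The name of the environment, used in resource names"
  type        = string
  default     = "stage" # illustrative default
}

variable "server_port" {
  description = "The port the instances use to serve HTTP requests"
  type        = number
  default     = 8080 # illustrative default
}

variable "server_text" {
  description = "The text the web server should return"
  type        = string
  default     = "Hello, World"
}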

Next, we focus on services/app/outputs.tf, where we print out the most important value of the entire deployment: the DNS name of the load balancer.

output "alb_dns_name" {
  value       = module.alb.alb_dns_name
  description = "The domain name of the load balancer"
}

That's it. Our setup is complete.

Deployment

To deploy and test our services, we head over to services/app. Ensure that you are in the app directory and that the necessary variables are properly filled in, so that you don't get an error.

We kick off the deployment by running terraform init, which initializes our modules and downloads the provider plugins needed for deployment.


Next, we run terraform plan to inspect and review the infrastructure that will be deployed on AWS.


Next, we run terraform apply --auto-approve and deploy the configured infrastructure.


And there we have it: our resilient, production-grade load balancer is deployed. Let's visit the load balancer's DNS name to check what is deployed.


Conclusion

In this article, we have successfully deployed a production-grade Elastic Load Balancer with many tools in its armory, like CloudWatch alarms and automated scale-out based on peak business hours. Thank you for reading this article; I hope you enjoyed it. You can take this deployment further by configuring an SSL certificate on the load balancer for secure traffic (a sketch follows below), and by adding a remote backend with S3 (I added code to set this up in the repository under /global/s3). Feel free to follow me on GitHub and kindly star the code repository for the article here.
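If you want to attempt the SSL upgrade, here is a hedged sketch of what an HTTPS listener could look like in the networking module, assuming you already have an ACM certificate and pass its ARN in via a hypothetical certificate_arn variable:

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.example.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-2016-08"
  certificate_arn   = var.certificate_arn # hypothetical variable
  # Same default action as the HTTP listener: a simple 404 page
  default_action {
    type = "fixed-response"
    fixed_response {
      content_type = "text/plain"
      message_body = "404: page not found"
      status_code  = 404
    }
  }
}

You would also need a security group rule allowing ingress on port 443, mirroring the port 80 rule defined earlier.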
