This post distills lessons learned from provisioning infrastructure and deploying a containerized NodeJS web service to AWS using Terraform and ECS (Elastic Container Service).
What is AWS?
AWS (Amazon Web Services) is a secure cloud services platform, offering compute power, database storage, content delivery, and other functionality to help businesses scale and grow.
Among the vast number of services provided by AWS, the one in focus today is AWS ECS.
What is AWS ECS?
Amazon Elastic Container Service (Amazon ECS) is a scalable, high-performance container orchestration service that supports Docker containers and allows you to easily run and scale containerized applications on AWS.
It is Amazon's way of letting us run and manage containers at scale. ECS eliminates the need to install and operate our own orchestration engine for running, monitoring, and managing clusters.
To store and access our Docker images at scale, Amazon also provides ECR (Elastic Container Registry), a fully-managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.
While we love the benefits that ECS brings via orchestration, monitoring, and so on, deploying to ECS by hand is a rather difficult, error-prone task that benefits from the repeatability and immutability that infrastructure as code provides. This is where Terraform shines.
What is Terraform?
Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.
It allows you to describe your infrastructure via configuration files. Once you have your files in place, the Terraform CLI allows you to spin up cloud resources from the command line.
As a bonus, it also helps us to avoid the AWS dashboard 😄
We will be making use of Terraform to initially provision the infrastructure for our service, and eventually use it to apply updates to our application & infrastructure as required.
Before we dive in, it is important to outline a high-level overview of how we intend to carry out our deployment:
- Provisioning Infrastructure on AWS
  - Create Security Groups
  - Create Application Load Balancer
  - Configure Load Balancer Listener
  - Configure Load Balancer Target Groups
  - Create ECR
  - Create ECR Lifecycle Policy
  - Create IAM Role for ECS Execution
  - Create Task Definitions
  - Create ECS Service
  - Create CloudWatch group for ECS logs
- Build and Tag Docker Image
- Push Tagged Docker Image to ECR
- Update Task Definition to point to the newly built Docker Image
Now that we have a high-level overview of what we are attempting to achieve, let's dive in.
Provisioning Infrastructure on AWS
We are going to provision the infrastructure required to run our application in the cloud successfully using Terraform's AWS Provider.
To give the Terraform AWS Provider access to our account, we need to define our AWS region and credentials.
provider "aws" {
region = "eu-west-2"
access_key = "my-access-key"
secret_key = "my-secret-key"
}
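Hardcoding credentials like this is fine for a quick experiment, but avoid committing them to version control. The provider can also pick up credentials from the shared AWS credentials file or from environment variables; here is a minimal sketch, assuming a profile named staging exists in ~/.aws/credentials:
provider "aws" {
  region  = "eu-west-2"
  profile = "staging" # assumed profile in ~/.aws/credentials
}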
Note: AWS creates a default VPC (Virtual Private Cloud) and a set of default subnets for each AWS account. We will be using these, so this post will not cover the creation of new VPCs, subnets, etc.
Terraform provides data sources that allow us to read information from our AWS account for use elsewhere in our Terraform configuration.
We retrieve information about our default VPC and its subnets below:
data "aws_vpc" "default" {
default = true
}
data "aws_subnet_ids" "default" {
vpc_id = "${data.aws_vpc.default.id}"
}
1. Create Security Groups
A security group acts as a virtual firewall for your instance to control inbound and outbound traffic.
We are going to set up security groups for the following:
- The Application Load Balancer, which will receive traffic from the internet
- The ECS tasks, which should receive traffic only from our Application Load Balancer
resource "aws_security_group" "lb" {
name = "lb-sg"
description = "controls access to the Application Load Balancer (ALB)"
ingress {
protocol = "tcp"
from_port = 80
to_port = 80
cidr_blocks = ["0.0.0.0/0"]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group" "ecs_tasks" {
name = "ecs-tasks-sg"
description = "allow inbound access from the ALB only"
ingress {
protocol = "tcp"
from_port = 4000
to_port = 4000
# no cidr_blocks here: traffic is allowed only from the load balancer's security group
security_groups = [aws_security_group.lb.id]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
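By referencing the load balancer's security group instead of a CIDR range in the tasks' ingress rule, we ensure the containers only accept traffic that has passed through the ALB, even though the ALB itself is open to the internet on port 80.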
2. Create Application Load Balancer
A load balancer serves as the single point of contact for clients. The load balancer distributes incoming application traffic across multiple targets, such as EC2 instances, in multiple Availability Zones. It consists of Listeners, Rules, Target Groups & Targets
A listener checks for connection requests from clients, using the protocol and port that you configure. Rules determine how the listener routes requests to its registered targets within specified target groups.
Let's create one for our application:
resource "aws_lb" "staging" {
name = "alb"
subnets = data.aws_subnet_ids.default.ids
load_balancer_type = "application"
security_groups = [aws_security_group.lb.id]
tags = {
Environment = "staging"
Application = "dummyapi"
}
}
resource "aws_lb_listener" "https_forward" {
load_balancer_arn = aws_lb.staging.arn
port = 80
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.staging.arn
}
}
resource "aws_lb_target_group" "staging" {
name = "dummyapi-alb-tg"
port = 80
protocol = "HTTP"
vpc_id = data.aws_vpc.default.id
target_type = "ip"
health_check {
healthy_threshold = "3"
interval = "90"
protocol = "HTTP"
matcher = "200-299"
timeout = "20"
path = "/"
unhealthy_threshold = "2"
}
}
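Note that because target_type is "ip", ECS registers each task's private IP together with the container port defined on the service (4000 in our case); the port = 80 on the target group only acts as a default, and health checks follow the registered traffic port.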
3. Create ECR
The Elastic Container Registry is responsible for storing our Docker images, which can then be pulled and deployed on ECS.
We are going to create the repository that will hold the Docker images we build from our application:
resource "aws_ecr_repository" "repo" {
name = "dummyapi/staging/runner"
}
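The repository_url attribute exposed by this resource takes the form <account-id>.dkr.ecr.<region>.amazonaws.com/dummyapi/staging/runner, and we will rely on it later when tagging and pushing images.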
4. Create ECR Lifecycle Policy
Amazon ECR lifecycle policies enable you to specify the lifecycle management of images in a repository.
resource "aws_ecr_lifecycle_policy" "repo-policy" {
repository = aws_ecr_repository.repo.name
policy = <<EOF
{
"rules": [
{
"rulePriority": 1,
"description": "Keep image deployed with tag latest",
"selection": {
"tagStatus": "tagged",
"tagPrefixList": ["latest"],
"countType": "imageCountMoreThan",
"countNumber": 1
},
"action": {
"type": "expire"
}
},
{
"rulePriority": 2,
"description": "Keep last 2 any images",
"selection": {
"tagStatus": "any",
"countType": "imageCountMoreThan",
"countNumber": 2
},
"action": {
"type": "expire"
}
}
]
}
EOF
}
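Although both rules use an expire action, the effect is what the descriptions say: rule 1 keeps only the most recent image tagged latest, and rule 2 keeps the two most recent images regardless of tag, expiring everything older.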
5. Create IAM Role for ECS Execution
An IAM Role is an entity that defines a set of permissions for making AWS service requests.
We will require one to execute our ECS tasks:
data "aws_iam_policy_document" "ecs_task_execution_role" {
version = "2012-10-17"
statement {
sid = ""
effect = "Allow"
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ecs-tasks.amazonaws.com"]
}
}
}
resource "aws_iam_role" "ecs_task_execution_role" {
name = "ecs-staging-execution-role"
assume_role_policy = data.aws_iam_policy_document.ecs_task_execution_role.json
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_role" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
6. Create Task Definitions
This is a blueprint that describes how a Docker container should launch.
A running instance based on a Task Definition is called a Task.
The container definition template below is saved as dummyapp.json.tpl (kept free of comments, since it must render to valid JSON):
[
{
"name": "dummyapi",
"image": "${aws_ecr_repository}:${tag}",
"essential": true,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-region": "eu-west-2",
"awslogs-stream-prefix": "dummyapi-staging-service",
"awslogs-group": "awslogs-dummyapi-staging"
}
},
"portMappings": [
{
"containerPort": 4000,
"hostPort": 4000,
"protocol": "tcp"
}
],
"cpu": 1,
"environment": [
{
"name": "NODE_ENV",
"value": "staging"
},
{
"name": "PORT",
"value": "4000"
}
],
"ulimits": [
{
"name": "nofile",
"softLimit": 65536,
"hardLimit": 65536
}
],
"mountPoints": [],
"memory": 2048,
"volumesFrom": []
}
]
data "template_file" "sproutlyapp" {
template = file("./dummyapp.json.tpl")
vars = {
aws_ecr_repository = aws_ecr_repository.repo.repository_url
tag = "latest"
app_port = 80
}
}
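Aside: the template_file data source comes from the separate (now deprecated) template provider. On Terraform 0.12 and later, the built-in templatefile() function does the same job; a sketch of the equivalent inline call:
container_definitions = templatefile("./dummyapp.json.tpl", {
  aws_ecr_repository = aws_ecr_repository.repo.repository_url
  tag                = "latest"
})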
resource "aws_ecs_task_definition" "service" {
family = "dummyapi-staging"
network_mode = "awsvpc"
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
cpu = 256
memory = 2048
requires_compatibilities = ["FARGATE"]
container_definitions = data.template_file.dummyapp.rendered
tags = {
Environment = "staging"
Application = "dummyapi"
}
}
7. Create ECS Service
An Amazon ECS service enables you to run and maintain a specified number of instances of a task definition simultaneously in an Amazon ECS cluster.
We will be combining several of the resources defined earlier to set up and run our service.
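One resource the service references that we have not yet defined is the ECS cluster itself (a gap one of the comments below also points out). For Fargate a minimal definition suffices; the cluster name here is our own choice:
resource "aws_ecs_cluster" "staging" {
  name = "dummyapi-staging-cluster"
}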
resource "aws_ecs_service" "staging" {
name = "staging"
cluster = aws_ecs_cluster.staging.id
task_definition = aws_ecs_task_definition.service.arn
desired_count = 1
launch_type = "FARGATE"
network_configuration {
security_groups = [aws_security_group.ecs_tasks.id]
subnets = data.aws_subnet_ids.default.ids
assign_public_ip = true
}
load_balancer {
target_group_arn = aws_lb_target_group.staging.arn
container_name = "dummyapi" # must match the container name in the task definition
container_port = 4000
}
depends_on = [aws_lb_listener.https_forward, aws_iam_role_policy_attachment.ecs_task_execution_role]
tags = {
Environment = "staging"
Application = "dummyapi"
}
}
8. Create CloudWatch group for ECS logs
You can configure the containers in your tasks to send log information to CloudWatch Logs.
We will provision this so that we can collect and view logs from our containers:
resource "aws_cloudwatch_log_group" "dummyapi" {
name = "awslogs-dummyapi-staging"
tags = {
Environment = "staging"
Application = "dummyapi"
}
}
With that set up, we need a way to build our application into a Docker image, tag it, and push it to the Elastic Container Registry, from which ECS will pull it and run it as a Task on our cluster (with the lifecycle policy keeping older images pruned).
We will make use of a Bash script to carry out these steps, and thanks to Terraform's local-exec provisioner, we will be able to run the script while provisioning our infrastructure.
var.source_path is the path to the directory where your application's Dockerfile (required to build the Docker image) resides.
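Note that var.source_path and var.tag are not defined anywhere above (a couple of the comments below flag this too). A minimal sketch of those definitions, with defaults that are our own assumptions:
variable "source_path" {
  description = "path to the directory containing the application's Dockerfile"
  default     = "." # assumption: the Dockerfile sits in the repo root
}

variable "tag" {
  description = "tag applied to the built Docker image"
  default     = "latest"
}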
// example -> ./push.sh . 123456789012.dkr.ecr.us-west-1.amazonaws.com/hello-world latest
resource "null_resource" "push" {
provisioner "local-exec" {
command = "${path.module}/push.sh ${var.source_path} ${aws_ecr_repository.repo.repository_url} ${var.tag}"
interpreter = ["bash", "-c"]
}
}
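As written, a null_resource runs its provisioner only once, when first created. If you want the build-and-push to re-run when the image tag changes, a triggers block can force the resource to be re-created — a sketch, reusing the variables above:
resource "null_resource" "push" {
  triggers = {
    tag = var.tag # re-create (and re-run the provisioner) whenever the tag changes
  }

  provisioner "local-exec" {
    command     = "${path.module}/push.sh ${var.source_path} ${aws_ecr_repository.repo.repository_url} ${var.tag}"
    interpreter = ["bash", "-c"]
  }
}
And here is the push.sh script itself: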
#!/bin/bash
#
# Builds a Docker image and pushes to an AWS ECR repository
# name of the file - push.sh
set -e
source_path="$1" # 1st argument from command line
repository_url="$2" # 2nd argument from command line
tag="${3:-latest}" # Checks if 3rd argument exists, if not, use "latest"
# splits string using '.' and picks 4th item
region="$(echo "$repository_url" | cut -d. -f4)"
# splits string using '/' and keeps everything after the registry host
# (-f2- rather than -f2, since the repository name "dummyapi/staging/runner" itself contains slashes)
image_name="$(echo "$repository_url" | cut -d/ -f2-)"
# builds docker image
(cd "$source_path" && docker build -t "$image_name" .)
# log in to ECR (AWS CLI v1; get-login was removed in CLI v2 — there, use:
#   aws ecr get-login-password --region "$region" | docker login --username AWS --password-stdin "$repository_url")
$(aws ecr get-login --no-include-email --region "$region")
# tag image
docker tag "$image_name" "$repository_url":"$tag"
# push image
docker push "$repository_url":"$tag"
We can proceed to run terraform plan, which will give us an overview of how our infrastructure is going to be provisioned before anything is actually created. This is an essential feature of Terraform, as it lets us validate our changes before execution.
Once terraform plan looks good, we can execute the plan by running terraform apply. Note that Terraform applies changes in place: if a resource already exists on AWS but differs from our configuration, Terraform either updates it or destroys and re-creates it as required.
Once terraform apply completes, all our resources will have been created, and the service should be accessible via the DNS name of our Application Load Balancer.
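To avoid hunting for that DNS name in the console, an output (an addition of our own) can print it at the end of terraform apply:
output "alb_dns_name" {
  value = aws_lb.staging.dns_name
}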
Thank you for reading. If you liked this post, please leave a ❤️ or a comment below.
Thanks to Wale & Habeeb for reading initial drafts of this post.
Top comments (7)
There are a few other issues with this example, and having to copy each piece into a file is a pain. So, I have shared a github project I made from this that (as of Jun 8th, 2021) fully works out of the box. (even added a simple docker image for testing).
github.com/KevinSeghetti/aws-terra...
Heya, following all the example terraform config, on first terraform plan I get the following errors: The tag one is simple, I can just remove the tag or add a tag variable, but what about the other issues?
Fixed my issues, was missing these variables:
And was also missing the ECS cluster definition:
Great article Pietro! I’ve been needing something like this.
Is the database manually created and passed to Terraform as a variable, or is it created through Terraform?
great article, still in 2023. thanks
does this have a github code?