Tarlan Huseynov

Posted on Jul 4

One tool to rule them all - Terraform: EKS Golang Client & E2E AWS Lambda CI/CD via IaC

#aws #terraform #go #devops

Introduction
Prerequisites
Lambda RBAC permissions
Farewell

Introduction

Hey Folks! In this article, we delve into the seamless integration of Terraform for deploying compiled Lambda functions. Along the way, I will showcase how AWS Lambda can be used effectively as a Kubernetes client. And as the final layer of our sandwich: E2E Terraform CI/CD with GitHub Actions to facilitate continuous infrastructure provisioning and software delivery, all in one!

So, here is some backstory. Previously, I wondered how to facilitate a more reactive and event-driven interaction with my EKS cluster in a cloud-native environment. And naturally, when I think of an event-driven approach in the AWS context, Lambdas and EventBridge are among the first things that come to mind, and for a good reason 😁 right? The combination of these two can provide an endless number of solutions for problems where an action is needed based on some event or some specific schedule.
We will be looking at one of them in particular: "How to scale down EKS workloads on a scheduled basis".

We will be using my humble EKS-downscaler app. As a client, it provides similar functionality to an operator with the same name. This app is conceptual and serves as an example of how AWS Lambdas can interact with Kubernetes. With our solution, we will specify namespaces and cron expressions, and Terraform coupled with the downscaler Lambda will handle everything. This article will showcase Terraform from a perfect angle, as it will be our main tool to manage continuous configuration changes, provision and manage cloud resources, and deliver seamless code-to-Lambda deployment. This approach can be used as a standalone Terraform project or integrated into more complex IaC projects by passing inputs directly (e.g., "var.cluster_name", which is required as input, could be passed via remote state or directly from the root/child module where we create the EKS cluster).

Prerequisites

As you might have guessed, we have some prerequisites:

Terraform
AWS CLI
Go runtime
EKS cluster

You can provision and manage your EKS cluster in any way you prefer. It might be App of Apps, eksctl, or manual provisioning, as it is very individual. However, I will be showcasing the example Kubernetes resources as Terraform code.

So, as mentioned earlier, if your EKS cluster is managed via Terraform, this project can be part of it as a separate module, or a standalone project decoupled from the main codebase. In this article, it will be separate.

Lambda RBAC permissions

To enable our Lambda function to interact with the EKS cluster, we need to grant it specific permissions using Role-Based Access Control (RBAC). This involves creating a cluster role and a cluster role binding that ties that role to a Kubernetes group. I am creating those using my main EKS Terraform project. You can go ahead and do the same manually 🙃

Later, we can associate this group with our Lambda role by creating an access entry, either in the separate project where we will provision the Lambda itself, right in the main EKS project, or once again manually via the console. For now, let's just create the RBAC components:

resource "kubernetes_cluster_role" "lambda" {
  metadata {
    name = "lambda-clusterrole"
  }
  rule {
    api_groups = ["*"]
    resources  = ["deployments", "deployments/scale", "statefulsets", "daemonsets", "jobs"]
    verbs      = ["get", "list", "watch", "create", "update", "patch", "delete"]
  }
}

resource "kubernetes_cluster_role_binding" "lambda" {
  metadata {
    name = "lambda-clusterrolebinding"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = kubernetes_cluster_role.lambda.metadata[0].name
  }
  subject {
    kind      = "Group"
    name      = "lambda-group"
    api_group = "rbac.authorization.k8s.io"
  }
}
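
To get a feel for what this rule actually lets the Lambda do, here is a minimal Go sketch that models the rule and checks a few requests against it. The policyRule type and its allows helper are hypothetical and only approximate how Kubernetes evaluates RBAC (no resourceNames, no non-resource URLs). The sketch assumes that listing deployments needs the list verb on deployments, and that updating the scale subresource needs the update verb on deployments/scale.

```go
package main

import "fmt"

// policyRule mirrors the three fields of the cluster role rule above.
// It is a simplified model for illustration, not the Kubernetes API type.
type policyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// contains reports whether list holds want or the "*" wildcard.
func contains(list []string, want string) bool {
	for _, item := range list {
		if item == "*" || item == want {
			return true
		}
	}
	return false
}

// allows reports whether the rule permits verb on resource in apiGroup.
func (r policyRule) allows(apiGroup, resource, verb string) bool {
	return contains(r.APIGroups, apiGroup) &&
		contains(r.Resources, resource) &&
		contains(r.Verbs, verb)
}

// lambdaRule copies the values from the lambda-clusterrole rule.
func lambdaRule() policyRule {
	return policyRule{
		APIGroups: []string{"*"},
		Resources: []string{"deployments", "deployments/scale", "statefulsets", "daemonsets", "jobs"},
		Verbs:     []string{"get", "list", "watch", "create", "update", "patch", "delete"},
	}
}

func main() {
	rule := lambdaRule()
	// The two calls a downscaler needs: list deployments, update their scale.
	fmt.Println("list deployments:        ", rule.allows("apps", "deployments", "list"))
	fmt.Println("update deployments/scale:", rule.allows("apps", "deployments/scale", "update"))
	// Pods are not part of the rule, so this request is not covered.
	fmt.Println("list pods:               ", rule.allows("", "pods", "list"))
}
```

The rule is broader than a pure downscaler strictly needs (it also covers create and delete), so trimming the verbs is a reasonable hardening step if you only ever scale.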

EKS Lambda Client

Now, let's dive into the heart of our project: the EKS Lambda Client. You'll find all the relevant code to deploy our Lambda function using Terraform in this GitHub repo.

eks-downscaler
├── .github
│   └── workflows
│       └── terraform.yml
├── lambdas
│   └── downscaler
│       ├── go.mod
│       └── main.go
├── modules
│   └── downscaler
│       ├── iam.tf
│       ├── lambda.tf
│       ├── locals.tf
│       ├── scheduler.tf
│       └── variables.tf
├── backend.sh
├── backend.tf
├── main.tf
├── readme.md
├── terraform.tfvars
└── variables.tf

Our repository contains both the Golang Lambda code and the Terraform scripts that manage its deployment. We'll be exploring each component step-by-step, starting with the Terraform S3 backend bootstrapper script. As always this is where the magic begins, setting up the backend for our infrastructure.

backend.sh

#!/bin/bash
set -euo pipefail

PROJECT_NAME=$(basename "$(dirname "${PWD}")")
AWS_REGION=${AWS_REGION:-us-east-2}
AWS_PROFILE=${AWS_PROFILE:-default}
AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text --profile ${AWS_PROFILE})"
export AWS_PAGER=""

echo -e "Bootstrapping terraform backend...\n"
echo PROJECT_NAME: "${PROJECT_NAME}"
echo AWS_REGION: "${AWS_REGION}"
echo AWS_PROFILE: "${AWS_PROFILE}"
echo AWS_ACCOUNT_ID: "${AWS_ACCOUNT_ID}"
echo BUCKET NAME: "terraform-tfstate-${PROJECT_NAME}"
echo DYNAMODB TABLE NAME: terraform-locks
echo -e "\n"

aws s3api create-bucket \
    --region "${AWS_REGION}" \
    --create-bucket-configuration LocationConstraint="${AWS_REGION}" \
    --bucket "terraform-tfstate-${PROJECT_NAME}" \
    --profile "${AWS_PROFILE}"

aws dynamodb create-table \
    --region "${AWS_REGION}" \
    --table-name terraform-locks \
    --attribute-definitions AttributeName=LockID,AttributeType=S \
    --key-schema AttributeName=LockID,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
    --profile "${AWS_PROFILE}"

cat <<EOF > ./backend.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.57"
    }
  }

  required_version = ">=1.9.0"

  backend "s3" {
    bucket         = "terraform-tfstate-${PROJECT_NAME}"
    key            = "${PROJECT_NAME}"
    region         = "${AWS_REGION}"
    dynamodb_table = "terraform-locks"
  }
}

provider "aws" {}
EOF

echo -e "\nBackend configuration created successfully!\n"
cat ./backend.tf

This script bootstraps a Terraform backend for managing state files. It captures the project name, AWS region, profile, and account ID. Then, it creates an S3 bucket for state file storage and a DynamoDB table for state locking. Finally, it generates a backend.tf configuration file that points Terraform at the newly created S3 bucket and DynamoDB table, so the state is stored remotely and concurrent runs cannot overwrite each other. To initialize, you will first need to provide your AWS credentials, either via AWS_PROFILE or via AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

Main Configuration File

It would be redundant to go over every file in our repo, as the HCL and Go code is written in an idiomatic way, and all the nitty-gritty parts should be clear already (the IAM role and policies that let the Lambda describe our cluster, the EventBridge scheduler resource and its permissions, and all the other extras, bla bla bla...).

Our main.tf is the main spot where we define all the configuration for our client. This file references the downscaler module, specifying the EKS cluster name, Lambda source path, scaling schedules, and namespaces to manage. It's like the conductor of an orchestra, ensuring every component works in perfect harmony.

module "downscaler_lambda_client" {
  source             = "./modules/downscaler"
  eks_cluster_name   = var.cluster_name
  ci_env             = var.ci_env
  lambda_source      = "${path.root}/lambdas/downscaler"
  scale_out_schedule = "cron(00 09 ? * MON-FRI *)"
  scale_in_schedule  = "cron(00 18 ? * MON-FRI *)"
  eks_groups         = ["lambda-group"]
  namespaces         = ["development", "test"]
}
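
The two schedule strings use the six-field EventBridge cron format: minutes, hours, day of month, month, day of week, and year. As a small aid for reading them, here is a Go sketch that splits such an expression into named fields. The cronFields type and parseCron function are hypothetical helpers written for this article, not part of the repo, and they do not validate the field values.

```go
package main

import (
	"fmt"
	"strings"
)

// cronFields names the six fields of an EventBridge cron expression.
type cronFields struct {
	Minutes    string
	Hours      string
	DayOfMonth string
	Month      string
	DayOfWeek  string
	Year       string
}

// parseCron splits an expression such as "cron(00 18 ? * MON-FRI *)"
// into its six named fields. It does not validate the field values.
func parseCron(expr string) (cronFields, error) {
	if !strings.HasPrefix(expr, "cron(") || !strings.HasSuffix(expr, ")") {
		return cronFields{}, fmt.Errorf("not a cron expression: %q", expr)
	}
	body := strings.TrimSuffix(strings.TrimPrefix(expr, "cron("), ")")
	parts := strings.Fields(body)
	if len(parts) != 6 {
		return cronFields{}, fmt.Errorf("expected 6 fields, got %d", len(parts))
	}
	return cronFields{
		Minutes: parts[0], Hours: parts[1], DayOfMonth: parts[2],
		Month: parts[3], DayOfWeek: parts[4], Year: parts[5],
	}, nil
}

func main() {
	for _, expr := range []string{
		"cron(00 09 ? * MON-FRI *)", // scale_out_schedule from main.tf
		"cron(00 18 ? * MON-FRI *)", // scale_in_schedule from main.tf
	} {
		f, err := parseCron(expr)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> at %s:%s on %s\n", expr, f.Hours, f.Minutes, f.DayOfWeek)
	}
}
```

So workloads are scaled out at 09:00 and scaled in at 18:00, Monday to Friday. Assuming the module does not set a time zone on the schedule, those hours are evaluated in UTC, so adjust them to your working hours.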

Lambda Deployment Process

Now, let's talk about how we build, zip, and deploy our Lambda function in modules/downscaler/lambda.tf.

resource "null_resource" "lambda_build" {
  provisioner "local-exec" {
    working_dir = var.lambda_source
    command     = "go mod tidy && GOARCH=amd64 GOOS=linux go build -o bootstrap main.go"
  }

  triggers = {
    ci_env    = var.ci_env
    file_hash = md5(file("${var.lambda_source}/main.go"))
  }
}

data "archive_file" "lambda_zip" {
  depends_on  = [null_resource.lambda_build]
  type        = "zip"
  source_file = "${var.lambda_source}/bootstrap"
  output_path = "${var.lambda_source}/main.zip"
}

resource "aws_lambda_function" "downscaler_lambda" {
  filename         = data.archive_file.lambda_zip.output_path
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
  function_name    = var.project_name
  handler          = "main"
  runtime          = "provided.al2023"
  role             = aws_iam_role.lambda_role.arn

  environment {
    variables = { CLUSTER_NAME = var.eks_cluster_name }
  }
}

resource "aws_eks_access_entry" "lambda" {
  cluster_name      = var.eks_cluster_name
  principal_arn     = aws_iam_role.lambda_role.arn
  kubernetes_groups = var.eks_groups
  type              = "STANDARD"
}

Building the Lambda: We begin by checking for changes in our main.go file. Using a hash function, we detect any modifications and trigger a rebuild of the Lambda function. This ensures our deployment is always up-to-date with the latest code changes.
We also have a cheeky ci_env = var.ci_env trigger here, which identifies whether we are running the project locally or from CI. The motive is to rebuild our application every time we apply in the GitHub Actions CI context.
Zipping the Lambda: After building the Lambda, we zip the compiled executable. This zipped file becomes the core package for our Lambda function, ready for deployment.
Deploying the Lambda: With Terraform, we deploy the Lambda function using the zipped file. Terraform handles all the heavy lifting, ensuring the Lambda is correctly set up and configured to interact with our EKS cluster.
Assigning Permissions: We also need to attach the necessary permissions (the previously created RBAC resources) to the Lambda role by using Access Entries.
All AWS-side permissions (the IAM roles and the necessary policies) are defined in the iam.tf file.

This elegant process, managed by Terraform, ensures that our Lambda function is built, zipped, and deployed seamlessly.
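
The file_hash trigger boils down to comparing a stored content hash with a freshly computed one. Here is a minimal Go sketch of that idea, using the same hex-encoded MD5 digest that Terraform's md5() function produces. The contentHash and needsRebuild helpers are hypothetical names for illustration; Terraform does this comparison internally when it diffs the triggers map.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// contentHash returns the hex-encoded MD5 digest of src.
func contentHash(src []byte) string {
	sum := md5.Sum(src)
	return hex.EncodeToString(sum[:])
}

// needsRebuild reports whether the hash stored from the previous apply
// differs from the hash of the current source, i.e. the file changed.
func needsRebuild(storedHash string, src []byte) bool {
	return storedHash != contentHash(src)
}

func main() {
	// Two made-up versions of main.go, used only for the demo.
	v1 := []byte("package main\n\nfunc main() {}\n")
	v2 := []byte("package main\n\nfunc main() { println(\"hi\") }\n")

	stored := contentHash(v1) // what the previous apply recorded

	fmt.Println("unchanged source needs rebuild:", needsRebuild(stored, v1))
	fmt.Println("edited source needs rebuild:   ", needsRebuild(stored, v2))
}
```

Note that only main.go is hashed here, so a change that touches just go.mod would not trigger a rebuild on its own.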

Lambda client code

package main

import (
    "context"
    "encoding/json"
    "github.com/aws/aws-lambda-go/lambda"
    eksauth "github.com/chankh/eksutil/pkg/auth"
    log "github.com/sirupsen/logrus"
    autoscalingv1 "k8s.io/api/autoscaling/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "os"
)

type Payload struct {
    ClusterName string   `json:"clusterName"`
    Namespaces  []string `json:"namespaces"`
    Replicas    int32    `json:"replicas"`
}

func main() {
    if os.Getenv("ENV") == "DEBUG" {
        log.SetLevel(log.DebugLevel)
    }

    lambda.Start(handler)
}

func handler(ctx context.Context, payload Payload) (string, error) {
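    // Prefer the cluster name from the event payload; fall back to the
    // CLUSTER_NAME environment variable that Terraform sets on the function.
    clusterName := payload.ClusterName
    if clusterName == "" {
        clusterName = os.Getenv("CLUSTER_NAME")
    }

    cfg := &eksauth.ClusterConfig{
        ClusterName: clusterName,
    }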

    clientset, err := eksauth.NewAuthClient(cfg)
    if err != nil {
        log.WithError(err).Error("Failed to create EKS client")
        return "", err
    }

    scaled := make(map[string]int32)

    for _, ns := range payload.Namespaces {
        deployments, err := clientset.AppsV1().Deployments(ns).List(ctx, metav1.ListOptions{})
        if err != nil {
            log.WithError(err).Errorf("Failed to list deployments in namespace %s", ns)
            continue
        }

        for _, deploy := range deployments.Items {
            if err := scaleDeploy(clientset, ctx, ns, deploy.Name, payload.Replicas); err == nil {
                scaled[ns+"/"+deploy.Name] = payload.Replicas
            }
        }
    }

    scaledJSON, err := json.Marshal(scaled)
    if err != nil {
        log.WithError(err).Error("Failed to marshal scaled deployments to JSON")
        return "", err
    }

    log.Info("Scaled Deployments: ", string(scaledJSON))
    return "Scaled Deployments: " + string(scaledJSON), nil
}

func scaleDeploy(client *kubernetes.Clientset, ctx context.Context, namespace, name string, replicas int32) error {
    scale := &autoscalingv1.Scale{
        ObjectMeta: metav1.ObjectMeta{
            Name:      name,
            Namespace: namespace,
        },
        Spec: autoscalingv1.ScaleSpec{
            Replicas: replicas,
        },
    }

    _, err := client.AppsV1().Deployments(namespace).UpdateScale(ctx, name, scale, metav1.UpdateOptions{})
    if err != nil {
        log.WithError(err).Errorf("Failed to scale deployment %s in namespace %s", name, namespace)
    } else {
        log.Infof("Successfully scaled deployment %s in namespace %s to %d replicas", name, namespace, replicas)
    }
    return err
}

Our Lambda client code adds the last piece of logic behind scaling operations in the EKS cluster. It starts by defining a Payload structure, which includes the cluster name, namespaces, and desired replicas. The main function sets up the Lambda handler, which initiates the scaling process. The cluster name is also made available to the function as the CLUSTER_NAME environment variable, which is set by the Terraform resource.
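
To make the Payload shape concrete, here is a small Go sketch of the JSON event such a handler would receive and how it decodes. The struct is copied from the code above; the decodeEvent helper, the "my-cluster" name, and the choice of 0 replicas for a scale-in event are assumptions for illustration, since the actual input is defined in the scheduler resource of the module.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Payload is copied from the Lambda client code.
type Payload struct {
	ClusterName string   `json:"clusterName"`
	Namespaces  []string `json:"namespaces"`
	Replicas    int32    `json:"replicas"`
}

// decodeEvent turns a raw JSON event into a Payload, which is roughly
// what the Lambda runtime library does before calling the handler.
func decodeEvent(raw []byte) (Payload, error) {
	var p Payload
	err := json.Unmarshal(raw, &p)
	return p, err
}

func main() {
	// Hypothetical scale-in event; the values are placeholders.
	raw := []byte(`{"clusterName":"my-cluster","namespaces":["development","test"],"replicas":0}`)

	p, err := decodeEvent(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("cluster=%s namespaces=%v replicas=%d\n", p.ClusterName, p.Namespaces, p.Replicas)
}
```

The same JSON is handy as a test event in the Lambda console when you want to trigger a scale operation by hand.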

The handler creates an EKS client, lists deployments in the specified namespaces, and scales each deployment to the desired number of replicas. The scaled deployments are then logged and returned as a JSON response. This ensures our deployments are dynamically scaled based on the defined schedules, providing efficient resource management.
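
One detail worth noticing in the handler is its failure handling: a namespace that cannot be listed is skipped, and only deployments that actually scaled end up in the result. Here is a minimal Go sketch of that loop with the Kubernetes calls hidden behind a hypothetical scaler interface and an in-memory fakeCluster, so it runs without a cluster. Neither of those names exists in the repo.

```go
package main

import (
	"errors"
	"fmt"
)

// scaler is a hypothetical abstraction over the two Kubernetes calls
// the handler makes: list deployments and update a deployment's scale.
type scaler interface {
	ListDeployments(namespace string) ([]string, error)
	Scale(namespace, name string, replicas int32) error
}

// scaleNamespaces mirrors the handler's loop: namespaces that fail to
// list are skipped, and only successfully scaled deployments are recorded.
func scaleNamespaces(s scaler, namespaces []string, replicas int32) map[string]int32 {
	scaled := make(map[string]int32)
	for _, ns := range namespaces {
		names, err := s.ListDeployments(ns)
		if err != nil {
			continue // keep going with the remaining namespaces
		}
		for _, name := range names {
			if err := s.Scale(ns, name, replicas); err == nil {
				scaled[ns+"/"+name] = replicas
			}
		}
	}
	return scaled
}

// fakeCluster is an in-memory stand-in for a cluster, used only for the demo.
type fakeCluster struct {
	deployments map[string][]string // namespace -> deployment names
	broken      map[string]bool     // "namespace/name" keys whose Scale call fails
}

func (f fakeCluster) ListDeployments(ns string) ([]string, error) {
	names, ok := f.deployments[ns]
	if !ok {
		return nil, errors.New("namespace not found: " + ns)
	}
	return names, nil
}

func (f fakeCluster) Scale(ns, name string, replicas int32) error {
	if f.broken[ns+"/"+name] {
		return errors.New("scale failed: " + ns + "/" + name)
	}
	return nil
}

func main() {
	cluster := fakeCluster{
		deployments: map[string][]string{
			"development": {"api", "worker"},
			"test":        {"web"},
		},
		broken: map[string]bool{"development/worker": true},
	}
	result := scaleNamespaces(cluster, []string{"development", "test", "missing"}, 0)
	fmt.Println(result)
}
```

With this design a partial failure never aborts the whole run, but it also means the function reports success even if nothing was scaled, so the CloudWatch logs are where failures show up.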

Naturally, all logs flow directly to CloudWatch Logs, where we can observe all the details regarding invocations and their output.

Terraform CI/CD

It's continuous delivery and continuous provisioning time! 🚀🚀🚀
Our CI/CD workflow is defined in the terraform.yml file within the .github/workflows directory. This workflow ensures that our Terraform configurations are automatically applied whenever changes are pushed to the main branch.

name: Terraform CI/CD
run-name: "Terraform CI/CD | triggered by @${{ github.actor }}"

on:
  push:
    branches:
      - 'main'

jobs:
  terraform-apply:
    runs-on: ubuntu-latest

    env:
      TF_VAR_cluster_name: ${{ secrets.CLUSTER_NAME }}
      TF_VAR_ci_env: true
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_REGION: us-east-2

    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.9.0

      - name: Terraform Init
        run: terraform init

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Apply
        run: terraform apply -auto-approve

The CI/CD pipeline kicks off by checking out the code from the repository and setting up Terraform. It then initializes Terraform, validates the configuration, and applies the changes to deploy our infrastructure.

It is always better to have extra linting, testing, and scanning in Terraform CI, but that is another topic for another day, maybe 😉

Secrets and Environment Variables

Key secrets and environment variables are passed into the workflow to ensure secure and proper configuration. We use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for authenticating with AWS, and CLUSTER_NAME to specify the EKS cluster name. These secrets are securely stored in GitHub's repository settings.

As you may already know, the TF_VAR_ prefix is a handy way to define variable values via environment variables, especially in a CI environment. Since our tfvars files are gitignored and therefore absent in CI, the pipeline takes its values from these environment variables instead (keep in mind that a terraform.tfvars file, when present, takes precedence over TF_VAR_ values). For example, TF_VAR_ci_env is set to true in the CI environment, enforcing rebuilds (via the null_resource trigger that runs the local-exec provisioner) and ensuring changes are accurately reflected in the deployment.
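
The naming convention is simple: strip the TF_VAR_ prefix and what remains is the Terraform variable name. Here is a tiny Go sketch of that mapping, purely as an illustration of the convention. The tfVarsFromEnv helper is hypothetical and is not how Terraform itself is implemented, and "my-cluster" is a placeholder value.

```go
package main

import (
	"fmt"
	"strings"
)

const tfVarPrefix = "TF_VAR_"

// tfVarsFromEnv picks the TF_VAR_-prefixed entries out of a list of
// "KEY=value" strings and returns them keyed by Terraform variable name.
func tfVarsFromEnv(environ []string) map[string]string {
	vars := make(map[string]string)
	for _, entry := range environ {
		key, value, ok := strings.Cut(entry, "=")
		if !ok || !strings.HasPrefix(key, tfVarPrefix) {
			continue
		}
		vars[strings.TrimPrefix(key, tfVarPrefix)] = value
	}
	return vars
}

func main() {
	// The same kind of environment the workflow's env block sets up.
	env := []string{
		"TF_VAR_cluster_name=my-cluster", // placeholder value
		"TF_VAR_ci_env=true",
		"AWS_REGION=us-east-2", // no prefix, so not a Terraform variable
	}
	for name, value := range tfVarsFromEnv(env) {
		fmt.Printf("var.%s = %s\n", name, value)
	}
}
```

Values always arrive as strings, so Terraform converts them to the declared variable type (for instance, "true" becomes a bool for ci_env).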

Farewell 😊

In this article, we’ve explored a comprehensive method for deploying and managing AWS resources using Terraform, AWS Lambda, and EventBridge. We delved into the seamless integration of Terraform for deploying compiled Lambda functions, showcased how AWS Lambda can effectively interact with EKS, and implemented full-scale automation with GitHub Actions. These three pearls highlight the power of Terraform in creating a cohesive and dynamic deployment strategy.

Thank you for following along. I hope this guide has provided you with valuable insights and can serve as a reference for your future projects. Keep exploring, keep learning, and continue refining your cloud practices!

"One tool to rule them all, one tool to deploy them, One tool to automate 'em, and with terraform apply them; in the cloud where serverless hides"
