Jatin Mehrotra for AWS Community Builders


EKS Auto Mode Unlocked for Existing Clusters with Terraform

In the previous blog, I explained that EKS Auto Mode is now supported by the terraform-aws-eks module and illustrated how we can create a new cluster with EKS Auto Mode.

In this blog, we’ll learn how to enable EKS Auto Mode on existing clusters and migrate workloads from EKS Managed Node Groups to EKS Auto nodes with ZERO DOWNTIME and continued application availability, using my Terraform code.

I have also added a BONUS section which explains how we can control whether our pods are deployed on EKS Auto Mode nodes or on other compute types.

Motivation

  • The Terraform AWS provider v5.81 added support for EKS Auto Mode (see the GitHub issue for the bug fix).

  • terraform-aws-eks released a new version, v20.31.1, which allows using custom NodeClasses/NodePools when EKS Auto Mode is enabled without the built-in NodePools.

I want this blog to be really short, crisp, and efficient, so let's jump into the actual steps!

Deploy an EKS cluster without EKS Auto Mode using Terraform

  • We want to recreate the use case of an existing cluster WITHOUT EKS Auto Mode, running an EKS Managed Node Group (MNG).

  • Use this repository's code to deploy an EKS cluster with a managed node group; a minimal sketch of the relevant module block follows the note below.

Note: I am attaching policies to the node IAM role for the EKS MNG - this is too permissive; it is better to use EKS Pod Identity (or IRSA, though EKS Pod Identity is preferred). Feel free to send a PR to the repo :)
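For orientation, here is a minimal sketch of what the module block looks like (assuming terraform-aws-eks v20.x and a separate VPC module; names and sizes here are illustrative, the exact values live in the repo):

# Sketch only: an EKS cluster with a managed node group, no Auto Mode yet
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.31"

  cluster_name    = "eks-existing-cluster-tf-test"
  cluster_version = "1.31"

  # assumes a VPC module elsewhere in the configuration
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  eks_managed_node_groups = {
    default = {
      instance_types = ["t3.medium"]
      min_size       = 1
      max_size       = 3
      desired_size   = 2
    }
  }
}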

Deploy the workload (pods)

  • We will automate this as well: using the kubectl provider's kubectl_manifest resource, we deploy the workload YAML with Terraform. A minimal sketch is shown below.
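The actual manifest lives in the repo; the names here are illustrative, and the environment: test label matches the PodDisruptionBudget used later:

# Sketch only: a test Deployment applied via the kubectl provider
resource "kubectl_manifest" "test_workload" {
  yaml_body = <<YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  labels:
    environment: test
spec:
  replicas: 2
  selector:
    matchLabels:
      environment: test
  template:
    metadata:
      labels:
        environment: test
    spec:
      containers:
        - name: app
          image: public.ecr.aws/nginx/nginx:latest
YAML
}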

Note: During cluster creation, the test workload (pods) was not deployed because the kubectl context was not set locally. Run the following command to set the kubectl context, then run terraform apply again once the cluster is created.

aws eks --region us-east-1 update-kubeconfig --name eks-existing-cluster-tf-test --profile <your-profile-name> ; terraform apply


Current state of the EKS cluster before EKS Auto Mode

  • Let's verify the current state of the EKS cluster while EKS Auto Mode is not enabled.

  • EKS Auto Mode is disabled.

Disabled Auto Mode

  • The EKS Managed Node Group created by me is running.

eks MNG

  • Pods are running on the EKS managed node group.

pods

pods_nodes_status
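The same state can be quickly confirmed from the terminal (assuming the kubectl context set earlier):

kubectl get nodes
kubectl get pods -o wide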

Enable EKS Auto Mode on Existing Cluster

  • Uncomment the following code in eks.tf and run terraform apply to enable EKS Auto Mode:
bootstrap_self_managed_addons = true

cluster_compute_config = {
   enabled = true
}
  • bootstrap_self_managed_addons = true is very important; otherwise you will face an error where Terraform tries to recreate the cluster. I literally cried over this.
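To double-check from the CLI that Auto Mode is now on (assumption: a recent AWS CLI version, since the computeConfig field is only returned by newer releases):

aws eks describe-cluster --name eks-existing-cluster-tf-test --region us-east-1 --profile <your-profile-name> --query 'cluster.computeConfig'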

Current state of the EKS cluster after enabling EKS Auto Mode

EKS Auto Mode enabled on the existing cluster

Empty built-in NodePools

  • As expected, the built-in NodePools are empty.
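You can check this from the CLI as well; right after enabling Auto Mode, no Auto Mode nodes have been launched yet (assumption: Auto Mode nodes carry the eks.amazonaws.com/compute-type=auto label used later in this post):

kubectl get nodepools
kubectl get nodes -l eks.amazonaws.com/compute-type=auto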

Migrate workloads (pods) from EKS MNG to EKS Auto nodes

  • There are a couple of ways to smoothly migrate existing workloads from the MNG to EKS Auto with minimal disruption while maintaining the application's availability throughout the migration.

Note: Copy the EKS MNG node group name; you will need it in the commands below.
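If you don't have the name handy, the AWS CLI can list it:

aws eks list-nodegroups --cluster-name eks-existing-cluster-tf-test --region us-east-1 --profile <your-profile-name>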

Using the eksctl tool

  • The following command cordons all nodes in the nodegroup and evicts all of its pods; EKS Auto Mode then provisions new nodes and schedules the pods onto them.
eksctl drain nodegroup --cluster=<clusterName> --name=<copiedNodegroupName>  --region us-east-1 --profile=<profile>

  • The eksctl command evicts pods one at a time (I have tested this), so application availability is maintained.

  • But if you still want to be 100% sure, you can follow the best practice of using a PodDisruptionBudget. We will automate this using Terraform, so run terraform apply:

resource "kubectl_manifest" "test_pdb" {
  yaml_body = <<YAML
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: test-pdb
  labels:
    environment: test
spec:
  minAvailable: 1
  selector:
    matchLabels:
      environment: test
YAML
}


Node during cordon

Pod migrated to EKS Auto node

pod events

  • After migrating, if we want to allow scheduling pods onto the EKS MNG again, we need to uncordon it; alternatively, you can delete the node group.
eksctl drain nodegroup --cluster=<clusterName> --name=<copiedNodegroupName>  --region us-east-1 --profile=<profile> --undo

uncordon nodes

Using kubectl

  • We can use the following command to drain the nodes with kubectl:
kubectl drain --ignore-daemonsets <node name>
  • Once it returns without an error, you can delete the node, or, if you want to tell Kubernetes that it can resume scheduling new pods onto the node, uncordon it:
kubectl uncordon <node name>

[ BONUS ] How to always schedule Pods on EKS Auto nodes?

  • There are two options to achieve this:
  1. Delete the node group and let EKS Auto Mode handle the scheduling on EKS Auto nodes (a sketch of the delete command follows this list)

  2. Use labels and nodeAffinity
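For option 1, the node group can be deleted with eksctl, using the same placeholders as the drain command above (though since the node group was created with Terraform here, removing it from the Terraform code and applying is the cleaner path):

eksctl delete nodegroup --cluster=<clusterName> --name=<copiedNodegroupName> --region us-east-1 --profile=<profile>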

Control whether a workload is deployed on EKS Auto Mode nodes

  • There is a concept called a mixed-mode cluster, where you're running both EKS Auto Mode and other compute types, such as self-managed Karpenter provisioners or EKS Managed Node Groups.

  • In mixed-mode clusters, by default the Deployment is scheduled onto EKS MNG nodes and not EKS Auto nodes.

  • In such cases, we can use labels and nodeAffinity.

Using a nodeSelector label

  • Use the label eks.amazonaws.com/compute-type: auto when you want a workload to be deployed to an EKS Auto node.
  • This nodeSelector value is only relevant if you are running a mixed-mode cluster, i.e., one that also includes node types not managed by EKS Auto Mode.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      nodeSelector:
        eks.amazonaws.com/compute-type: auto
  • I have added the above configuration in the sample_app_on_eks_auto_nodes.tf file. We are automating with Terraform, so uncomment it and run terraform apply.

workload on eks auto nodes

nodeSelector labels

Using nodeAffinity

  • You can add this nodeAffinity to Deployments or other workloads to require Kubernetes not to schedule them onto EKS Auto Mode nodes.
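My config is shown in the screenshot below; as a sketch, the shape follows the pattern in the AWS docs (the eks.amazonaws.com/compute-type key with a NotIn operator, placed under the pod template's spec):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: eks.amazonaws.com/compute-type
              operator: NotIn
              values:
                - auto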

Node Affinity config

workload not on auto node

node affinity

From a DevOps / IaC Perspective

  • We saw how we can enable EKS Auto Mode for existing clusters, with the built-in NodePools, using the terraform-aws-eks module.

  • We also saw how we can migrate our existing workload from the EKS Managed Node Group to EKS Auto nodes without any downtime, as EKS Auto nodes respect PodDisruptionBudgets.

  • We also saw how we can use nodeSelector labels and nodeAffinity to control where workloads are deployed in mixed-mode EKS clusters.

In my tests, EKS Auto Mode currently deploys EC2 instances of type c6a.large; this can be customized using a NodeClass and NodePool, which we will see in the next blog. Follow me on LinkedIn or on dev.to so that you get timely updates on what I share.

Feel free to reach out to me on LinkedIn or X if you face any errors migrating your existing workloads to EKS Auto Mode nodes using Terraform.


Top comments (8)

Deepak B

Thanks for the post Jatin, this is very helpful.

I have a quick query on how we can create a custom NodePool and NodeClass for an existing cluster, as we have specific requirements from a compute perspective.

Jatin Mehrotra • Edited

THANK YOU @deepak_b. I am happy that it helped you.

Here is the documentation

docs.aws.amazon.com/eks/latest/use...
docs.aws.amazon.com/eks/latest/use...

A NodePool requires a NodeClass to exist in the cluster.

Let me know if you have any other questions; I would be happy to help. I am also very curious to understand how you are using EKS Auto for your needs and how it is helping you. Are there any challenges you are facing, or features you wish they had included in EKS Auto? Please share your views.

Deepak B

Thanks Jatin.. I was wondering how we can add it to the Terraform code for an existing cluster, so that when we migrate our workload to Auto Mode we make sure it's using the custom NodeClass and NodePool.

Jatin Mehrotra

@deepak_b Thank you for your question

how we can add it to the Terraform code for an existing cluster, so that when we migrate our workload to Auto Mode we make sure it's using the custom NodeClass and NodePool

There is one consideration before you can add a custom NodeClass and NodePool:

  • NodeClass and NodePool are Karpenter concepts. This also means that when EKS Auto Mode is enabled on a cluster, EKS installs Karpenter support and the CRDs for NodeClass and NodePool.

  • This means you first need to enable EKS Auto Mode on your cluster, and then install the NodeClass and NodePool using Terraform.

  • A NodeClass and NodePool are essentially YAML manifests, which can be applied with Terraform using the kubectl_manifest resource, for example as sketched below.
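A minimal sketch (the name and requirement values are illustrative; it references EKS Auto Mode's built-in "default" NodeClass, following the pattern in the AWS docs):

# Sketch only: a custom NodePool applied via kubectl_manifest
resource "kubectl_manifest" "custom_nodepool" {
  yaml_body = <<YAML
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: custom-nodepool
spec:
  template:
    spec:
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: eks.amazonaws.com/instance-category
          operator: In
          values: ["c", "m", "r"]
YAML
}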

Note: In order to use this resource you need to add the kubectl provider and configure it so that it can connect to the EKS cluster.

provider "kubectl" {
  host                   = var.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(var.eks_cluster_ca)
  token                  = data.aws_eks_cluster_auth.main.token
  load_config_file       = false
}

Hope it helps.

Deepak B

Thanks Jatin. This helps :)

Jatin Mehrotra

I am also curious about what kind of use case you are using EKS Auto for and why you decided to move to it. Please share if you have time.

Hari Karthigasu

bootstrap_self_managed_addons = true saved my day :D, Thanks!

Jatin Mehrotra

@harik8

I lost a few hairs over this :D Happy that it helped. Share it with other EKS lovers or in your LinkedIn network if you think it can help others too.
