DEV Community

Amine AIT AAZIZI

Automated EKS Cost Optimization with AWS Config

Over the past few years, I have helped a number of organizations optimize their cloud costs on AWS, and in particular avoid additional EKS costs. I mainly used AWS Config, a service that assesses, audits, and evaluates the configurations of the resources in your AWS account.

How did I use this service for cost optimization? Consider a scenario where you are alerted whenever a specific kind of EKS cluster is deployed in the account. If an EKS cluster enters extended support, you pay 6x the control-plane cost of a regular EKS cluster running a version in standard support.
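To make the 6x figure concrete, here is a small arithmetic sketch. The hourly rates (0.10 USD per cluster-hour in standard support, 0.60 USD in extended support) and the 730-hour month are assumptions based on public EKS pricing, not values from this post, so check the current pricing page before relying on them.

```python
# Assumed per-cluster control-plane rates in USD per hour.
# These are not taken from this post: verify them on the Amazon EKS pricing page.
STANDARD_SUPPORT_RATE = 0.10
EXTENDED_SUPPORT_RATE = 0.60
HOURS_PER_MONTH = 730  # common approximation of one month


def monthly_control_plane_cost(hourly_rate: float, clusters: int = 1) -> float:
    """Return the approximate monthly control-plane cost for `clusters` clusters."""
    return round(hourly_rate * HOURS_PER_MONTH * clusters, 2)


def extended_support_premium(clusters: int = 1) -> float:
    """Return the extra monthly spend caused by running in extended support."""
    extended = monthly_control_plane_cost(EXTENDED_SUPPORT_RATE, clusters)
    standard = monthly_control_plane_cost(STANDARD_SUPPORT_RATE, clusters)
    return round(extended - standard, 2)
```

Calling `extended_support_premium(clusters=5)` gives the monthly overspend for five clusters left on an out-of-support version, which is the number the alerting rule is meant to keep at zero.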

This blog post demonstrates how to implement a custom AWS Config rule that optimizes cost by monitoring EKS clusters. The rule checks the Kubernetes version of each EKS cluster running in the account. When a cluster runs a deprecated EKS version, AWS Config flags it as non-compliant.

Overview of solution

[Figure: architecture-eks-config]

The AWS Config custom rule invokes an AWS Lambda function that detects whether an EKS cluster is running an unsupported version. The function is invoked every time a new EKS cluster is detected in the account.
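As a rough illustration of that invocation, the sketch below parses the event that AWS Config hands to a custom Lambda rule and builds the evaluation record to report back. It is not the code from the repository. The function names are hypothetical, and it assumes the standard change-triggered event shape, where `invokingEvent` is a JSON string containing a `configurationItem`.

```python
import json
from typing import Any, Dict


def parse_config_event(event: Dict[str, Any]) -> Dict[str, str]:
    """Extract the fields a custom rule needs from an AWS Config invocation."""
    # AWS Config delivers invokingEvent as a JSON-encoded string, not a dict.
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]
    return {
        "resource_type": item["resourceType"],
        "resource_id": item["resourceId"],
        "capture_time": item["configurationItemCaptureTime"],
        "result_token": event["resultToken"],
    }


def build_evaluation(parsed: Dict[str, str], compliance_type: str, annotation: str) -> Dict[str, str]:
    """Build one entry for the Evaluations list of the PutEvaluations call."""
    return {
        "ComplianceResourceType": parsed["resource_type"],
        "ComplianceResourceId": parsed["resource_id"],
        "ComplianceType": compliance_type,
        # Annotations are limited to 256 characters, so truncate defensively.
        "Annotation": annotation[:256],
        "OrderingTimestamp": parsed["capture_time"],
    }
```

Inside the Lambda handler, the result would typically be sent with `boto3.client("config").put_evaluations(Evaluations=[evaluation], ResultToken=parsed["result_token"])`. Keeping the parsing and the record building as plain functions makes them easy to unit test without AWS access.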

The Lambda function contains the logic that evaluates whether the EKS cluster is compliant or non-compliant. For this, we rely on endoflife.date, which provides an API to fetch deprecation data directly.
Nothing happens if the resource is evaluated as compliant. However, if the resource is evaluated as non-compliant, the Lambda function sends an alert through an Amazon Simple Notification Service (Amazon SNS) topic to the administration team, so that the account administrators can take corrective action.
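The evaluation logic described above could look roughly like the following. This is a minimal sketch, not the repository's implementation. The endpoint URL and the `cycle` and `eol` field names are assumptions about the endoflife.date feed for Amazon EKS, and all function names are hypothetical.

```python
import json
import urllib.request
from datetime import date
from typing import Any, Dict, List

# Assumed endpoint of the endoflife.date product feed for Amazon EKS.
EOL_API_URL = "https://endoflife.date/api/amazon-eks.json"


def fetch_release_cycles(url: str = EOL_API_URL) -> List[Dict[str, Any]]:
    """Download the release cycles (assumed: one dict per Kubernetes minor version)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def standard_support_ended(eol: Any, today: date) -> bool:
    """Interpret an `eol` value that may be missing, a boolean, or an ISO date string."""
    if eol is None or isinstance(eol, bool):
        return bool(eol)
    # Flag the version from its end-of-support date onward.
    return date.fromisoformat(eol) <= today


def evaluate_cluster_version(version: str, cycles: List[Dict[str, Any]], today: date) -> str:
    """Return the AWS Config compliance type for a cluster's Kubernetes version."""
    for cycle in cycles:
        if str(cycle.get("cycle")) == version:
            ended = standard_support_ended(cycle.get("eol"), today)
            return "NON_COMPLIANT" if ended else "COMPLIANT"
    # Design choice for this sketch: a version missing from the feed is flagged for review.
    return "NON_COMPLIANT"


def build_alert_message(cluster_name: str, version: str) -> str:
    """Compose the notification text sent to the administration team."""
    return (
        f"EKS cluster '{cluster_name}' runs Kubernetes {version}, "
        "which is no longer in standard support. Plan an upgrade to avoid "
        "extended support charges."
    )
```

In the Lambda function, a `NON_COMPLIANT` result would be followed by a call such as `boto3.client("sns").publish(TopicArn=..., Subject=..., Message=build_alert_message(name, version))`, while a `COMPLIANT` result only gets reported back to AWS Config.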

An alternative solution would have been to use the AWS managed Config rule to manage supported EKS clusters … But there are two issues with this solution:

1- This rule requires an update every 3 to 6 months to bump the oldestVersionSupported parameter, because EKS releases and deprecates versions regularly.
2- If you don’t specify the oldestVersionSupported parameter, AWS Config considers all versions compliant, including the ones on extended support, and this is exactly what we are trying to avoid.

Deployment

Prerequisites: You must have set up AWS credentials in your environment.

Clone repository

git clone https://github.com/aaitaazizi/custom-aws-config-rules-for-eks.git && cd custom-aws-config-rules-for-eks

Generate the CloudFormation template using rain

rain pkg aws-config-eks-version-rule/template.yaml --output aws-config-eks-version-rule/template.out.yaml 

Deploy

aws cloudformation deploy \
--template-file aws-config-eks-version-rule/template.out.yaml \
--stack-name aws-config-eks-version-rule-stack \
--parameter-overrides NotificationEmail=YOUR_EMAIL  \
--region eu-west-1 \
--capabilities CAPABILITY_IAM \
--no-cli-pager

Multi-account deployment

In an organization with multiple accounts, consider options such as CloudFormation StackSets or AWS Config conformance packs to adapt the current template to your needs. Either option deploys the AWS Config rule across all the accounts and enables cost-efficient EKS governance inside your AWS organization.
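As one possible shape for the StackSets option, the sketch below builds the arguments for the CloudFormation `create_stack_set` and `create_stack_instances` calls with service-managed permissions. The stack set name, OU IDs, and helper names are hypothetical, and it assumes the packaged template still takes the `NotificationEmail` parameter and needs `CAPABILITY_IAM`, as in the single-account deployment.

```python
from typing import Any, Dict, List


def stack_set_kwargs(name: str, template_body: str, notification_email: str) -> Dict[str, Any]:
    """Arguments for create_stack_set using AWS Organizations (service-managed) permissions."""
    return {
        "StackSetName": name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "NotificationEmail", "ParameterValue": notification_email}
        ],
        "Capabilities": ["CAPABILITY_IAM"],
        "PermissionModel": "SERVICE_MANAGED",
        # Automatically deploy to accounts that join the targeted OUs later.
        "AutoDeployment": {"Enabled": True, "RetainStacksOnAccountRemoval": False},
    }


def stack_instances_kwargs(name: str, ou_ids: List[str], regions: List[str]) -> Dict[str, Any]:
    """Arguments for create_stack_instances targeting organizational units."""
    return {
        "StackSetName": name,
        "DeploymentTargets": {"OrganizationalUnitIds": ou_ids},
        "Regions": regions,
    }


def deploy(name: str, template_path: str, email: str, ou_ids: List[str], regions: List[str]) -> None:
    """Create the stack set and roll it out (requires boto3 and organization admin rights)."""
    import boto3  # imported lazily so the helpers above can be tested without it

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()
    cfn = boto3.client("cloudformation")
    cfn.create_stack_set(**stack_set_kwargs(name, template_body, email))
    cfn.create_stack_instances(**stack_instances_kwargs(name, ou_ids, regions))
```

A call such as `deploy("eks-version-rule", "aws-config-eks-version-rule/template.out.yaml", "YOUR_EMAIL", ["ou-xxxx-xxxxxxxx"], ["eu-west-1"])` would roll the rule out to every account under the given OU. If the packaged template references region-specific artifacts, that part needs separate handling and is outside this sketch.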

An additional rule for non-critical workloads

One rule that you could add for non-critical workloads is to require EKS clusters to use the STANDARD upgrade policy instead of the EXTENDED upgrade policy, which is the default value.

Technically, you can create a new Config rule that considers any EKS cluster with the EXTENDED support policy as non-compliant. Then, you can add a remediation action that forces the cluster onto the STANDARD policy. This ensures that every cluster is upgraded automatically as soon as its version reaches the end of the standard support period.

It’s important to note that this rule should only be deployed to non-critical and/or non-production environments. An example solution that you can deploy directly in your account to assess the compliance of the EKS cluster upgrade policy (without the remediation for now) is available in this folder of the repository.
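The upgrade-policy check can be sketched as follows. This is an illustration under assumptions rather than the code in the repository. It expects the `cluster` dictionary returned by the EKS `describe_cluster` call, where the policy appears under `upgradePolicy.supportType`, and the helper names are hypothetical.

```python
from typing import Any, Dict


def evaluate_upgrade_policy(cluster: Dict[str, Any]) -> str:
    """Return the AWS Config compliance type for a cluster's upgrade policy."""
    # EXTENDED is the default, so a missing value is treated as EXTENDED.
    policy = cluster.get("upgradePolicy") or {}
    support_type = policy.get("supportType", "EXTENDED")
    return "COMPLIANT" if support_type == "STANDARD" else "NON_COMPLIANT"


def remediation_kwargs(cluster_name: str) -> Dict[str, Any]:
    """Arguments for update_cluster_config that switch a cluster to the STANDARD policy."""
    return {"name": cluster_name, "upgradePolicy": {"supportType": "STANDARD"}}
```

A remediation step could then call `boto3.client("eks").update_cluster_config(**remediation_kwargs(name))`. Keeping it as a separate, explicit function matches the advice above: assess first, and enable the forced switch only where automatic upgrades are acceptable.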

Use cases and benefits

Reduce EKS control-plane spend by avoiding Extended Support premium charges through early detection and timely upgrades.

Improve security posture by keeping clusters on supported versions that continue receiving fixes and security updates.

Standardize platform governance by applying one consistent policy to every EKS cluster in the organization. This strengthens ownership and accountability by making version drift visible to the right service owners via tags, reports, and dashboards such as the EKS dashboard.

Enable proactive lifecycle planning by surfacing clusters nearing end-of-support early enough to schedule maintenance windows.

What’s next?

Automating cost governance is one of the most effective ways to prevent EKS budget overruns before they happen. I suggest you give this solution a try and/or adapt it to your own business requirements, and let me know what you think. If you have any feedback or thoughts, please feel free to leave a comment.
