<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Taufique</title>
    <description>The latest articles on DEV Community by Taufique (@taufique_c757012ce6181590).</description>
    <link>https://dev.to/taufique_c757012ce6181590</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3716697%2Ff1d94c71-e972-4ac6-9558-98993983d1e1.jpg</url>
      <title>DEV Community: Taufique</title>
      <link>https://dev.to/taufique_c757012ce6181590</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/taufique_c757012ce6181590"/>
    <language>en</language>
    <item>
      <title>EKS Cluster Upgrade steps with Zero downtime</title>
      <dc:creator>Taufique</dc:creator>
      <pubDate>Tue, 20 Jan 2026 04:25:47 +0000</pubDate>
      <link>https://dev.to/taufique_c757012ce6181590/eks-cluster-upgrade-steps-with-zero-downtime-9cb</link>
      <guid>https://dev.to/taufique_c757012ce6181590/eks-cluster-upgrade-steps-with-zero-downtime-9cb</guid>
      <description>&lt;p&gt;EKS Cluster Upgrade steps with Zero downtime.&lt;/p&gt;

&lt;p&gt;Introduction: &lt;/p&gt;

&lt;p&gt;Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service offered by Amazon Web Services (AWS) that simplifies the process of deploying, managing, and scaling containerized applications using Kubernetes on AWS infrastructure. Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications.&lt;/p&gt;

&lt;p&gt;EKS abstracts away the complexities of Kubernetes cluster management, allowing users to focus on developing and running their applications. With EKS, AWS manages the control plane, which includes components like the API server, scheduler, and etcd, ensuring high availability and scalability of the Kubernetes control plane.&lt;/p&gt;

&lt;p&gt;One of the key concepts in EKS is node groups. Node groups are a collection of EC2 instances (virtual machines) that act as worker nodes in the Kubernetes cluster. These nodes run the containerized applications and execute tasks such as scheduling, networking, and storage. Node groups can be configured with various instance types, sizes, and configurations to meet specific workload requirements.&lt;/p&gt;

&lt;p&gt;Node groups in EKS are highly flexible and scalable, allowing users to add or remove nodes dynamically based on workload demands. This elasticity ensures optimal resource utilization and cost efficiency. Additionally, node groups can be spread across multiple Availability Zones for improved fault tolerance and high availability.&lt;/p&gt;

&lt;p&gt;Objective: &lt;/p&gt;

&lt;p&gt;As a microservice cluster, EKS undergoes periodic upgrades to maintain its stability, security, and performance. It's essential to keep the EKS cluster updated as AWS may discontinue support for older clusters over time. Regularly updating the EKS cluster ensures compatibility with AWS services, receives ongoing support, and incorporates new features and security patches provided by AWS.&lt;/p&gt;

&lt;p&gt;Overview:&lt;/p&gt;

&lt;p&gt;When upgrading an EKS cluster from one version to another (e.g., from 1.24 to 1.28), it's crucial to minimize downtime for applications. During the upgrade process you may encounter errors about node groups whose version is incompatible with the control plane. To avoid downtime, follow the steps below. &lt;/p&gt;

&lt;p&gt;Key components:&lt;/p&gt;

&lt;p&gt;Create Manifest for Intermediate Version: Generate a manifest for the intermediate version (e.g., 1.26) of the EKS cluster.&lt;/p&gt;

&lt;p&gt;Create Node Group with Intermediate Version: Deploy a new node group using the intermediate version (1.26). This ensures compatibility between the cluster and the new nodes.&lt;/p&gt;

&lt;p&gt;Update Deployment Affinity: Update the affinity settings in your application's deployment configuration to ensure a smooth rolling update to the new node group. Set the max surge to 100% to maintain application availability during the update.&lt;/p&gt;

&lt;p&gt;Delete Old Node Group: Once the new node group is successfully deployed and your application is running smoothly on it, delete the old node group with the previous version (1.24).&lt;/p&gt;

&lt;p&gt;Update EKS Cluster Version: In the EKS console, update the cluster version from the intermediate version (1.26) to the target version (1.28).&lt;/p&gt;

&lt;p&gt;Create Manifest for Target Version: Generate a new manifest for the target version (1.28) of the EKS cluster.&lt;/p&gt;

&lt;p&gt;Create Node Group with Target Version: Deploy a new node group using the target version (1.28) of the EKS cluster.&lt;/p&gt;

&lt;p&gt;Delete Intermediate Node Group: Once the new node group with the target version is operational and applications are running smoothly, delete the node group with the intermediate version (1.26).&lt;/p&gt;

&lt;p&gt;Update kubectl Version: Finally, update the kubectl version on your local machine to the latest compatible version with the upgraded EKS cluster.&lt;/p&gt;

&lt;p&gt;By following these steps, you can minimize downtime during the EKS cluster upgrade process and ensure the smooth transition of your applications to the new cluster version.&lt;/p&gt;
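&lt;p&gt;As a quick sketch of why the overview needs an intermediate version at all: EKS control-plane upgrades move one minor version at a time, so reaching 1.28 from 1.24 takes four sequential upgrades. The loop below (plain Bash, no AWS calls) prints the required order:&lt;/p&gt;

```shell
#!/bin/bash
# Print the sequential control-plane upgrades needed to go from 1.24 to 1.28.
# EKS only allows moving one minor version per upgrade.
current=24
target=28
while [ "$current" -lt "$target" ]; do
  next=$((current + 1))
  echo "upgrade control plane: 1.$current to 1.$next"
  current=$next
done
```

&lt;p&gt;Each printed step corresponds to one "Upgrade now" action in the EKS console.&lt;/p&gt;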

&lt;p&gt;Cluster Upgrade process in EKS console:&lt;/p&gt;

&lt;p&gt;To begin, note that the cluster is currently at version 1.24. For optimal performance and compatibility, aim to keep your cluster at the latest N-1 version (currently 1.28, with 1.29 being the latest), which provides access to new features while maintaining stability and support from AWS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpv712s0luwyiaz9r52q4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpv712s0luwyiaz9r52q4.png" alt=" " width="800" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Within the cluster there is one node group, currently aligned with the cluster's version 1.24, which also requires updating as the cluster moves forward to ensure seamless operation and compatibility with AWS services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp4v9ggcirohtppvitxri.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp4v9ggcirohtppvitxri.png" alt=" " width="800" height="84"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Select the "Upgrade Now" option to initiate the upgrade process from version 1.24 to version 1.25, ensuring the cluster remains up-to-date with the latest enhancements and features.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgx1juew3kg3f2i0vuw0g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgx1juew3kg3f2i0vuw0g.png" alt=" " width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The cluster has been successfully updated to the newer version, ensuring it remains current with the latest features, improvements, and security enhancements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb16q8493ma11jl64zhkd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb16q8493ma11jl64zhkd.png" alt=" " width="800" height="132"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Repeat the previous steps to upgrade the cluster from version 1.25 to version 1.26, ensuring continued alignment with the latest advancements and optimizations in Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx28v3xjuo5c36xxmc42.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx28v3xjuo5c36xxmc42.png" alt=" " width="800" height="124"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After upgrading the cluster to version 1.26, you may encounter an error when attempting to further upgrade to version 1.27 due to the need to update the node groups to match the cluster version. Refer to the provided image for details on the error message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobkgf7p07k8jhurhcy7u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobkgf7p07k8jhurhcy7u.png" alt=" " width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;
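&lt;p&gt;This error comes from the Kubernetes version-skew policy: for these releases, a kubelet may run at most two minor versions behind the API server, so EKS refuses a control-plane upgrade that would leave node groups too far behind. A small illustrative sketch of the check (plain Bash, version numbers are examples):&lt;/p&gt;

```shell
#!/bin/bash
# Illustrative version-skew check: a 1.24 node group would be three
# minor versions behind a 1.27 control plane, exceeding the supported skew.
control_plane=27   # minor version after the attempted upgrade
node_group=24      # minor version of the existing node group
skew=$((control_plane - node_group))
if [ "$skew" -gt 2 ]; then
  echo "blocked: node group 1.$node_group is $skew minors behind 1.$control_plane"
else
  echo "ok: skew of $skew minor versions is within the supported range"
fi
```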

&lt;p&gt;Creating a manifest file for version 1.26: &lt;/p&gt;

&lt;p&gt;To minimize downtime, we can create new node groups with version 1.26, avoiding the need to delete existing ones. Below is the manifest file for creating the new node groups:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: zerodowntime-cluster
  region: ap-south-2
  version: "1.26"
vpc:
  id: "vpc-0d13d6b46ae0a1e28"
  cidr: "172.31.0.0/16"
  subnets:
    public:
      Zero-Pub-2a:
        id: "subnet-075876fd0bcead8a2"
      Zero-Pub-2c:
        id: "subnet-0d97057ad00f8f0c9"
managedNodeGroups:
  - name: zerodowntime-clusterNG-new
    ami: "ami-0879d75b626dd9f41"
    amiFamily: AmazonLinux2
    overrideBootstrapCommand: |
      #!/bin/bash
      /etc/eks/bootstrap.sh zerodowntime-cluster --container-runtime containerd
    minSize: 1
    maxSize: 2
    desiredCapacity: 1
    instanceType: "t3.xlarge"
    volumeSize: 20
    volumeEncrypted: true
    privateNetworking: true
    subnets:
      - Zero-Pub-2a
      - Zero-Pub-2c
    labels: {role: zero-downtime-test}
    ssh:
      publicKeyName: zero-downgrade-cluster
    tags:
      nodegroup-role: Zero-downgrade-cluster-role
      nodegroup-name: zerodowntime-clusterNG-new
      Project: POC
      Env: Zero-app
      Layer: App-OD
      Managedby: Workmates
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
      withAddonPolicies:
        autoScaler: true
        externalDNS: true
        certManager: true
        ebs: true
        efs: true
        albIngress: true
        cloudWatch: true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Access the Jenkins server and switch to the Jenkins user. Create a new YAML manifest file, save it with a .yaml extension, and paste the provided manifest for creating the new node groups with version 1.26.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqdkk24vp4bugnenq1hi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqdkk24vp4bugnenq1hi.png" alt=" " width="800" height="105"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The following is the content of the manifest file saved in YAML format.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ttej5u0ldvy7sgik4wi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ttej5u0ldvy7sgik4wi.png" alt=" " width="513" height="606"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Verify the available node groups in the cluster using the following command.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl get nodegroup --cluster=zerodowntime-cluster --region=ap-south-2
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99hbk4imhpmcf2s4jgc1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99hbk4imhpmcf2s4jgc1.png" alt=" " width="800" height="60"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see in the above picture, there is only one node group.&lt;/p&gt;

&lt;p&gt;Creating the node groups from the 1.26 manifest file:&lt;/p&gt;

&lt;p&gt;Create the new node groups from the 1.26 manifest using the command below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl create nodegroup -f &amp;lt;manifest-file.yaml&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9huuv86n5nt62y3seaop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9huuv86n5nt62y3seaop.png" alt=" " width="800" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can see the second node group, on version 1.26, was created successfully.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tk1pcucm8pmng3cscy4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tk1pcucm8pmng3cscy4.png" alt=" " width="800" height="100"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, update the affinity in each deployment file and configure a rolling update with a max surge of 100%.&lt;/p&gt;

&lt;p&gt;Updating Affinity in Deployment:&lt;/p&gt;

&lt;p&gt;Once the node groups are created:&lt;/p&gt;

&lt;p&gt;Modify the deployment YAML files to adjust the affinity and the rollout strategy:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 100%
    maxUnavailable: 0%
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
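&lt;p&gt;With these settings, a rollout first creates a full replacement set of pods (100% surge) and only then removes the old ones, so ready capacity never drops. A quick sanity check of the arithmetic for a hypothetical 4-replica deployment:&lt;/p&gt;

```shell
#!/bin/bash
# What maxSurge=100% / maxUnavailable=0% means for a 4-replica deployment.
replicas=4
surge_pct=100
unavail_pct=0
max_pods=$((replicas + replicas * surge_pct / 100))    # old + new pods allowed at once
min_ready=$((replicas - replicas * unavail_pct / 100)) # pods that must stay ready
echo "peak pods during rollout: $max_pods"   # prints 8
echo "minimum ready pods: $min_ready"        # prints 4
```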

&lt;p&gt;Also, update the labels so the pods target the new node group (for example, via its role: zero-downtime-test label).&lt;/p&gt;
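&lt;p&gt;As a sketch of what the affinity change can look like, the deployment's pod template can require the new node group via the role label set in the manifest earlier (field names follow the standard Kubernetes nodeAffinity schema; this assumes the old node group does not carry the same label):&lt;/p&gt;

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: role
              operator: In
              values:
                - zero-downtime-test
```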

&lt;p&gt;Apply the same affinity changes to all applications:&lt;/p&gt;

&lt;p&gt;Repeat the above process for all deployments, ensuring consistent updates to the affinity and labels across all applications. This ensures that all applications utilize the new node groups effectively.&lt;/p&gt;

&lt;p&gt;Deleting the node groups with the older version: &lt;/p&gt;

&lt;p&gt;First, check the node groups in the cluster with the command below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl get nodegroup --cluster=zerodowntime-cluster --region=ap-south-2
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgq7jz5rzj4x1747j6shu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgq7jz5rzj4x1747j6shu.png" alt=" " width="800" height="65"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since there are now two node groups, we can go ahead and delete the old one with the command below. Note that --disable-eviction bypasses pod eviction (and any PodDisruptionBudgets) and deletes pods directly, so confirm workloads have already rolled over to the new node group first.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl delete nodegroup --cluster=zerodowntime-cluster --name=&amp;lt;old-nodegroup-name&amp;gt; --region=ap-south-2 --disable-eviction
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjduipf14j1j3rdt7qamj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjduipf14j1j3rdt7qamj.png" alt=" " width="800" height="109"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can see the old nodegroup is deleted.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faskio0aqae2ozi4r2kql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faskio0aqae2ozi4r2kql.png" alt=" " width="800" height="89"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upgrading the cluster from version 1.26 to 1.28 in the EKS console:&lt;/p&gt;

&lt;p&gt;Since the node group version now matches the cluster version, go ahead and update the cluster from 1.26 to 1.27.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7v35kcnyj69z6t0qlcy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7v35kcnyj69z6t0qlcy.png" alt=" " width="800" height="132"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88emsi5pb4qgkcxlc47l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88emsi5pb4qgkcxlc47l.png" alt=" " width="800" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The cluster has been successfully updated to the newer version, ensuring it remains current with the latest features, improvements, and security enhancements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fochrsdo07blzwn6uth5y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fochrsdo07blzwn6uth5y.png" alt=" " width="800" height="131"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Repeat the previous steps to upgrade the cluster from version 1.27 to version 1.28, ensuring continued alignment with the latest advancements and optimizations in Kubernetes.&lt;/p&gt;

&lt;p&gt;The cluster is now upgraded to version 1.28.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczabq4af6ntbesx3b4td.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczabq4af6ntbesx3b4td.png" alt=" " width="800" height="146"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Creating the manifest file for version 1.28:&lt;/p&gt;

&lt;p&gt;Now we need to prepare a manifest for version 1.28 to bring the node groups up to 1.28.&lt;/p&gt;

&lt;p&gt;First, retrieve the AMI ID that supports cluster version 1.28 using the command below, specifying your region.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ssm get-parameter --name /aws/service/eks/optimized-ami/1.28/amazon-linux-2/recommended/image_id --region ap-south-2 --query "Parameter.Value" --output text
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcsai4u69k0uwa9ru9bu9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcsai4u69k0uwa9ru9bu9.png" alt=" " width="800" height="35"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Add the AMI ID to the manifest and save the file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6ezcl6a1c16rekj3rn8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6ezcl6a1c16rekj3rn8.png" alt=" " width="800" height="91"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This manifest file targets version 1.28 of the EKS cluster node groups.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnohj3er8mbncxebbhgow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnohj3er8mbncxebbhgow.png" alt=" " width="527" height="603"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check the below manifest file and update the necessary fields.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: zerodowntime-cluster
  region: ap-south-2
  version: "1.28"
vpc:
  id: "vpc-0d13d6b46ae0a1e28"
  cidr: "172.31.0.0/16"
  subnets:
    public:
      Zero-Pub-2a:
        id: "subnet-075876fd0bcead8a2"
      Zero-Pub-2c:
        id: "subnet-0d97057ad00f8f0c9"
managedNodeGroups:
  - name: zerodowntime-clusterNG
    ami: "ami-06cd43657081b7a56"
    amiFamily: AmazonLinux2
    overrideBootstrapCommand: |
      #!/bin/bash
      /etc/eks/bootstrap.sh zerodowntime-cluster --container-runtime containerd
    minSize: 1
    maxSize: 2
    desiredCapacity: 1
    instanceType: "t3.xlarge"
    volumeSize: 20
    volumeEncrypted: true
    privateNetworking: true
    subnets:
      - Zero-Pub-2a
      - Zero-Pub-2c
    labels: {role: zero-downtime-test}
    ssh:
      publicKeyName: zero-downgrade-cluster
    tags:
      nodegroup-role: Zero-downgrade-cluster-role
      nodegroup-name: zerodowntime-clusterNG
      Project: POC
      Env: Zero-app
      Layer: App-OD
      Managedby: Workmates
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
      withAddonPolicies:
        autoScaler: true
        externalDNS: true
        certManager: true
        ebs: true
        efs: true
        albIngress: true
        cloudWatch: true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Creating the node groups from the 1.28 manifest file:&lt;/p&gt;

&lt;p&gt;First, check the node groups available in the cluster with the command below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl get nodegroup --cluster=zerodowntime-cluster --region=ap-south-2
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxexan3baly4b5cnmxh3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxexan3baly4b5cnmxh3.png" alt=" " width="800" height="63"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see in the above image, there is only one node group, carrying the -new suffix.&lt;/p&gt;

&lt;p&gt;Now go ahead and apply the manifest file to create the new node group on version 1.28.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl create nodegroup -f &amp;lt;manifest-file.yaml&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhhadjsfuvckj22njgxe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhhadjsfuvckj22njgxe.png" alt=" " width="800" height="199"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can see a new node group was created on version 1.28, matching the EKS cluster version.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F35ofkgzqla9nmf6y133f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F35ofkgzqla9nmf6y133f.png" alt=" " width="800" height="102"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Updating Affinity in Deployment:&lt;/p&gt;

&lt;p&gt;Once the node groups are created:&lt;/p&gt;

&lt;p&gt;Modify the deployment YAML files to adjust the affinity and the rollout strategy:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 100%
    maxUnavailable: 0%
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Also, update the labels so the pods target the new 1.28 node group.&lt;/p&gt;

&lt;p&gt;Apply the same affinity changes to all applications:&lt;/p&gt;

&lt;p&gt;Repeat the above process for all deployments, ensuring consistent updates to the affinity and labels across all applications. This ensures that all applications utilize the new node groups effectively.&lt;/p&gt;

&lt;p&gt;Deleting the node groups with the older version: &lt;/p&gt;

&lt;p&gt;Once the affinity is updated, delete the node group carrying the -new suffix (which is on version 1.26) using the command below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl delete nodegroup --cluster=zerodowntime-cluster --name=zerodowntime-clusterNG-new --region=ap-south-2 --disable-eviction
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As you can see in the image below, the node group with the -new suffix is in the Deleting state.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduphxc0nowyx9erwvah1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduphxc0nowyx9erwvah1.png" alt=" " width="800" height="103"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now the old node group is deleted, leaving a single node group on version 1.28.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fozuq0dv088t9zshp9pnf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fozuq0dv088t9zshp9pnf.png" alt=" " width="800" height="90"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Installing or updating kubectl:&lt;/p&gt;

&lt;p&gt;Determine which version of kubectl you are using with the command below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl version --client
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvfdq9ivmaiy0pv3ubm8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvfdq9ivmaiy0pv3ubm8.png" alt=" " width="800" height="57"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Refer to the AWS documentation link below for kubectl installation instructions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;kubectl can be installed or updated on macOS, Linux, and Windows; the steps below cover Linux.&lt;/p&gt;

&lt;p&gt;curl -O &lt;a href="https://s3.us-west-2.amazonaws.com/amazon-eks/1.28.8/2024-04-19/bin/linux/amd64/kubectl" rel="noopener noreferrer"&gt;https://s3.us-west-2.amazonaws.com/amazon-eks/1.28.8/2024-04-19/bin/linux/amd64/kubectl&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4yl37hnuuatyc6bctsav.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4yl37hnuuatyc6bctsav.png" alt=" " width="800" height="65"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apply execute permissions to the binary.&lt;/p&gt;

&lt;p&gt;chmod +x ./kubectl&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7t8r212mdiuyvdm1lcvy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7t8r212mdiuyvdm1lcvy.png" alt=" " width="800" height="72"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the binary to a folder in your PATH. If you already have a version of kubectl installed, we recommend copying this binary to $HOME/bin/kubectl and ensuring that $HOME/bin comes first in your $PATH.&lt;/p&gt;
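&lt;p&gt;The PATH-precedence point above can be checked with a small sketch; the copy step is commented out because it assumes the downloaded binary is present in the current directory.&lt;/p&gt;

```shell
# Create $HOME/bin and put it first on PATH so this kubectl wins lookup.
mkdir -p "$HOME/bin"
# cp ./kubectl "$HOME/bin/kubectl"   # run this after downloading the binary
export PATH="$HOME/bin:$PATH"
# Record the first PATH entry so it can be inspected:
printf '%s\n' "${PATH%%:*}" > path-first.txt
cat path-first.txt
```

&lt;p&gt;If the first entry printed is $HOME/bin, the freshly installed kubectl will shadow any older system copy.&lt;/p&gt;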

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg4913m7lp08vh5tzlzwj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg4913m7lp08vh5tzlzwj.png" alt=" " width="800" height="39"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Optional) Add the $HOME/bin path to your shell initialization file so that it is configured when you open a shell.&lt;/p&gt;

&lt;p&gt;echo 'export PATH=$HOME/bin:$PATH' &amp;gt;&amp;gt; ~/.bashrc&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsebhz0i33uymjm18zyi0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsebhz0i33uymjm18zyi0.png" alt=" " width="800" height="49"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Verify the installed kubectl version again with the command below.&lt;/p&gt;

&lt;p&gt;kubectl version --client&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0g303s0h76vu84y93q5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0g303s0h76vu84y93q5n.png" alt=" " width="800" height="114"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>kubernetes</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Alerting Setup in EKS using Prometheus</title>
      <dc:creator>Taufique</dc:creator>
      <pubDate>Sat, 17 Jan 2026 15:54:43 +0000</pubDate>
      <link>https://dev.to/taufique_c757012ce6181590/alerting-setup-in-eks-using-prometheus-159p</link>
      <guid>https://dev.to/taufique_c757012ce6181590/alerting-setup-in-eks-using-prometheus-159p</guid>
      <description>&lt;p&gt;&lt;strong&gt;&lt;u&gt;Introduction&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Overview:&lt;/p&gt;

&lt;p&gt;This SOP provides detailed instructions for configuring alerting in an Amazon EKS cluster using Prometheus. Prometheus is an open-source monitoring and alerting toolkit widely used in Kubernetes environments for real-time monitoring and proactive alerting based on metrics. The integration ensures timely notifications about potential issues, enabling swift action to maintain system health and reliability.&lt;/p&gt;

&lt;p&gt;Prometheus:&lt;/p&gt;

&lt;p&gt;Prometheus is a powerful open-source monitoring system designed for collecting, storing, and querying time-series data. It is highly scalable and well-suited for cloud-native environments, especially Kubernetes. Prometheus collects metrics from configured targets, evaluates defined rules, and enables queries using its PromQL language. With its robust integration capabilities, Prometheus is a cornerstone of modern observability stacks, offering insights into application and infrastructure performance. &lt;/p&gt;

&lt;p&gt;Alerts for Pods:&lt;/p&gt;

&lt;p&gt;Alerting for pods involves monitoring the health, resource usage, and performance of Kubernetes pods and triggering alerts when specific thresholds or conditions are breached. For example, alerts can be set for high CPU or memory usage, pod restarts, or readiness and liveness probe failures. This ensures teams are promptly notified of potential issues, enabling them to take corrective actions to maintain application availability and reliability. &lt;/p&gt;

&lt;p&gt;Objective:&lt;/p&gt;

&lt;p&gt;The objective of this SOP is to:&lt;/p&gt;

&lt;p&gt;Set up Prometheus in an EKS cluster for monitoring.&lt;br&gt;
Configure alerting rules for key performance metrics and resource utilization.&lt;br&gt;
Integrate Prometheus with Alertmanager to route alerts to notification channels like email.&lt;/p&gt;

&lt;p&gt;Key Components:&lt;/p&gt;

&lt;p&gt;Amazon EKS: The managed Kubernetes service that hosts your applications.&lt;br&gt;
Prometheus: The monitoring and alerting toolkit for collecting metrics.&lt;br&gt;
Alertmanager: A component of Prometheus responsible for managing alerts and routing them to configured endpoints.&lt;br&gt;
Kubernetes Metrics Server: A lightweight service for gathering resource metrics like CPU and memory.&lt;/p&gt;

&lt;p&gt;Prerequisites:&lt;/p&gt;

&lt;p&gt;EKS Cluster: A fully operational EKS cluster with kubectl configured for access.&lt;br&gt;
Prometheus Operator: Deployed in the EKS cluster for managing Prometheus configurations.&lt;br&gt;
Alertmanager: Installed alongside Prometheus in the cluster.&lt;br&gt;
IAM Permissions: Sufficient AWS IAM permissions to manage resources in the EKS cluster.&lt;/p&gt;

&lt;p&gt;Procedure&lt;br&gt;
Initial Setup:&lt;/p&gt;

&lt;p&gt;Login to the server:&lt;br&gt;
ssh -i "" &lt;a class="mentioned-user" href="https://dev.to/ip"&gt;@ip&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg6wgkobi35vj5zbp98pz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg6wgkobi35vj5zbp98pz.png" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Login to Jenkins User:&lt;/strong&gt;&lt;br&gt;
sudo su - jenkins&lt;/p&gt;

&lt;p&gt;cd &lt;/p&gt;

&lt;p&gt;mkdir &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31k9q338mtkuhv0mnja8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31k9q338mtkuhv0mnja8.png" alt=" " width="769" height="110"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;cd &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F01pkc5a0whkrtddah94k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F01pkc5a0whkrtddah94k.png" alt=" " width="800" height="89"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation of Prometheus Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check for the Prometheus repo; if it is not present, add it through Helm, the Kubernetes package manager:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;helm repo ls&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2veul93v6q6r5269qu9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2veul93v6q6r5269qu9.png" alt=" " width="800" height="110"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The repo is not added, so add the Prometheus repo using the commands below:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;helm repo add prometheus-community &lt;a href="https://prometheus-community.github.io/helm-charts" rel="noopener noreferrer"&gt;https://prometheus-community.github.io/helm-charts&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;helm repo update&lt;/p&gt;

&lt;p&gt;helm pull prometheus-community/kube-prometheus-stack&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqfyxo9fz58y12cgsm9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqfyxo9fz58y12cgsm9n.png" alt=" " width="800" height="136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Extract the tar file&lt;/p&gt;

&lt;p&gt;tar -xvf &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uqlz87lz5cd1cx1oy5y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uqlz87lz5cd1cx1oy5y.png" alt=" " width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adding Affinity in the Values file:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Find the values.yaml file inside the prom stack.&lt;br&gt;
cd into the extracted directory and check the files.&lt;br&gt;
Take a backup of values.yaml. &lt;/p&gt;
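&lt;p&gt;The backup step can be as simple as copying the file before editing; here touch stands in for the real chart file so the sketch is self-contained.&lt;/p&gt;

```shell
# Sketch: back up values.yaml before editing it.
# touch stands in for the real chart file; in the extracted chart directory
# the file already exists, so skip the touch there.
touch values.yaml
cp values.yaml values.yaml.bak
ls -l values.yaml.bak
```

&lt;p&gt;If a later helm upgrade misbehaves, the .bak copy gives you a known-good values file to diff against.&lt;/p&gt;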

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvsyqfog39nipmx3facu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvsyqfog39nipmx3facu.png" alt=" " width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the content inside the values file and open it in an IDE for modification.&lt;/p&gt;

&lt;p&gt;Before making the modification, find the role label of the node group as shown below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e71w80j8922sqtev5bx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e71w80j8922sqtev5bx.png" alt=" " width="800" height="98"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1bt3xctivc5z5dlewuz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1bt3xctivc5z5dlewuz.png" alt=" " width="800" height="355"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Use the below syntax for Affinity:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: role
              operator: In
              values:
                - Production-Magnifi-Monitoring-NG
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Before modification of Affinity:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4de742iif732gvrwuhxc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4de742iif732gvrwuhxc.png" alt=" " width="709" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After Modification:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnk9egu8o8m8ovdlwku8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnk9egu8o8m8ovdlwku8.png" alt=" " width="747" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After modifying the entire YAML file, replace the old content with the newly modified content.&lt;/p&gt;

&lt;p&gt;Go to the charts folder, find the grafana folder, and then find the values file inside the grafana folder.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F728i76jiarun22xoqrje.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F728i76jiarun22xoqrje.png" alt=" " width="800" height="142"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow the above steps to modify the affinity in the grafana values file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deploy the prom stack using Helm Package Manager:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;helm install prom-stack . -f values.yaml -n monitoring --create-namespace&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnzmgiw8xsd622smih90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnzmgiw8xsd622smih90.png" alt=" " width="800" height="133"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check that all pods are in the Running state:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;k get all -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5mq76xd2acfzrizboww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5mq76xd2acfzrizboww.png" alt=" " width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Setting up the Alerts:&lt;/p&gt;

&lt;p&gt;Delete all the default Prometheus rules except the following two:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;prom-stack-kube-prometheus-kubernetes-apps&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;prom-stack-kube-prometheus-k8s.rules.container-cpu-usage-second&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Find the prometheusrule.&lt;/p&gt;

&lt;p&gt;k get prometheusrules -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs5qhmvbnkisa6f82ua72.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs5qhmvbnkisa6f82ua72.png" alt=" " width="800" height="576"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Delete the default rules&lt;/p&gt;

&lt;p&gt;k delete prometheusrules  -n &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmgs4td6dnttc4bjibzc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmgs4td6dnttc4bjibzc.png" alt=" " width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check that only the rules we kept remain:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;prom-stack-kube-prometheus-k8s.rules.container-cpu-usage-second&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;prom-stack-kube-prometheus-kubernetes-apps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;k get prometheusrules -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq7da5p37dsjjxm87ltlg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq7da5p37dsjjxm87ltlg.png" alt=" " width="800" height="78"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Edit the above rule.&lt;/p&gt;

&lt;p&gt;k edit prometheusrules prom-stack-kube-prometheus-kubernetes-apps -n monitoring&lt;/p&gt;

&lt;p&gt;Delete all the entries under the rules: section of the spec.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fih69oa1m869w2m2cb96e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fih69oa1m869w2m2cb96e.png" alt=" " width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add the new rules as given below:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2er6cd2qwodmhhs894j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2er6cd2qwodmhhs894j.png" alt=" " width="660" height="618"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference Rule:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rules:
  - alert: MagnifiProductionPodRestart
    annotations:
      description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has restarted {{ $value }} times.
    expr: kube_pod_container_status_restarts_total{namespace="prod"} &amp;gt; 0
    for: 1m
    labels:
      severity: production-critical
  - alert: MagnifiProductionPodPending
    annotations:
      description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been pending ({{ $value }}).
    expr: kube_pod_status_phase{namespace="prod", phase="Pending"} == 1
    for: 1m
    labels:
      severity: production-critical
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Add the SMTP credentials to the Prometheus secrets by following the steps below:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create an SMTP user.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk2tzn2wp3maqbzktz341.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk2tzn2wp3maqbzktz341.png" alt=" " width="800" height="76"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sik663fglzbp5yo7ajm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sik663fglzbp5yo7ajm.png" alt=" " width="800" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create Access Key and Secret Key:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkrtpq2g3isgpzu4zbvjc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkrtpq2g3isgpzu4zbvjc.png" alt=" " width="562" height="67"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Use the below format and add the credentials.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;global:
  resolve_timeout: 5m
  smtp_from: 
  smtp_smarthost: email-smtp.ap-south-1.amazonaws.com:587
  smtp_auth_username: 
  smtp_auth_password: 
  smtp_require_tls: true
route:
  receiver: support
  group_by:
    - job
    - monitor_type
    - severity
    - alertname
    - namespace
  routes:
    - receiver: support
      match:
        alertname: &amp;lt;Alert_Name_1&amp;gt;
    - receiver: support
      match:
        alertname: &amp;lt;Alert_Name_2&amp;gt;
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
receivers:
  - name: support
    email_configs:
      - send_resolved: true
        to: 
      - send_resolved: true
        to: 
templates:
  - '/etc/alertmanager/config/*.tmpl'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After making the modifications, Base64-encode the configuration above.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://www.base64encode.org/" rel="noopener noreferrer"&gt;https://www.base64encode.org/&lt;/a&gt;&lt;/p&gt;
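&lt;p&gt;Instead of a website, the encoding can also be done locally with the base64 utility; -w0 is the GNU coreutils form that disables line wrapping (macOS base64 does not wrap by default). A stand-in config file is created here for illustration.&lt;/p&gt;

```shell
# Encode the edited Alertmanager config and round-trip it to verify.
# The printf creates a minimal stand-in file for this sketch.
printf 'global:\n  resolve_timeout: 5m\n' > alertmanager.yaml
base64 -w0 alertmanager.yaml > alertmanager.yaml.b64
# Decode and compare to confirm the encoding is faithful:
base64 -d alertmanager.yaml.b64 > roundtrip.yaml
diff alertmanager.yaml roundtrip.yaml && echo "encoding verified"
```

&lt;p&gt;Doing this locally also avoids pasting SMTP credentials into a third-party website.&lt;/p&gt;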

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtws8ox2np11h32eanef.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtws8ox2np11h32eanef.png" alt=" " width="800" height="231"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update the Alertmanager secret with the change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;List the secrets in the monitoring namespace:&lt;/p&gt;

&lt;p&gt;k get secrets -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8s8ixga3eslpv27g2hy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8s8ixga3eslpv27g2hy.png" alt=" " width="800" height="181"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Edit the Alert Manager secret &lt;/p&gt;

&lt;p&gt;k edit secrets alertmanager-prom-stack-kube-prometheus-alertmanager -n monitoring&lt;/p&gt;
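&lt;p&gt;As an alternative to editing the secret in place, the patch can be assembled locally and reviewed first. The secret name and the alertmanager.yaml key follow the article; the config content here is a minimal stand-in.&lt;/p&gt;

```shell
# Build a merge patch that replaces the alertmanager.yaml key of the secret.
# The config below is a stand-in; use your real, edited alertmanager.yaml.
printf 'global:\n  resolve_timeout: 5m\n' > alertmanager.yaml
B64="$(base64 -w0 alertmanager.yaml)"
cat > secret-patch.json <<EOF
{"data":{"alertmanager.yaml":"${B64}"}}
EOF
grep -q "alertmanager.yaml" secret-patch.json && echo "patch ready"
# Apply with (requires cluster access):
# kubectl patch secret alertmanager-prom-stack-kube-prometheus-alertmanager \
#   -n monitoring --type merge --patch-file secret-patch.json
```

&lt;p&gt;This keeps the exact bytes you are about to store in version-controllable files rather than inside a live kubectl edit session.&lt;/p&gt;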

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faq73i3em3etyk3ml90vb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faq73i3em3etyk3ml90vb.png" alt=" " width="800" height="65"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Remove the old Base64 value of the alertmanager.yaml key.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fisq0eg9tfdl1vdn600ip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fisq0eg9tfdl1vdn600ip.png" alt=" " width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Add the encoded secrets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh6k28057k85v90stgx5c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh6k28057k85v90stgx5c.png" alt=" " width="800" height="82"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alerts Testing:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pod Restart Testing:&lt;/p&gt;

&lt;p&gt;vi restart-pod.yaml&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: always-restart-pod
spec:
  restartPolicy: Always
  containers:
    - name: my-container
      image: nginx:latest
      command: ["/bin/sh", "-c", "exit 1"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39ltont7evcsp9me6kd4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39ltont7evcsp9me6kd4.png" alt=" " width="800" height="52"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1huhfwxs5c0jrtuuiqgt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1huhfwxs5c0jrtuuiqgt.png" alt=" " width="800" height="138"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Got Alert for Pod Restart Firing and Resolved:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmeo5u53nqyeyjs0fj6yr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmeo5u53nqyeyjs0fj6yr.png" alt=" " width="800" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibfi8w3lrmjoq5exerfu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibfi8w3lrmjoq5exerfu.png" alt=" " width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Received the Pod Pending alert, firing and then resolved. To trigger it, create a test pod that references a non-existent image:&lt;/p&gt;

&lt;p&gt;vi pending-pod.yaml&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: pending-pod
spec:
  containers:
  - name: my-container
    image: non-existing-image:latest
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;k apply -f pending-pod.yaml -n prod&lt;/p&gt;

&lt;p&gt;k delete po pending-pod -n prod&lt;/p&gt;
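&lt;p&gt;The Pending alert above is raised by the Prometheus rules shipped with the monitoring stack. As a sketch of what such a rule looks like (the rule name, threshold, and labels here are illustrative assumptions, not the stack's exact defaults), a custom PrometheusRule could be written as:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pod-pending-alert        # illustrative name
  namespace: monitoring
spec:
  groups:
  - name: pod-state.rules
    rules:
    - alert: PodPending
      # fires when any pod stays in the Pending phase for 5 minutes
      expr: sum by (namespace, pod) (kube_pod_status_phase{phase="Pending"}) &gt; 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is Pending"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;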

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4bepcpapg127vrnxyr7l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4bepcpapg127vrnxyr7l.png" alt=" " width="800" height="123"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Received the Pod Pending alert, firing and then resolved:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa48rfv5mw88klg6igkd7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa48rfv5mw88klg6igkd7.png" alt=" " width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ufqbzjpanm1ibp2k728.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ufqbzjpanm1ibp2k728.png" alt=" " width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Expose grafana prom and alertmanager Services:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Check the services in the monitoring namespace&lt;/p&gt;

&lt;p&gt;k get svc -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq9g1ykd0cwelp3i8ull.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq9g1ykd0cwelp3i8ull.png" alt=" " width="800" height="152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create an Ingress manifest for Prometheus, Grafana, and Alertmanager.&lt;/p&gt;

&lt;p&gt;vi prometheus-ingress.yaml&lt;/p&gt;

&lt;p&gt;Replace the service names and ports below with those of your own services.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: 5m
  name: prod-prom-grafana-monitoring
  namespace: monitoring
spec:
  rules:
  - host: grafana-prod-mumbai.illusto.com
    http:
      paths:
      - backend:
          service:
            name: prom-stack-grafana
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
  - host: prom-prod-mumbai.illusto.com
    http:
      paths:
      - backend:
          service:
            name: prom-stack-kube-prometheus-prometheus
            port:
              number: 9090
        path: /
        pathType: ImplementationSpecific
  - host: alert-prod-mumbai.illusto.com
    http:
      paths:
      - backend:
          service:
            name: prom-stack-kube-prometheus-alertmanager
            port:
              number: 9093
        path: /
        pathType: ImplementationSpecific
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Apply the Ingress rules.&lt;/p&gt;

&lt;p&gt;k apply -f prometheus-ingress.yaml&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkp0waf2zy0bco7vwntb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkp0waf2zy0bco7vwntb.png" alt=" " width="800" height="72"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check the ingress.&lt;/p&gt;

&lt;p&gt;k get ingress -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5asn955ep07y4i461ulk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5asn955ep07y4i461ulk.png" alt=" " width="800" height="53"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add the Route53 records:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flojro5f4ro21fil01cqg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flojro5f4ro21fil01cqg.png" alt=" " width="800" height="72"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsp6df5ek1x722rfy3rl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsp6df5ek1x722rfy3rl.png" alt=" " width="766" height="604"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grafana Dashboard:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu64klelhca7b5o8t1az.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu64klelhca7b5o8t1az.png" alt=" " width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prometheus Dashboard:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo09b4hntrofhdoj73n72.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo09b4hntrofhdoj73n72.png" alt=" " width="800" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AlertManager Dashboard:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6y2fy35aueg4urkxbx6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6y2fy35aueg4urkxbx6.png" alt=" " width="800" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Scope&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This SOP applies to DevOps, monitoring, and SRE teams tasked with maintaining the reliability and performance of applications deployed on Amazon EKS. It is applicable for both staging and production environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Roles and Responsibilities&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;DevOps Engineers: &lt;/p&gt;

&lt;p&gt;1. Responsible for deploying and configuring Prometheus and Alertmanager.&lt;/p&gt;

&lt;p&gt;2. Define and maintain alerting rules based on organizational needs.&lt;/p&gt;

&lt;p&gt;3. Act on received alerts to resolve issues and ensure high availability.&lt;/p&gt;

&lt;p&gt;4. Validate alerting configurations to ensure compliance with security protocols.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Enforcement&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;1. Policy Compliance: All alerting configurations must follow organizational monitoring and alerting standards.&lt;/p&gt;

&lt;p&gt;2. Access Control: Only authorized personnel are allowed to modify Prometheus and Alertmanager configurations.&lt;/p&gt;

&lt;p&gt;3. Auditing: Regular audits of alerting rules should be conducted to ensure effectiveness and compliance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Conclusion:&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Alerting in EKS using Prometheus provides a robust mechanism to proactively monitor the health and performance of applications. By following this SOP, teams can ensure timely notifications for critical issues, reducing downtime and maintaining application reliability in dynamic Kubernetes environments.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Deploying and Configuring ALB Ingress Controller on EKS</title>
      <dc:creator>Taufique</dc:creator>
      <pubDate>Sat, 17 Jan 2026 15:20:47 +0000</pubDate>
      <link>https://dev.to/taufique_c757012ce6181590/deploying-and-configuring-alb-ingress-controller-on-eks-1g7o</link>
      <guid>https://dev.to/taufique_c757012ce6181590/deploying-and-configuring-alb-ingress-controller-on-eks-1g7o</guid>
      <description>&lt;p&gt;&lt;u&gt;&lt;strong&gt;1. Introduction&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Overview:&lt;/p&gt;

&lt;p&gt;This SOP explains the purpose, scope, and best practices for setting up an ALB Ingress Controller. The AWS Load Balancer Controller manages AWS Elastic Load Balancers for a Kubernetes cluster. You can use the controller to expose your cluster apps to the internet. The controller provisions AWS load balancers that point to cluster Service or Ingress resources. In other words, the controller creates a single IP address or DNS name that points to multiple pods in your cluster.&lt;/p&gt;

&lt;p&gt;The controller watches for Kubernetes Ingress or Service resources. In response, it creates the appropriate AWS Elastic Load Balancing resources. You can configure the specific behavior of the load balancers by applying annotations to the Kubernetes resources. For example, you can attach AWS security groups to load balancers using annotations.&lt;/p&gt;

&lt;p&gt;Objective:&lt;/p&gt;

&lt;p&gt;The objective of this document is to provide a comprehensive guide for implementing the AWS ALB Ingress Controller on AWS. This setup will enable efficient routing and load balancing of incoming traffic to Kubernetes services running on an Amazon Elastic Kubernetes Service (EKS) cluster using Application Load Balancers.&lt;/p&gt;

&lt;p&gt;Key Components:&lt;/p&gt;

&lt;p&gt;AWS ALB Ingress Controller: &lt;/p&gt;

&lt;p&gt;The AWS ALB Ingress Controller is a Kubernetes controller that manages AWS Application Load Balancer configuration to route incoming traffic to Kubernetes services.&lt;/p&gt;

&lt;p&gt;Amazon EKS: &lt;/p&gt;

&lt;p&gt;Amazon Elastic Kubernetes Service is a managed Kubernetes service provided by AWS that  simplifies the process of deploying, managing, and scaling containerized applications using Kubernetes.&lt;/p&gt;

&lt;p&gt;Amazon Route 53: &lt;/p&gt;

&lt;p&gt;Amazon Route 53 is a scalable and highly available Domain Name System (DNS) web service provided by AWS. It enables routing traffic to AWS resources, including the AWS ALB Ingress Controller.&lt;/p&gt;

&lt;p&gt;AWS Application Load Balancer (ALB): &lt;/p&gt;

&lt;p&gt;ALB automatically distributes incoming application traffic across multiple targets, such as EC2 instances, containers, and IP addresses, within one or more Availability     Zones.&lt;/p&gt;

&lt;p&gt;Prerequisites:&lt;/p&gt;

&lt;p&gt;Before proceeding with the implementation, ensure the following prerequisites are met:&lt;/p&gt;

&lt;p&gt;AWS Account: &lt;/p&gt;

&lt;p&gt;Access to an AWS account with permissions to create and manage resources such as EKS  clusters, Route 53 records, and Application Load Balancers.&lt;/p&gt;

&lt;p&gt;Kubernetes Cluster: &lt;/p&gt;

&lt;p&gt;An Amazon EKS cluster should be provisioned and running. Ensure the cluster is  properly configured with networking, IAM roles, and necessary node groups.&lt;/p&gt;

&lt;p&gt;kubectl CLI: &lt;/p&gt;

&lt;p&gt;Install and configure the kubectl command-line tool to interact with the Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;Helm (Optional):&lt;/p&gt;

&lt;p&gt;If using Helm for deploying the AWS ALB Ingress Controller, ensure Helm is installed  and configured.&lt;/p&gt;

&lt;p&gt;Access to DNS: &lt;/p&gt;

&lt;p&gt;Have access to manage DNS records, as you will need to create DNS records to route  traffic to the AWS ALB Ingress Controller.&lt;/p&gt;

&lt;p&gt;Procedure&lt;/p&gt;

&lt;p&gt;Step 1: Create IAM Role using eksctl&lt;/p&gt;

&lt;p&gt;Create an IAM policy.&lt;/p&gt;

&lt;p&gt;1. Download an IAM policy for the AWS Load Balancer Controller that allows it to make calls to AWS APIs on your behalf.&lt;/p&gt;

&lt;p&gt;curl -O &lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/install/iam_policy.json" rel="noopener noreferrer"&gt;https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/install/iam_policy.json&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96hy7v6w5im88nqthmmm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96hy7v6w5im88nqthmmm.png" alt=" " width="800" height="66"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2. Create an IAM policy using the policy document downloaded in the previous step.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws iam create-policy \
    --policy-name AWSLoadBalancerControllerIAMPolicy \
    --policy-document file://iam_policy.json
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtr8eopli61n7o65s0r4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtr8eopli61n7o65s0r4.png" alt=" " width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create IAM Role using eksctl&lt;/p&gt;

&lt;p&gt;Replace my-cluster with the name of your cluster, 111122223333 with your account ID, and then run the command. If your cluster is in the AWS GovCloud (US-East) or AWS GovCloud (US-West) AWS Regions, then replace arn:aws: with arn:aws-us-gov:.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl create iamserviceaccount \
    --cluster=my-cluster \
    --namespace=kube-system \
    --name=aws-load-balancer-controller \
    --role-name AmazonEKSLoadBalancerControllerRole \
    --attach-policy-arn=arn:aws:iam::111122223333:policy/AWSLoadBalancerControllerIAMPolicy \
    --approve
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For example, for the demo cluster used here:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl create iamserviceaccount \
    --cluster=demo-cluster \
    --namespace=kube-system \
    --name=aws-load-balancer-controller \
    --role-name AmazonEKSLoadBalancerControllerRole \
    --attach-policy-arn=arn:aws:iam::975050347443:policy/AWSLoadBalancerControllerIAMPolicy \
    --region=us-east-1 \
    --approve
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2i9izzn2u3rkatxqd7qg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2i9izzn2u3rkatxqd7qg.png" alt=" " width="800" height="116"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above error occurred because the IAM OIDC provider is not enabled for the cluster.&lt;/p&gt;

&lt;p&gt;Commands to configure the IAM OIDC provider:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export cluster_name=demo-cluster
oidc_id=$(aws eks describe-cluster --name $cluster_name --query "cluster.identity.oidc.issuer" --output text | cut -d '/' -f 5)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Check whether an IAM OIDC provider is already configured for the cluster; if not, associate one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl utils associate-iam-oidc-provider --cluster $cluster_name --region us-east-1 --approve
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhtpzvsnwktc6rv3tkp1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhtpzvsnwktc6rv3tkp1.png" alt=" " width="800" height="65"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwkv6inffo4fuosbipg79.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwkv6inffo4fuosbipg79.png" alt=" " width="800" height="194"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step 2: Install the AWS Load Balancer Controller&lt;/p&gt;

&lt;p&gt;Install the AWS Load Balancer Controller using Helm v3:&lt;/p&gt;

&lt;p&gt;Add the eks-charts Helm chart repository. AWS maintains this repository on GitHub.&lt;/p&gt;

&lt;p&gt;helm repo add eks &lt;a href="https://aws.github.io/eks-charts" rel="noopener noreferrer"&gt;https://aws.github.io/eks-charts&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Update your local repo to make sure that you have the most recent charts.&lt;/p&gt;

&lt;p&gt;helm repo update eks&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vh87hj1o1kwy8ckz9xu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vh87hj1o1kwy8ckz9xu.png" alt=" " width="800" height="139"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To view the available versions of the Helm Chart and Load Balancer Controller, use the following command:&lt;/p&gt;

&lt;p&gt;helm search repo eks/aws-load-balancer-controller --versions&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi41niccc26j4qytbhk2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi41niccc26j4qytbhk2m.png" alt=" " width="800" height="233"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;3. Install the AWS Load Balancer Controller.&lt;/p&gt;

&lt;p&gt;Replace my-cluster with the name of your cluster. In the following command, aws-load-balancer-controller is the Kubernetes service account that you created in a previous step.&lt;/p&gt;

&lt;p&gt;For more information about configuring the helm chart, see values.yaml on GitHub.&lt;/p&gt;

&lt;p&gt;Ref Link: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml" rel="noopener noreferrer"&gt;https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
    -n kube-system \
    --set clusterName=my-cluster \
    --set serviceAccount.create=false \
    --set serviceAccount.name=aws-load-balancer-controller
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If you're deploying the controller to Amazon EC2 nodes that have restricted access to the Amazon EC2 instance metadata service (IMDS), or if you're deploying to Fargate, then add the following flags to the helm command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--set region=region-code \
--set vpcId=vpc-xxxxxxxx
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
    -n kube-system \
    --set clusterName=demo-cluster \
    --set serviceAccount.create=false \
    --set serviceAccount.name=aws-load-balancer-controller \
    --set autoDiscoverAwsRegion=true \
    --set autoDiscoverAwsVpcID=true \
    --set region=us-east-1 \
    --set vpcId=vpc-08ce046624ab1b564
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82dk9sj6bc2x40qrsigy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82dk9sj6bc2x40qrsigy.png" alt=" " width="800" height="179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step 3: Verify that the controller is installed&lt;/p&gt;

&lt;p&gt;Verify that the controller deployment is available.&lt;/p&gt;

&lt;p&gt;kubectl get deployment -n kube-system aws-load-balancer-controller&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsux6xsv0tyfuaollg9p8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsux6xsv0tyfuaollg9p8.png" alt=" " width="800" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step 4: Deploy a sample application&lt;/p&gt;

&lt;p&gt;Prerequisites:&lt;/p&gt;

&lt;p&gt;At least one public or private subnet in your cluster VPC.&lt;/p&gt;

&lt;p&gt;Have the AWS Load Balancer Controller deployed on your cluster. For more information, see What is the AWS Load Balancer Controller?. We recommend version 2.7.2 or later.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34w8nxk69pstwgg4qpht.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34w8nxk69pstwgg4qpht.png" alt=" " width="800" height="43"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Deploy the game 2048 as a sample application to verify that the AWS Load Balancer Controller creates an AWS ALB in response to the Ingress object.&lt;/p&gt;

&lt;p&gt;Complete the steps for the type of subnet you're deploying to.&lt;/p&gt;

&lt;p&gt;If you're deploying to Pods in a cluster that you created with the IPv6 family, skip to the next step.&lt;/p&gt;

&lt;p&gt;If the subnets are public, execute the command below:&lt;/p&gt;

&lt;p&gt;kubectl apply -f &lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml" rel="noopener noreferrer"&gt;https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the subnets are private, follow the steps below:&lt;/p&gt;

&lt;p&gt;Download the manifest.&lt;/p&gt;

&lt;p&gt;curl -O &lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml" rel="noopener noreferrer"&gt;https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Edit the file and find the line that says alb.ingress.kubernetes.io/scheme: internet-facing.&lt;br&gt;
Change internet-facing to internal and save the file.&lt;/p&gt;

&lt;p&gt;Apply the manifest to your cluster.&lt;/p&gt;

&lt;p&gt;kubectl apply -f 2048_full.yaml&lt;/p&gt;

&lt;p&gt;If you're deploying to Pods in a cluster that you created with the IPv6 family, complete the following steps.&lt;/p&gt;

&lt;p&gt;Download the manifest.&lt;/p&gt;

&lt;p&gt;curl -O &lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml" rel="noopener noreferrer"&gt;https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Open the file in an editor and add the following line to the annotations in the ingress spec.&lt;/p&gt;

&lt;p&gt;alb.ingress.kubernetes.io/ip-address-type: dualstack&lt;/p&gt;

&lt;p&gt;If you're load balancing to internal Pods rather than internet-facing Pods, change the line that says alb.ingress.kubernetes.io/scheme: internet-facing to alb.ingress.kubernetes.io/scheme: internal.&lt;/p&gt;

&lt;p&gt;Save the file.&lt;/p&gt;

&lt;p&gt;Apply the manifest to your cluster.&lt;/p&gt;

&lt;p&gt;kubectl apply -f 2048_full.yaml&lt;/p&gt;
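&lt;p&gt;The annotation edits described above all land in the metadata of the Ingress in 2048_full.yaml. Put together, the metadata for an internal, dual-stack deployment would look roughly like this (a sketch based on the example manifest; verify the names and annotations against the file you downloaded):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: game-2048
  name: ingress-2048
  annotations:
    alb.ingress.kubernetes.io/scheme: internal             # changed from internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/ip-address-type: dualstack   # added for IPv6-family clusters
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;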

&lt;p&gt;After a few minutes, verify that the Ingress resource was created.&lt;/p&gt;

&lt;p&gt;For testing, I have used the public option here:&lt;/p&gt;

&lt;p&gt;kubectl apply -f &lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml" rel="noopener noreferrer"&gt;https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.7.2/docs/examples/2048/2048_full.yaml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ctt4x781cgd5w54lmj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ctt4x781cgd5w54lmj3.png" alt=" " width="800" height="61"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check whether the deployed application is working properly.&lt;/p&gt;

&lt;p&gt;kubectl get all -n game-2048&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0qpf6cj32o6694d5698.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0qpf6cj32o6694d5698.png" alt=" " width="800" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check the ingress &lt;/p&gt;

&lt;p&gt;kubectl get ingress -n game-2048&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkjjgx8tbwveiujfkdpj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkjjgx8tbwveiujfkdpj.png" alt=" " width="800" height="61"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check the application using the endpoint&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8fsblej3zakbob1jxzj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8fsblej3zakbob1jxzj.png" alt=" " width="800" height="167"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Up to this point the controller setup is complete; configuring ACM is next.&lt;/p&gt;

&lt;p&gt;Step 5: Create an ACM certificate for the required domain.&lt;/p&gt;

&lt;p&gt;Sign in to the AWS Management Console and open the ACM console &lt;/p&gt;

&lt;p&gt;Choose Request a certificate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3a8nljp5rgrzfkgbfgne.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3a8nljp5rgrzfkgbfgne.png" alt=" " width="800" height="67"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyuuwprwn5l85rtute8pn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyuuwprwn5l85rtute8pn.png" alt=" " width="800" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the Domain names section, type your domain name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdo3hy6lacmzbmmmelvaf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdo3hy6lacmzbmmmelvaf.png" alt=" " width="800" height="294"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Validate the Certificate by adding the records in DNS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqlcag1t7h5xzobine7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqlcag1t7h5xzobine7s.png" alt=" " width="800" height="127"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykxh6lqy63idko0a6zrb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykxh6lqy63idko0a6zrb.png" alt=" " width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Add the records in Route53&lt;/p&gt;
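&lt;p&gt;The ACM console's "Create records in Route 53" button adds the validation record automatically; as an alternative, the same CNAME can be added with the AWS CLI. A sketch of a change batch file (the record name and value are placeholders to be copied from the ACM console):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "_&lt;validation-record-name&gt;.yourdomain.com.",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [
          { "Value": "_&lt;validation-record-value&gt;.acm-validations.aws." }
        ]
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This would be applied with aws route53 change-resource-record-sets --hosted-zone-id &lt;zone-id&gt; --change-batch file://validation.json, substituting your hosted zone ID.&lt;/p&gt;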

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixio1l5s3dfcfsvk0ko7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixio1l5s3dfcfsvk0ko7.png" alt=" " width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Update the AWS Load Balancer Controller with the ACM certificate:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=demo-cluster \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller \
  --set autoDiscoverAwsRegion=true \
  --set autoDiscoverAwsVpcID=true \
  --set region=us-east-1 \
  --set vpcId=vpc-08ce046624ab1b564 \
  --set enableShield=false \
  --set enableWaf=false \
  --set acm.enabled=true \
  --set acm.defaultRegion=us-east-1 \
  --set acm.managed=false \
  --set acm.certArn=arn:aws:acm:us-west-2:XXXXXXXX:certificate/XXXXXX-XXXXXXX-XXXXXXX-XXXXXXXX
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here enableShield and enableWaf are optional (they enable AWS Shield Advanced and AWS WAF for the ALB); acm.enabled turns on ACM integration, acm.defaultRegion sets the default region for ACM, and acm.managed is set to false because the certificate ARN is supplied explicitly in acm.certArn.&lt;/p&gt;
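&lt;p&gt;The same settings can be kept in a values file instead of a long chain of --set flags. A sketch using the keys from the command above (the ARN is a placeholder; note that the acm.* block mirrors this article's command rather than the upstream chart's documented values.yaml, so verify those keys against the chart before relying on them):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# alb-values.yaml (hypothetical file name)
clusterName: demo-cluster
serviceAccount:
  create: false
  name: aws-load-balancer-controller
autoDiscoverAwsRegion: true
autoDiscoverAwsVpcID: true
region: us-east-1
vpcId: vpc-08ce046624ab1b564
enableShield: false
enableWaf: false
acm:
  enabled: true
  defaultRegion: us-east-1
  managed: false
  certArn: arn:aws:acm:us-west-2:XXXXXXXX:certificate/XXXXXX-XXXXXXX-XXXXXXX-XXXXXXXX
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Installed with: helm install aws-load-balancer-controller eks/aws-load-balancer-controller -n kube-system -f alb-values.yaml&lt;/p&gt;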

&lt;p&gt;Step: Final testing phase with ACM by deploying the Prometheus stack.&lt;/p&gt;

&lt;p&gt;To test the ingress, deploy kube-prometheus-stack:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --create-namespace -n monitoring
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjsx3277dotu936aoi7br.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjsx3277dotu936aoi7br.png" alt=" " width="800" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ingress rules creation&lt;/p&gt;

&lt;p&gt;Create an ingress file for Prometheus:&lt;/p&gt;

&lt;p&gt;vi prometheus-ingress.yaml&lt;/p&gt;

&lt;p&gt;Replace the service names and ports below with those of your own services.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: 5m
  name: prod-prom-grafana-monitoring
  namespace: monitoring
spec:
  rules:
  - host: grafana-prod-mumbai.illusto.com
    http:
      paths:
      - backend:
          service:
            name: prom-stack-grafana
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
  - host: prom-prod-mumbai.illusto.com
    http:
      paths:
      - backend:
          service:
            name: prom-stack-kube-prometheus-prometheus
            port:
              number: 9090
        path: /
        pathType: ImplementationSpecific
  - host: alert-prod-mumbai.illusto.com
    http:
      paths:
      - backend:
          service:
            name: prom-stack-kube-prometheus-alertmanager
            port:
              number: 9093
        path: /
        pathType: ImplementationSpecific
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
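&lt;p&gt;Note that this manifest uses the nginx ingress class. If the AWS Load Balancer Controller installed earlier should provision the ALB (and terminate TLS with the ACM certificate) instead, the relevant fields would look roughly like this sketch; the alb.ingress.kubernetes.io/* annotations come from the controller's documentation, and the certificate ARN is a placeholder:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:XXXXXXXX:certificate/XXXXXXXX
    alb.ingress.kubernetes.io/ssl-redirect: "443"
spec:
  ingressClassName: alb
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;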

&lt;p&gt;Apply the ingress manifest (k is an alias for kubectl):&lt;/p&gt;

&lt;p&gt;k apply -f prometheus-ingress.yaml&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd4sfgp4b4n1gum0uwqvu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd4sfgp4b4n1gum0uwqvu.png" alt=" " width="800" height="70"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check the ingress:&lt;/p&gt;

&lt;p&gt;k get ingress -n monitoring&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdpf7co4p29stpo6lrabw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdpf7co4p29stpo6lrabw.png" alt=" " width="800" height="54"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Add the Route53 records:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdbde8fzt4e9zz1b8gw4w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdbde8fzt4e9zz1b8gw4w.png" alt=" " width="800" height="70"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eszeewl7poj1bmrgoif.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eszeewl7poj1bmrgoif.png" alt=" " width="769" height="610"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Grafana Dashboard:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngjdnelgc4a8d08woljz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngjdnelgc4a8d08woljz.png" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Prometheus Dashboard:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8dyz08fo0j3xfk5io6a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8dyz08fo0j3xfk5io6a.png" alt=" " width="800" height="214"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AlertManager Dashboard:&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Scope&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;This SOP covers the deployment and management of the AWS Load Balancer Controller in a Kubernetes cluster. It is intended for use in exposing cluster applications to the internet through AWS Elastic Load Balancers. The document applies to system administrators, DevOps engineers, and any team members responsible for maintaining Kubernetes clusters on AWS. It also includes best practices for managing Ingress resources and configuring load balancers to meet security and scalability requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Roles and Responsibilities&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DevOps Engineers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensure the AWS Load Balancer Controller is installed and configured correctly in the Kubernetes cluster.&lt;/li&gt;
&lt;li&gt;Monitor load balancer performance and ensure compliance with the organization’s network and security policies.&lt;/li&gt;
&lt;li&gt;Define and apply Kubernetes Ingress or Service resources with proper annotations for load balancer configuration.&lt;/li&gt;
&lt;li&gt;Automate the deployment of load balancers using Infrastructure as Code (IaC) tools like Terraform or Helm.&lt;/li&gt;
&lt;li&gt;Configure AWS security groups to allow necessary traffic to and from the load balancers.&lt;/li&gt;
&lt;li&gt;Verify that DNS names and IP addresses assigned to the load balancers are correctly pointing to the Kubernetes resources.&lt;/li&gt;
&lt;li&gt;Ensure the setup adheres to security standards and regulatory requirements, including data protection and access control policies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Enforcement&lt;/u&gt;&lt;/strong&gt;&lt;br&gt;
This SOP is enforced by regular audits of the Kubernetes and AWS infrastructure to ensure compliance with defined configurations. Automated monitoring tools, such as Prometheus, should be used to track load balancer health, traffic patterns, and security group configurations. Any deviations from the established guidelines should be logged, reported, and addressed promptly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Conclusion&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Implementing the AWS Load Balancer Controller is a critical step in securely and efficiently managing application traffic in a Kubernetes cluster. By following the practices outlined in this SOP, organizations can ensure reliable application delivery while maintaining scalability and security. Proper configuration and monitoring of the controller and associated resources will significantly reduce the risks of downtime, unauthorized access, or performance bottlenecks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Ref Links&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/kubernetes-sigs/aws-load-balancer-controller" rel="noopener noreferrer"&gt;https://github.com/kubernetes-sigs/aws-load-balancer-controller&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/lbc-helm.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eks/latest/userguide/lbc-helm.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml" rel="noopener noreferrer"&gt;https://github.com/aws/eks-charts/blob/master/stable/aws-load-balancer-controller/values.yaml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/lbc-manifest.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eks/latest/userguide/lbc-manifest.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.7/guide/ingress/annotations/" rel="noopener noreferrer"&gt;https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.7/guide/ingress/annotations/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/iam-veeramalla/aws-devops-zero-to-hero/blob/main/day-22/2048-app-deploy-ingress.md" rel="noopener noreferrer"&gt;https://github.com/iam-veeramalla/aws-devops-zero-to-hero/blob/main/day-22/2048-app-deploy-ingress.md&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>kubernetes</category>
      <category>networking</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
