Introduction
Modern DevOps projects are not just about making things work once — they are about making systems repeatable, debuggable, and reliable.
While building an end-to-end microservices platform with CI/CD, Kubernetes, ArgoCD, monitoring, and security tools, I first needed a DevOps jumphost: a powerful EC2 instance with all required tooling installed.
This article focuses on:
How I provisioned the jumphost using Terraform
Why my initial cloud-init Bash automation failed
How I migrated to Ansible for configuration management
The real errors I faced and how I fixed them
If you are building DevOps labs, platforms, or production infrastructure, these lessons will save you days of debugging.
**Architecture Overview (brief)**
Terraform provisions:
VPC + Subnets + Security Groups
IAM Role & Instance Profile
EC2 jumphost (Ubuntu)
Ansible configures:
Docker
Jenkins
Terraform
Ansible
kubectl, eksctl
Helm
Trivy
AWS CLI
Databases & other tooling
Terraform handles infrastructure.
Ansible handles configuration.
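Conceptually, the hand-off between the two looks like this. A minimal sketch, assuming an Ubuntu AMI and SSH access; inventory.ini and jumphost.yml are illustrative names, not files from the repo:
# Feed the Terraform output into an Ansible inventory, then configure the host.
JUMPHOST_IP=$(terraform output -raw jumphost_public_ip)
printf '[jumphost]\n%s ansible_user=ubuntu\n' "$JUMPHOST_IP" > inventory.ini
ansible-playbook -i inventory.ini jumphost.yml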
Bonus: Monitor Your AWS Cost in Real-Time While Building This Project (Using AWS CostWatch)
While you build this project, your EKS, EC2, ECR, and Jenkins resources can grow your AWS bill silently. Rather than panicking at a surprise invoice, observe your cost live. That's why I built AWS CostWatch, and I strongly recommend running it locally while doing this project.
This works for:
Free Tier users
Paid AWS users
Students & learners
DevOps engineers testing infrastructure
It runs on your own laptop terminal and keeps monitoring AWS in the background.
Step-by-Step: Run AWS CostWatch on Your Laptop
You don’t need any server. Just your laptop terminal.
1. Clone the repository
git clone https://github.com/vsaraths/AWS-Cost-Watch.git
cd AWS-Cost-Watch
2. Create a Python virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate
For Windows:
venv\Scripts\activate
3. Install dependencies
pip install -r requirements.txt
or manually:
pip install boto3 rich sqlite-utils
4. Configure AWS credentials
Make sure this works:
aws sts get-caller-identity
If not:
aws configure
Provide:
Access Key
Secret Key
Region
Output format: json
5. Run CostWatch
python3 aws_cost_dashboard.py
Once started:
It instantly scans your AWS account
Displays the green live dashboard
Auto-refreshes every 10 minutes
Saves history to SQLite
Leave this terminal open.
It will keep running in background while you work on Terraform, EKS, Jenkins, etc.
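If you'd rather not dedicate a terminal to it, a detachable tmux session works too (assuming tmux is installed):
tmux new -s costwatch
# inside the session:
python3 aws_cost_dashboard.py
# detach with Ctrl-b d; reattach any time with:
tmux attach -t costwatch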
Now, let's start the project.
Step 1: Clone the Repository
git clone https://github.com/vsaraths/<your-repo>.git
cd <your-repo>
Step 2: Configure AWS Credentials
aws configure
Provide:
Access Key ID
Secret Access Key
Region (e.g., us-east-1)
Output format: json
Navigate into the Project
ls
cd Microservices-E-Commerce-eks-project
ls
Step 3: Create S3 Backend for Terraform
cd s3-buckets
terraform init
terraform apply -auto-approve
This stores Terraform state remotely and enables safe, shared infrastructure management.
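For reference, the remote backend declaration looks roughly like this. The bucket and key names below are placeholders; the real ones come from the repo's s3-buckets module:
# Illustrative only, written to a .example file so the repo's real backend.tf is untouched.
cat <<'EOF' > backend.tf.example
terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket"   # placeholder
    key    = "jumphost/terraform.tfstate"  # placeholder
    region = "us-east-1"
  }
}
EOF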
Step 3.1: Create Network Infrastructure
Navigate to Terraform EC2 setup:
cd ../terraform_main_ec2
terraform init
terraform plan
terraform apply -auto-approve
Sample output:
Apply complete! Resources: 24 added, 0 changed, 0 destroyed.
jumphost_public_ip = "18.208.229.108"
region = "us-east-1"
Check Terraform state:
terraform state list
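Two related commands are handy here; both read the state written by the apply above:
terraform output                            # list all outputs recorded in state
terraform output -raw jumphost_public_ip    # unquoted value, handy in scripts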
Step 4: The Jumphost EC2 (Terraform + Ansible)
The apply you ran in Step 3.1 (in terraform_main_ec2) also creates the jumphost; re-running terraform apply there is a no-op once everything exists.
This EC2 acts as the DevOps control plane:
Jenkins
Terraform
kubectl
Ansible
Docker
AWS CLI
Ansible was used to configure all tools instead of shell scripts, making the setup repeatable and idempotent.
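You can see that idempotency for yourself: re-running the playbook against the already-configured host should report changed=0 (playbook and inventory names here are illustrative):
ansible-playbook -i inventory.ini jumphost.yml --check --diff   # dry-run: preview any changes
ansible-playbook -i inventory.ini jumphost.yml                  # a second real run reports changed=0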
Step 5: Connect to EC2 and Access Jenkins
From AWS Console → EC2 → Connect → Switch to root:
sudo -i
Verify installed DevOps tools:
git --version
java -version
jenkins --version
terraform -version
mvn -v
kubectl version --client --short
eksctl version
helm version --short
docker --version
trivy --version
Get Jenkins admin password:
cat /var/lib/jenkins/secrets/initialAdminPassword
Example output:
0c39f23132004d508132ae3e0a7c70e4
Step 6: Setup Jenkins
Open:
http://<EC2_PUBLIC_IP>:8080
- Paste admin password
- Install suggested plugins
- Create first user (example: admin)
- Click through: Save and Continue → Save and Finish → Start using Jenkins
Step 7: Install Jenkins Plugins
- Go to Jenkins Dashboard → Manage Jenkins → Plugins.
- Click the Available tab.
- Search for and install: Pipeline: Stage View
- When installation is complete, select “Restart Jenkins when installation is complete and no jobs are running”
Step 8: Create a Jenkins Pipeline Job (Create EKS Cluster)
1. Go to Jenkins Dashboard
2. Click New Item
3. Name it: eks-terraform
4. Select: Pipeline
5. Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : eks-terraform/eks-jenkinsfile
- Apply
- Save
Click Build with Parameters.
ACTION :
Select Terraform action : apply
Build
To verify your EKS cluster, connect to your EC2 jumphost server and run:
aws eks --region us-east-1 update-kubeconfig --name project-eks
kubectl get nodes
Step 9: Create a Jenkins Pipeline Job (Create an Elastic Container Registry (ECR))
1. Go to Jenkins Dashboard
2. Click New Item
3. Name it: ecr-terraform
4. Select: Pipeline
5. Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : ecr-terraform/ecr-jenkinfile
- Apply
- Save
Click Build with Parameters.
ACTION :
Select Terraform action : apply
Build
To verify your ECR repositories, connect to your EC2 jumphost server and run:
aws ecr describe-repositories --region us-east-1
Services:
- emailservice
- checkoutservice
- recommendationservice
- frontend
- paymentservice
- productcatalogservice
- cartservice
- loadgenerator
- currencyservice
- shippingservice
- adservice
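To confirm one repository exists per service, a --query filter keeps the output readable:
# Print only the repository names; compare against the eleven services listed above.
aws ecr describe-repositories --region us-east-1 \
  --query 'repositories[].repositoryName' --output table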
Step 10: Create a Jenkins Pipeline Job for Build and Push Docker Images to ECR
🔐 Step 10.1: Add GitHub PAT to Jenkins Credentials
1. Navigate to Jenkins Dashboard → Manage Jenkins → Credentials → (global) → Global credentials (unrestricted).
2. Click “Add Credentials”.
- In the form:
- Kind: Secret text
- Secret: your GitHub personal access token (ghp_…)
- ID: my-git-pattoken
- Description: git credentials
- Click “OK” to save.
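Before wiring the token into a pipeline, it's worth a quick sanity check that it authenticates (replace the placeholder with your token):
# A valid PAT returns your GitHub profile JSON; a revoked or mistyped one returns 401.
curl -s -H "Authorization: token <your-PAT>" https://api.github.com/user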
Step 10.2: Jenkins Pipeline Setup: Build, Push, and Update Docker Images in ECR
Step 10.2.1: Jenkins Pipeline Setup: emailservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: emailservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/emailservice
- Apply
- Save
- click Build
Step 10.2.2: Jenkins Pipeline Setup: checkoutservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: checkoutservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/checkoutservice
- Apply
- Save
- click Build
Step 10.2.3: Jenkins Pipeline Setup: recommendationservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: recommendationservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/recommendationservice
- Apply
- Save
- click Build
Step 10.2.4: Jenkins Pipeline Setup: frontend
- Go to Jenkins Dashboard
- Click New Item
- Name it: frontend
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/frontend
- Apply
- Save
- click Build
Step 10.2.5: Jenkins Pipeline Setup: paymentservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: paymentservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/paymentservice
- Apply
- Save
- click Build
Step 10.2.6: Jenkins Pipeline Setup: productcatalogservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: productcatalogservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/productcatalogservice
- Apply
- Save
- click Build
Step 10.2.7: Jenkins Pipeline Setup: cartservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: cartservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/cartservice
- Apply
- Save
- click Build
Step 10.2.8: Jenkins Pipeline Setup: loadgenerator
- Go to Jenkins Dashboard
- Click New Item
- Name it: loadgenerator
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/loadgenerator
- Apply
- Save
- click Build
Step 10.2.9: Jenkins Pipeline Setup: currencyservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: currencyservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/currencyservice
- Apply
- Save
- click Build
Step 10.2.10: Jenkins Pipeline Setup: shippingservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: shippingservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/shippingservice
- Apply
- Save
- click Build
Step 10.2.11: Jenkins Pipeline Setup: adservice
- Go to Jenkins Dashboard
- Click New Item
- Name it: adservice
- Select: Pipeline
- Click OK
- Pipeline:
- Definition : Pipeline script from SCM
- SCM : Git
- Repositories : https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Branches to build : */main
- Script Path : jenkinsfiles/adservice
- Apply
- Save
- click Build
🖥️ Step 11: Install ArgoCD on the EKS Cluster (from the Jumphost EC2)
11.1: Create Namespace for ArgoCD
kubectl create namespace argocd
11.2: Install ArgoCD in the Created Namespace
kubectl apply -n argocd \
-f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
11.3: Verify the Installation
Ensure all pods are in Running state.
kubectl get pods -n argocd
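Instead of polling manually, kubectl can block until everything is Ready:
# Waits up to 5 minutes for all ArgoCD pods to become Ready.
kubectl wait --for=condition=Ready pods --all -n argocd --timeout=300s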
11.4: Validate the Cluster
Check your nodes and create a test pod if necessary:
kubectl get nodes
11.5: List All ArgoCD Resources
kubectl get all -n argocd
Sample output:
NAME READY STATUS RESTARTS AGE
pod/argocd-application-controller-0 1/1 Running 0 106m
pod/argocd-applicationset-controller-787bfd9669-4mxq6 1/1 Running 0 106m
pod/argocd-dex-server-bb76f899c-slg7k 1/1 Running 0 106m
pod/argocd-notifications-controller-5557f7bb5b-84cjr 1/1 Running 0 106m
pod/argocd-redis-b5d6bf5f5-482qq 1/1 Running 0 106m
pod/argocd-repo-server-56998dcf9c-c75wk 1/1 Running 0 106m
pod/argocd-server-5985b6cf6f-zzgx8 1/1 Running 0 106m
11.6: Expose ArgoCD Server Using LoadBalancer
11.6.1: Edit the ArgoCD Server Service
kubectl edit svc argocd-server -n argocd
11.6.2: Change the Service Type
Find this line:
type: ClusterIP
Change it to:
type: LoadBalancer
Save and exit (:wq for vi).
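If you prefer a non-interactive equivalent of the edit above:
# Same change as the manual edit, in one command.
kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'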
11.6.3: Get the External Load Balancer DNS
kubectl get svc argocd-server -n argocd
Sample output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 172.20.1.100 a1b2c3d4e5f6.elb.amazonaws.com 80:31234/TCP,443:31356/TCP 2m
11.6.4: Access the ArgoCD UI
Use the DNS name shown under EXTERNAL-IP:
https://<EXTERNAL-LB-DNS>
11.7: 🔐 Get the Initial ArgoCD Admin Password
kubectl get secret argocd-initial-admin-secret -n argocd \
-o jsonpath="{.data.password}" | base64 -d && echo
Login Details:
Username: admin
Password: (The output of the above command)
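If you also install the argocd CLI on the jumphost (an assumption; it is not part of the tooling above), you can log in from the terminal:
# --insecure accepts the self-signed certificate a fresh install ships with.
ARGOCD_PWD=$(kubectl get secret argocd-initial-admin-secret -n argocd \
  -o jsonpath="{.data.password}" | base64 -d)
argocd login <EXTERNAL-LB-DNS> --username admin --password "$ARGOCD_PWD" --insecure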
Step 12: Deploying with ArgoCD and Configuring Route 53 (Step-by-Step)
Step 12.1: Create Namespace in EKS (from Jumphost EC2)
Run these commands on your jumphost EC2 server:
kubectl create namespace dev
kubectl get namespaces
Step 12.2: Create a New Application with ArgoCD
- Open the ArgoCD UI in your browser.
- Click + NEW APP.
- Fill in the following:
- Application Name: project
- Project Name: default
- Sync Policy: Automatic
- Repository URL: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
- Revision: HEAD
- Path: kubernetes-files
- Cluster URL: https://kubernetes.default.svc
- Namespace: dev
- Click Create.
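For reference, the same application can be created with the argocd CLI (assuming the CLI login sketched in Step 11):
argocd app create project \
  --repo https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git \
  --path kubernetes-files \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace dev \
  --sync-policy automated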
Step 13: Expose the Frontend — Route 53 & HTTPS Setup
Enable HTTPS for vsarath.site with AWS Classic Load Balancer (CLB)
This guide explains how to configure HTTPS for your domain vsarath.site using AWS Classic Load Balancer (CLB), Route 53, and AWS Certificate Manager (ACM).
✅ Prerequisites
- A working application (e.g., on EC2 or Kubernetes).
- A registered domain: vsarath.site
- The domain managed in Route 53 as a Public Hosted Zone.
- Your Classic Load Balancer running and serving HTTP on port 80 or 8080.
Create the Hosted Zone:
- Go to AWS Route 53.
- Create a Hosted Zone:
- Domain: vsarath.site
- Type: Public Hosted Zone
- Update Hostinger Nameservers: paste the 4 NS records from Route 53 into Hostinger, for example:
- ns-865.awsdns-84.net
- ns-1995.awsdns-97.co.uk
- ns-1418.awsdns-59.org
- ns-265.awsdns-73.com
Step 1: Request a Public Certificate in ACM
- Go to AWS Certificate Manager (ACM).
- Click Request Certificate.
- Choose Request a Public Certificate.
- Enter domain:
- vsarath.site
- www.vsarath.site (optional)
- Choose DNS validation.
- Click Request.
- After request:
- Click Create DNS record in Route 53.
- ACM will create the DNS validation CNAME record.
- Wait a few minutes until status becomes Issued.
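You can watch the status from the jumphost instead of refreshing the console:
# Status moves from PENDING_VALIDATION to ISSUED once the validation CNAME resolves.
aws acm list-certificates --region us-east-1 \
  --query 'CertificateSummaryList[].{Domain:DomainName,Status:Status}' --output table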
Step 2: Add HTTPS Listener to CLB
- Go to EC2 Console > Load Balancers.
- Select your Classic Load Balancer.
- Go to Listeners tab.
- Click Add Listener (or edit existing 443):
- Protocol: HTTPS
- Load Balancer Port: 443
- Instance Protocol: HTTP (or HTTPS if applicable)
- Instance Port: 80 (or 8080 if your app runs there)
- SSL Certificate: Choose the one for vsarath.site
- Security Policy: Select ELBSecurityPolicy-2021-06
- Click Save.
Step 3: Update Security Group Rules
Go to your EC2 or Load Balancer Security Group:
- Add Inbound Rule:
- Type: HTTPS
- Protocol: TCP
- Port: 443
- Source: 0.0.0.0/0
Ensure existing rules allow HTTP (port 80) or your backend port.
Step 4: Configure DNS in Route 53
- In ArgoCD UI, open your project application.
- Click on frontend and copy the hostname (e.g., acfb06fba08834577a50e43724d328e3-1568967602.us-east-1.elb.amazonaws.com).
- Go to Route 53 > Hosted Zones.
- Select vsarath.site.
- Click Create Record:
- Record name: leave blank (for the root domain)
- Record type: A (routes traffic to an IPv4 address and AWS resource)
- Alias: Yes
- Alias target: Choose Application and Classic Load Balancer
- Region: US East (N. Virginia)
- Alias target value: Paste the frontend load balancer DNS copied from the ArgoCD UI above
- Click Create Record.
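Before testing in a browser, confirm the record resolves (propagation can take a few minutes):
dig +short vsarath.site      # should print the ELB's public IPs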
Step 5: Test Your Setup
Using Browser
Visit:
https://vsarath.site
You should see your application load securely over HTTPS.
Using curl
curl -v https://vsarath.site
Expect HTTP 200 OK or the actual page content.
If everything is okay, you will see the website.
If you have no DNS or domain, you can still access the website with the load balancer link.
Troubleshooting
HTTPS times out?
- Check that port 443 is open in the Security Group.
- Make sure your app is reachable from the CLB.
- The ACM certificate must be in Issued status.
HTTP works but HTTPS doesn't?
- The listener or certificate may not be configured properly.
- Check that the load balancer health check passes.
Monitoring Steps
Kubernetes Monitoring: Step-by-Step with Prometheus & Grafana
Install Prometheus + Grafana to Monitor Your K8s Cluster
📊 Monitor your Argo CD–deployed website (running via LoadBalancer) — with Prometheus + Grafana
🔧 View CPU, RAM, pod status, uptime, errors, etc.
🧰 Prerequisites (Before We Start)
Make sure you have these ready 👇
1. A Kubernetes Cluster (EKS, GKE, Minikube — anything works)
2. kubectl is installed and connected to your cluster
3. Helm is installed (helm version)
Install Helm (if not installed)
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
helm version
4. Internet access to pull charts & Docker images
5. (Optional) Argo CD if you want GitOps deployment
If you’re using GitOps, ensure:
☑ Argo CD is already deployed
☑ Your app is deployed using Argo CD
☑ Access to the app via Load Balancer
1️⃣ Create a Namespace for Monitoring
kubectl create namespace monitoring
2️⃣ Add Prometheus & Grafana Helm Chart Repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
3️⃣ Install the Kube Prometheus Stack (Includes Prometheus + Grafana)
helm install kube-prom-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring
🛠️ This installs:
Prometheus (metrics collector)
Grafana (dashboard visualizer)
Alertmanager (for warnings)
Node exporters (to get node metrics)
4️⃣ Check That Everything Is Running
kubectl get pods -n monitoring
You will get output like this :
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prom-stack-kube-prome-alertmanager-0 2/2 Running 0 2m45s
kube-prom-stack-grafana-d5dfd9fd-m5j9t 3/3 Running 0 3m19s
kube-prom-stack-kube-prome-operator-6779bc5685-llmc8 1/1 Running 0 3m19s
kube-prom-stack-kube-state-metrics-6c4dc9d54-w48xj 1/1 Running 0 3m19s
kube-prom-stack-prometheus-node-exporter-vhncz 1/1 Running 0 3m19s
kube-prom-stack-prometheus-node-exporter-vx56f 1/1 Running 0 3m19s
prometheus-kube-prom-stack-kube-prome-prometheus-0 2/2 Running 0 2m45s
✅ Wait until STATUS is Running.
5️⃣. Accessing the Grafana UI Using LoadBalancer
Prometheus stack exposes Grafana as an internal service by default. Let’s expose it to the world 🌍.
Edit the Grafana Service
kubectl edit svc kube-prom-stack-grafana -n monitoring
Change the Service Type :
Find this line:
type: ClusterIP
Change it to:
type: LoadBalancer
Save and exit (:wq for vi).
6️⃣. Get the Grafana LoadBalancer IP
kubectl get svc kube-prom-stack-grafana -n monitoring
You will get output like this :
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-prom-stack-grafana LoadBalancer 172.20.174.208 abbda6b6f6c9345c6b017c020cf00122-1809047356.us-east-1.elb.amazonaws.com 80:32242/TCP 5m39s
📌 Copy the EXTERNAL-IP, it will look like: http://a1b2c3d4.us-east-1.elb.amazonaws.com
7️⃣ Accessing the Grafana UI
Open Your Browser
Copy the EXTERNAL-IP (the long .elb.amazonaws.com address).
Paste it into your browser:
You’ll see the Grafana login page!
- 🔐 Get the Initial Grafana Admin Password
kubectl get secret kube-prom-stack-grafana -n monitoring -o jsonpath="{.data.admin-password}" | base64 -d && echo
You will get output like this :
prom-operator
Login to Grafana :
Go to your LoadBalancer IP in browser.
Default credentials:
Username: admin
Password: prom-operator
👉 Change the password when prompted.
8️⃣ Add Kubernetes Dashboards in Grafana
📊 Go to:
Left menu → Dashboards → Import
Paste a Dashboard ID:
In the text box under “Import via grafana.com”, paste one of the IDs below.
Otherwise, you can download the JSON files in this repository and upload them on the import screen.
Kubernetes Cluster Monitoring (ID: 315)
Kubernetes Pods/Containers (ID: 3662)
Kubernetes Deployments (ID: 1621)
Kubernetes API Server (ID: 12006)
Kubernetes Nodes (ID: 6417)
Kubernetes Namespace Monitoring (ID: 10000)
Kubernetes Persistent Volumes (ID: 13602)
Kubernetes Networking (ID: 15758)
NGINX Ingress Controller (ID: 9614)
Click Load
Grafana expects one dashboard ID at a time in the “Grafana.com dashboard URL or ID” field.
For example:
- You can enter 315, click Load, and then import that dashboard.
- You must repeat this for each dashboard ID (e.g., 3662, 1621, etc.).
2. Select Data source
On the next screen:
You’ll see a dropdown Prometheus Data Source.
Choose Prometheus (it’s already installed with kube-prometheus-stack).
Then click Import.
3. View the Dashboard
After importing, the dashboard will automatically open.
You’ll now see:
CPU usage per Node (which node is using the most CPU).
Memory usage per Pod (how much RAM each pod is using).
Cluster Uptime.
Requests, errors, etc.
9️⃣ See Your Argo CD App Metrics!
All apps running in your cluster (including ones deployed via Argo CD) are automatically monitored!
You can import the Argo CD dashboard from Grafana.com:
Go to Dashboards → Import.
Use Dashboard ID 14584 (Argo CD Official Dashboard).
Select Prometheus as the data source.
[Optional] Configure Alerts (Email Notifications)
When CPU usage (or any other metric) goes beyond a threshold, you’ll receive an email alert from Alertmanager (part of Prometheus stack).
⓵ Confirm Alertmanager is Running
After installing the kube-prometheus-stack:
kubectl get pods -n monitoring
Look for something like:
alertmanager-kube-prom-stack-kube-prome-alertmanager-0
✅ If it’s running, you’re good to go.
⓶ The Alertmanager dashboard provides:
Active Alerts — A list of alerts currently firing (e.g., high CPU usage).
Silences — You can configure silences to suppress certain alerts.
Status — Displays cluster and configuration status.
Receivers — Configured receivers like email, Slack, etc.
Routes — The routing tree for alert notifications.
⓷ Edit the Alertmanager Config
The Alertmanager configuration is stored in a Kubernetes Secret, which is why the commands below read and re-create a secret.
Run:
kubectl get pods -n monitoring | grep alertmanager
Ensure:
alertmanager-kube-prom-stack-kube-prome-alertmanager-0 2/2 Running
✅ To Access Alertmanager via Load Balancer (Externally):
You need to change the service type from ClusterIP to LoadBalancer.
🔧 Step 1: Edit the Alertmanager Service
Run:
kubectl edit svc kube-prom-stack-kube-prome-alertmanager -n monitoring
🔄 Step 2: Change This Line:
Find:
type: ClusterIP
Change it to:
type: LoadBalancer
Save and exit (:wq if using vim)
⏳ Step 3: Wait for External IP
Check again with:
kubectl get svc kube-prom-stack-kube-prome-alertmanager -n monitoring
It will show something like:
kube-prom-stack-kube-prome-alertmanager LoadBalancer 172.20.51.43 abc123456789.elb.amazonaws.com 9093:xxxxx/TCP ...
✅ Copy the DNS under EXTERNAL-IP
🌐 Step 4: Access Alertmanager in Browser
Use:
http://<external-dns>:9093
Example:
http://abc123456789.elb.amazonaws.com:9093
✅ Optional: Open Port 9093 in Security Group
If it doesn’t load:
Go to AWS EC2 Console → Load Balancers
Find the Alertmanager ELB
Open the Security Group
Edit Inbound Rules:
Add rule for TCP 9093
Source: 0.0.0.0/0 (or your IP)
Run these commands:
kubectl get secret alertmanager-kube-prom-stack-kube-prome-alertmanager -n monitoring -o jsonpath='{.data.alertmanager\.yaml}' | base64 --decode > alertmanager.yaml
ls
vim alertmanager.yaml
⓸ Add Email Configuration
Inside alertmanager.yaml, add your SMTP email settings under the global section:
(Replace with your email SMTP provider details.)
global:
  smtp_smarthost: 'smtp.gmail.com:587'            # Your SMTP server
  smtp_from: 'yaswanth.arumulla@gmail.com'        # Sender email
  smtp_auth_username: 'yaswanth.arumulla@gmail.com'
  smtp_auth_password: 'your-app-password'         # Use an app password (not your real password!)
route:
  receiver: 'email-alert'
receivers:
- name: 'email-alert'
  email_configs:
  - to: 'yaswanth.arumulla@gmail.com'             # Where to send alerts
    send_resolved: true
⚠️ For Gmail:
Enable “Less Secure Apps” or create an App Password from Google account security settings.
Re-create the secret with the updated configuration:
kubectl create secret generic alertmanager-kube-prom-stack-kube-prome-alertmanager \
--from-file=alertmanager.yaml \
-n monitoring \
--dry-run=client -o yaml | kubectl apply -f -
⓹ Save and Restart Alertmanager
After editing, restart the Alertmanager pod to apply changes:
kubectl delete pod alertmanager-kube-prom-stack-kube-prome-alertmanager-0 -n monitoring
Wait for Restart
kubectl get pods -n monitoring -w
Look for:
alertmanager-kube-prom-stack-kube-prome-alertmanager-0 2/2 Running 0 30s
⓺ Create an Alert Rule (CPU Example)
We’ll add an alert rule to trigger an email when CPU > 70%.
Create a new YAML file called cpu-alert-rule.yaml:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cpu-alert
  namespace: monitoring
  labels:
    release: kube-prom-stack   # the chart's Prometheus only loads rules carrying its release label
spec:
  groups:
  - name: cpu.rules
    rules:
    - alert: HighCPUUsage
      expr: sum(rate(container_cpu_usage_seconds_total[1m])) > 0.7
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High CPU Usage detected"
        description: "CPU usage is above 70% for 2 minutes."
Apply the rule:
kubectl apply -f cpu-alert-rule.yaml
⓻ Test Your Alert
Run a CPU-heavy process in a pod to simulate load (a sketch follows below).
Wait 2–3 minutes.
Check your email — you should receive an alert!
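One hypothetical way to simulate that load from the jumphost:
# Throwaway pod that spins one CPU; the HighCPUUsage alert should fire after ~2 minutes.
kubectl run cpu-load --image=busybox --restart=Never -- /bin/sh -c 'while true; do :; done'
# Clean up once the alert has fired:
kubectl delete pod cpu-load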
You can also see alerts in the Prometheus UI:
Step 1: Edit the Prometheus Service
Run:
kubectl edit svc kube-prom-stack-kube-prome-prometheus -n monitoring
Change the Service Type :
Find this line:
type: ClusterIP
Change it to:
type: LoadBalancer
Save and exit (:wq for vi).
Get the Prometheus LoadBalancer IP
kubectl get svc kube-prom-stack-kube-prome-prometheus -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-prom-stack-kube-prome-prometheus LoadBalancer 172.20.146.205 a17530e645b134734ba1cff112072526-1666914053.us-east-1.elb.amazonaws.com 9090:32606/TCP,8080:31797/TCP 3h35m
📌 Copy the EXTERNAL-IP, it will look like: http://a1b2c3d4.us-east-1.elb.amazonaws.com:9090
Done!
Now, you’ll get email alerts whenever CPU usage crosses the limit.
🎉 Final Checklist
✅ Prometheus & Grafana Installed
✅ Grafana Accessible via LoadBalancer
✅ Kubernetes Metrics Visible
✅ Argo CD Deployed App Visible
✅ Dashboards Working
✅ Optional Alerts Configured
Bonus: What You Can Monitor
✅ CPU/RAM of your Argo CD app
✅ Pod crashes/restarts
✅ Node health
✅ Cluster capacity
✅ Response times
✅ Resource usage per container


