From Terraform to GitOps: Building an End-to-End DevOps Platform with 11 Microservices

Introduction

Modern DevOps projects are not just about making things work once — they are about making systems repeatable, debuggable, and reliable.

While building an end-to-end microservices platform with CI/CD, Kubernetes, ArgoCD, monitoring, and security tools, I first needed a DevOps jumphost: a powerful EC2 instance with all required tooling installed.

This article focuses on:

How I provisioned the jumphost using Terraform
Why my initial cloud-init Bash automation failed
How I migrated to Ansible for configuration management
The real errors I faced and how I fixed them

If you are building DevOps labs, platforms, or production infrastructure, these lessons will save you days of debugging.
**Architecture Overview (brief)**

**Terraform provisions:**
VPC + Subnets + Security Groups
IAM Role & Instance Profile
EC2 jumphost (Ubuntu)

**Ansible configures:**
Docker
Jenkins
Terraform
Ansible
kubectl, eksctl
Helm
Trivy
AWS CLI
Databases & other tooling
Terraform handles infrastructure.
Ansible handles configuration.
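
As a hedged illustration of the Terraform side (the resource names, AMI ID, and sizing below are placeholders, not the repo's exact code), the jumphost boils down to something like:

resource "aws_instance" "jumphost" {
  ami                    = "ami-0abc1234def567890"          # placeholder Ubuntu AMI for your region
  instance_type          = "t2.large"                       # placeholder size
  subnet_id              = aws_subnet.public.id             # from the VPC resources above
  vpc_security_group_ids = [aws_security_group.jumphost.id]
  iam_instance_profile   = aws_iam_instance_profile.jumphost.name

  tags = {
    Name = "devops-jumphost"
  }
}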

Bonus: Monitor Your AWS Cost in Real-Time While Building This Project (Using AWS CostWatch)
While you build this project, costs from EKS, EC2, ECR, and Jenkins can grow silently. Rather than panicking at a surprise bill, watch your spend live. That's why I built AWS CostWatch, and I strongly recommend running it locally while working through this project.

This works for:

Free Tier users
Paid AWS users
Students & learners
DevOps engineers testing infrastructure

It runs on your own laptop terminal and keeps monitoring AWS in the background.

Step-by-Step: Run AWS CostWatch on Your Laptop

You don’t need any server. Just your laptop terminal.

1. Clone the repository

git clone https://github.com/vsaraths/AWS-Cost-Watch.git
cd AWS-Cost-Watch

2. Create a Python virtual environment (recommended)

python3 -m venv venv
source venv/bin/activate

**For Windows:**

venv\Scripts\activate

3. Install dependencies

pip install -r requirements.txt

or manually:

pip install boto3 rich sqlite-utils

4. Configure AWS credentials
Make sure this works:

aws sts get-caller-identity

If not:

aws configure

Provide:
Access Key
Secret Key
Region
Output format: json

5. Run CostWatch

python3 aws_cost_dashboard.py

Once started:

It instantly scans your AWS account
Displays the green live dashboard
Auto-refreshes every 10 minutes
Saves history to SQLite

Leave this terminal open. It will keep running in the background while you work on Terraform, EKS, Jenkins, etc.

So, let's start our project.

Step 1: Clone the Repository

git clone https://github.com/vsaraths/<your-repo>.git
cd <your-repo>

Step 2: Configure AWS Credentials

aws configure

Provide:

Access Key ID
Secret Access Key
Region (e.g., us-east-1)
Output format: json

Navigate into the Project

ls
cd Microservices-E-Commerce-eks-project
ls

Step 3: Create S3 Backend for Terraform

cd s3-buckets
terraform init
terraform apply -auto-approve

This stores Terraform state remotely and enables safe, collaborative infrastructure management.
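
For reference, a minimal backend block looks like the following (the bucket name and key are placeholders; match them to the bucket created by the s3-buckets step):

terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket"   # placeholder: the bucket created above
    key    = "jumphost/terraform.tfstate"
    region = "us-east-1"
  }
}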

Step 3.1: Create Network Infrastructure

Navigate to Terraform EC2 setup:

cd ../terraform_main_ec2
terraform init
terraform plan
terraform apply -auto-approve

Sample output:

Apply complete! Resources: 24 added, 0 changed, 0 destroyed.
jumphost_public_ip = "18.208.229.108"
region = "us-east-1"

Check Terraform state:

terraform state list

Step 4: Create the Jumphost EC2 (Terraform + Ansible)

cd terraform_main_ec2
terraform init
terraform apply -auto-approve

This EC2 acts as the DevOps control plane, hosting:
Jenkins
Terraform
kubectl
Ansible
Docker
AWS CLI

Ansible was used to configure all tools instead of shell scripts, making the setup repeatable and idempotent.
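
As a hedged sketch of what such a playbook looks like (the real playbook lives in the repo; the package and service names below are the standard Ubuntu ones), idempotent tasks are declared like this:

- name: Configure DevOps jumphost
  hosts: all
  become: true
  tasks:
    - name: Install Docker
      ansible.builtin.apt:
        name: docker.io
        state: present
        update_cache: true

    - name: Ensure Docker is started and enabled
      ansible.builtin.service:
        name: docker
        state: started
        enabled: true

Re-running the play changes nothing if the tools are already installed, which is exactly the property the cloud-init Bash approach lacked.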

Step 5: Connect to EC2 and Access Jenkins

From AWS Console → EC2 → Connect → Switch to root:

sudo -i

Verify installed DevOps tools:

git --version
java -version
jenkins --version
terraform -version
mvn -v
kubectl version --client --short
eksctl version
helm version --short
docker --version
trivy --version

Get Jenkins admin password:

cat /var/lib/jenkins/secrets/initialAdminPassword

Example output:

0c39f23132004d508132ae3e0a7c70e4

Step 6: Setup Jenkins
Open:

http://<EC2_PUBLIC_IP>:8080
  • Paste admin password
  • Install suggested plugins
  • Create first user (example: admin)
  • Click through: Save and Continue → Save and Finish → Start using Jenkins

Step 7: Install Jenkins Plugins

  1. Go to Jenkins Dashboard → Manage Jenkins → Plugins.
  2. Click the Available tab.
  3. Search for and install: Pipeline: Stage View
  4. When the installation completes, check Restart Jenkins when installation is complete and no jobs are running.

Step 8: Create a Jenkins Pipeline Job (Create EKS Cluster)

1. Go to Jenkins Dashboard
2. Click New Item
3. Name it: eks-terraform
4. Select: Pipeline
5. Click OK

  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: eks-terraform/eks-jenkinsfile
  • Apply
  • Save
  • Click Build with Parameters
  • ACTION: select Terraform action apply
  • Click Build
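
The actual pipeline is defined in eks-terraform/eks-jenkinsfile in the repo; a hedged, minimal sketch of such a parameterized Terraform pipeline looks like this (stage names are illustrative, not the repo's exact file):

pipeline {
    agent any
    parameters {
        choice(name: 'ACTION', choices: ['apply', 'destroy'], description: 'Terraform action to run')
    }
    stages {
        stage('Checkout') {
            steps {
                git branch: 'main', url: 'https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git'
            }
        }
        stage('Terraform') {
            steps {
                dir('eks-terraform') {
                    sh 'terraform init'
                    // Runs apply or destroy depending on the build parameter
                    sh "terraform ${params.ACTION} -auto-approve"
                }
            }
        }
    }
}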

To verify your EKS cluster, connect to your EC2 jumphost server and run:

aws eks --region us-east-1 update-kubeconfig --name project-eks
kubectl get nodes

Step 9: Create a Jenkins Pipeline Job (Create an Elastic Container Registry (ECR))

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: ecr-terraform
  4. Select: Pipeline
  5. Click OK
  6. Configure the Pipeline section as in Step 8 (same repository), with the Script Path pointing at the ECR Terraform Jenkinsfile, then build the job.

After the job runs, verify the repositories:

aws ecr describe-repositories --region us-east-1

Services:

  • emailservice
  • checkoutservice
  • recommendationservice
  • frontend
  • paymentservice
  • productcatalogservice
  • cartservice
  • loadgenerator
  • currencyservice
  • shippingservice
  • adservice
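
Assuming the Terraform job names each repository exactly after its service (as listed above), a quick loop from the jumphost confirms all 11 exist:

for svc in emailservice checkoutservice recommendationservice frontend \
           paymentservice productcatalogservice cartservice loadgenerator \
           currencyservice shippingservice adservice; do
  # Prints each repository URI, or an error if the repo is missing
  aws ecr describe-repositories --repository-names "$svc" --region us-east-1 \
    --query "repositories[0].repositoryUri" --output text
done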

Step 10: Create a Jenkins Pipeline Job for Build and Push Docker Images to ECR
🔐 Step 10.1: Add GitHub PAT to Jenkins Credentials

  1. Navigate to Jenkins Dashboard → Manage Jenkins → Credentials → (global) → Global credentials (unrestricted).
  2. Click “Add Credentials”.
  3. In the form:
  • Kind: Secret text
  • Secret: <your-GitHub-personal-access-token> (never publish a real token)
  • ID: my-git-pattoken
  • Description: git credentials
  4. Click “OK” to save.
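
A pipeline can then consume this credential without ever echoing it; a hedged example (the repo owner and name are placeholders):

withCredentials([string(credentialsId: 'my-git-pattoken', variable: 'GIT_PAT')]) {
    // Single quotes so the secret is expanded by the shell, not interpolated by Groovy
    sh 'git push https://$GIT_PAT@github.com/<your-user>/<your-repo>.git main'
}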

Step 10.2: Jenkins Pipeline Setup: Build, Push, and Update Docker Images in ECR

Step 10.2.1: Jenkins Pipeline Setup: emailservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: emailservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/emailservice
  • Apply
  • Save
  • Click Build
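
The real stages live in jenkinsfiles/emailservice in the repo; as a hedged sketch, the core of a per-service build-and-push stage typically runs shell steps like these (the account ID, tag, and build context are placeholders):

ECR=<aws-account-id>.dkr.ecr.us-east-1.amazonaws.com
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin $ECR   # authenticate Docker to ECR
docker build -t $ECR/emailservice:v1 src/emailservice   # build the service image
trivy image $ECR/emailservice:v1                        # scan the image for vulnerabilities
docker push $ECR/emailservice:v1                        # push to the ECR repository

The remaining ten services follow the same pattern with their own Jenkinsfile paths.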

Step 10.2.2: Jenkins Pipeline Setup: checkoutservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: checkoutservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/checkoutservice
  • Apply
  • Save
  • Click Build

Step 10.2.3: Jenkins Pipeline Setup: recommendationservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: recommendationservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/recommendationservice
  • Apply
  • Save
  • Click Build

Step 10.2.4: Jenkins Pipeline Setup: frontend

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: frontend
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/frontend
  • Apply
  • Save
  • Click Build

Step 10.2.5: Jenkins Pipeline Setup: paymentservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: paymentservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/paymentservice
  • Apply
  • Save
  • Click Build

Step 10.2.6: Jenkins Pipeline Setup: productcatalogservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: productcatalogservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/productcatalogservice
  • Apply
  • Save
  • Click Build

Step 10.2.7: Jenkins Pipeline Setup: cartservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: cartservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/cartservice
  • Apply
  • Save
  • Click Build

Step 10.2.8: Jenkins Pipeline Setup: loadgenerator

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: loadgenerator
  4. Select: Pipeline
  5. Click OK
  6. Configure the Pipeline section as above, with Script Path: jenkinsfiles/loadgenerator (following the repo's naming pattern), then click Build.

Step 10.2.9: Jenkins Pipeline Setup: currencyservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: currencyservice
  4. Select: Pipeline
  5. Click OK
  6. Configure the Pipeline section as above, with Script Path: jenkinsfiles/currencyservice (following the repo's naming pattern), then click Build.

Step 10.2.10: Jenkins Pipeline Setup: shippingservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: shippingservice
  4. Select: Pipeline
  5. Click OK
  6. Configure the Pipeline section as above, with Script Path: jenkinsfiles/shippingservice (following the repo's naming pattern), then click Build.

Step 10.2.11: Jenkins Pipeline Setup: adservice

  1. Go to Jenkins Dashboard
  2. Click New Item
  3. Name it: adservice
  4. Select: Pipeline
  5. Click OK
  • Pipeline:
  • Definition: Pipeline script from SCM
  • SCM: Git
  • Repositories: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Branches to build: */main
  • Script Path: jenkinsfiles/adservice
  • Apply
  • Save
  • Click Build

🖥️ Step 11: Install ArgoCD in Jumphost EC2
11.1: Create Namespace for ArgoCD

kubectl create namespace argocd

11.2: Install ArgoCD in the Created Namespace

kubectl apply -n argocd \
  -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

11.3: Verify the Installation

Ensure all pods are in Running state.

kubectl get pods -n argocd

11.4: Validate the Cluster

Check your nodes and create a test pod if necessary:

kubectl get nodes


11.5: List All ArgoCD Resources

kubectl get all -n argocd

Sample output:

NAME                                                 READY   STATUS    RESTARTS   AGE
pod/argocd-application-controller-0                      1/1     Running   0          106m
pod/argocd-applicationset-controller-787bfd9669-4mxq6   1/1     Running   0          106m
pod/argocd-dex-server-bb76f899c-slg7k                    1/1     Running   0          106m
pod/argocd-notifications-controller-5557f7bb5b-84cjr     1/1     Running   0          106m
pod/argocd-redis-b5d6bf5f5-482qq                         1/1     Running   0          106m
pod/argocd-repo-server-56998dcf9c-c75wk                  1/1     Running   0          106m
pod/argocd-server-5985b6cf6f-zzgx8                       1/1     Running   0          106m

11.6: Expose ArgoCD Server Using LoadBalancer

11.6.1: Edit the ArgoCD Server Service

kubectl edit svc argocd-server -n argocd

11.6.2: Change the Service Type

Find this line:

type: ClusterIP

Change it to:

type: LoadBalancer

Save and exit (:wq for vi).
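
If you prefer a non-interactive alternative to kubectl edit, the same change can be made with a one-line patch:

kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'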

11.6.3: Get the External Load Balancer DNS

kubectl get svc argocd-server -n argocd

Sample output:

NAME            TYPE           CLUSTER-IP     EXTERNAL-IP                           PORT(S)                          AGE
argocd-server   LoadBalancer   172.20.1.100   a1b2c3d4e5f6.elb.amazonaws.com        80:31234/TCP,443:31356/TCP       2m

11.6.4: Access the ArgoCD UI

Open the load balancer DNS name in your browser:

https://<EXTERNAL-DNS>

11.7: 🔐 Get the Initial ArgoCD Admin Password

kubectl get secret argocd-initial-admin-secret -n argocd \
  -o jsonpath="{.data.password}" | base64 -d && echo

Login details:

Username: admin
Password: (The output of the above command)
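
If the argocd CLI is installed on the jumphost (an optional extra, not covered above), you can log in and inspect apps from the terminal as well:

argocd login <EXTERNAL-DNS> --username admin --password <password-from-above> --insecure
argocd app list   # lists your applications once they are created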

Step 12: Deploying with ArgoCD and Configuring Route 53 (Step-by-Step)
Step 12.1: Create Namespace in EKS (from Jumphost EC2)

Run these commands on your jumphost EC2 server:

kubectl create namespace dev
kubectl get namespaces

Step 12.2: Create a New Application with ArgoCD

  1. Open the ArgoCD UI in your browser.
  2. Click + NEW APP.
  3. Fill in the following:
  • Application Name: project
  • Project Name: default
  • Sync Policy: Automatic
  • Repository URL: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
  • Revision: HEAD
  • Path: kubernetes-files
  • Cluster URL: https://kubernetes.default.svc
  • Namespace: dev
  • Click Create.
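
The same application can be declared as YAML instead of clicking through the UI; a hedged equivalent of the settings above (apply it with kubectl apply -f):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: project
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/vsaraths/Deploy--E-Commerce-Application-eks-microservices-platform-11-Services-.git
    targetRevision: HEAD
    path: kubernetes-files
  destination:
    server: https://kubernetes.default.svc
    namespace: dev
  syncPolicy:
    automated: {}   # mirrors the "Automatic" sync policy chosen in the UI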

Step 13: Create a Jenkins Pipeline Job for the Backend and Frontend & Route 53 Setup

Enable HTTPS for vsarath.site with AWS Classic Load Balancer (CLB)

This guide explains how to configure HTTPS for your domain vsarath.site using AWS Classic Load Balancer (CLB), Route 53, and AWS Certificate Manager (ACM).

✅ Prerequisites

  • A working application (e.g., on EC2 or Kubernetes).
  • A registered domain: vsarath.site
  • The domain is managed in Route 53 as a Public Hosted Zone.
  • Your Classic Load Balancer is running and serving HTTP on port 80 or 8080.

Set up the hosted zone:

  1. Go to AWS Route 53.
  2. Create a Hosted Zone:
  • Domain: vsarath.site
  • Type: Public Hosted Zone
  3. Update the nameservers at your registrar (Hostinger) by pasting the 4 NS records from Route 53:

  • ns-865.awsdns-84.net
  • ns-1995.awsdns-97.co.uk
  • ns-1418.awsdns-59.org
  • ns-265.awsdns-73.com

Step 1: Request a Public Certificate in ACM

  1. Go to AWS Certificate Manager (ACM).
  2. Click Request Certificate.
  3. Choose Request a Public Certificate.
  4. Enter the domain names:
  • vsarath.site
  • www.vsarath.site (optional)
  5. Choose DNS validation.
  6. Click Request.
  7. After the request:
  • Click Create DNS record in Route 53.
  • ACM will create the DNS validation CNAME record.
  8. Wait a few minutes until the status becomes Issued.
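
You can also watch the validation from the CLI; the certificate ARN below is a placeholder taken from the list command:

aws acm list-certificates --region us-east-1
aws acm describe-certificate --region us-east-1 \
  --certificate-arn <certificate-arn> \
  --query "Certificate.Status"   # should eventually print "ISSUED"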

Step 2: Add HTTPS Listener to CLB

  1. Go to EC2 Console > Load Balancers.
  2. Select your Classic Load Balancer.
  3. Go to the Listeners tab.
  4. Click Add Listener (or edit the existing 443 listener):
  • Protocol: HTTPS
  • Load Balancer Port: 443
  • Instance Protocol: HTTP (or HTTPS if applicable)
  • Instance Port: 80 (or 8080 if your app runs there)
  • SSL Certificate: choose the one for vsarath.site
  • Security Policy: select ELBSecurityPolicy-2021-06
  5. Click Save.

Step 3: Update Security Group Rules
Go to your EC2 or Load Balancer Security Group:

  • Add Inbound Rule:
  • Type: HTTPS
  • Protocol: TCP
  • Port: 443
  • Source: 0.0.0.0/0

Ensure existing rules allow HTTP (port 80) or your backend port.

Step 4: Configure DNS in Route 53

  1. In ArgoCD UI, open your project application.
  2. Click on frontend and copy the hostname (e.g., acfb06fba08834577a50e43724d328e3-1568967602.us-east-1.elb.amazonaws.com).
  3. Go to Route 53 > Hosted Zones.
  4. Select vsarath.site.
  5. Click Create Record:
  • Record name: leave blank (for root domain)
  • Record type: A — Routes traffic to an IPv4 address and AWS resource
  • Alias: Yes
  • Alias target: Choose Application and Classic Load Balancer
  • Region: US East (N. Virginia)
  • Alias target value: Paste the frontend load balancer DNS (from step 2)
  • Click Create Record.

Step 5: Test Your Setup
Using Browser

Visit:
https://vsarath.site

You should see your application load securely over HTTPS.

Using curl

curl -v https://vsarath.site

Expect HTTP 200 OK or the actual page content.

If everything is okay, the website loads securely over HTTPS.

If you have no DNS or domain, you can still access the website directly via the load balancer DNS name.

Troubleshooting
HTTPS times out?

  • Check that port 443 is open in the Security Group.
  • Make sure your app is reachable from the CLB.
  • The ACM certificate must be in Issued status.

HTTP works but HTTPS doesn’t?

  • The listener or certificate may not be configured properly.
  • Check that the load balancer health check passes.

Monitoring Steps

Kubernetes Monitoring: Step-by-Step with Prometheus & Grafana

Install Prometheus + Grafana to Monitor Your K8s Cluster
📊 Monitor your Argo CD–deployed website (running via LoadBalancer) — with Prometheus + Grafana
🔧 View CPU, RAM, pod status, uptime, errors, etc.

🧰 Prerequisites (Before We Start)
Make sure you have these ready 👇
1️. A Kubernetes Cluster (EKS, GKE, Minikube — anything works)
2️. kubectl is installed and connected to your cluster
3️. Helm is installed (helm version)

Install Helm (if not installed)

curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash

helm version

4️. Internet access to pull charts & Docker images
5️. (Optional) Argo CD if you want GitOps deployment
If you’re using GitOps, ensure:
☑ Argo CD is already deployed
☑ Your app is deployed using Argo CD
☑ Access to the app via Load Balancer

1️⃣ Create a Namespace for Monitoring

kubectl create namespace monitoring

2️⃣ Add Prometheus & Grafana Helm Chart Repo

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

3️⃣ Install the Kube Prometheus Stack (Includes Prometheus + Grafana)

helm install kube-prom-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring

🛠️ This installs:

Prometheus (metrics collector)
Grafana (dashboard visualizer)
Alertmanager (for warnings)
Node exporters (to get node metrics)
4️⃣ Check That Everything Is Running

kubectl get pods -n monitoring
You will get output like this:
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-kube-prom-stack-kube-prome-alertmanager-0   2/2     Running   0          2m45s
kube-prom-stack-grafana-d5dfd9fd-m5j9t                   3/3     Running   0          3m19s
kube-prom-stack-kube-prome-operator-6779bc5685-llmc8     1/1     Running   0          3m19s
kube-prom-stack-kube-state-metrics-6c4dc9d54-w48xj       1/1     Running   0          3m19s
kube-prom-stack-prometheus-node-exporter-vhncz           1/1     Running   0          3m19s
kube-prom-stack-prometheus-node-exporter-vx56f           1/1     Running   0          3m19s
prometheus-kube-prom-stack-kube-prome-prometheus-0       2/2     Running   0          2m45s

✅ Wait until STATUS is Running.

5️⃣. Accessing the Grafana UI Using LoadBalancer
Prometheus stack exposes Grafana as an internal service by default. Let’s expose it to the world 🌍.

Edit the Grafana Service

kubectl edit svc kube-prom-stack-grafana -n monitoring

Change the Service Type:
Find this line:

type: ClusterIP

Change it to:

type: LoadBalancer

Save and exit (:wq for vi).

6️⃣. Get the Grafana LoadBalancer IP

kubectl get svc kube-prom-stack-grafana -n monitoring

You will get output like this:

NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP                                                                PORT(S)        AGE
kube-prom-stack-grafana   LoadBalancer   172.20.174.208   abbda6b6f6c9345c6b017c020cf00122-1809047356.us-east-1.elb.amazonaws.com   80:32242/TCP   5m39s

📌 Copy the EXTERNAL-IP; it will look like: http://a1b2c3d4.us-east-1.elb.amazonaws.com

7️⃣ Accessing the Grafana UI
Open Your Browser
Copy the EXTERNAL-IP (the long .elb.amazonaws.com address).
Paste it into your browser:

You’ll see the Grafana login page!

🔐 Get the Initial Grafana Admin Password:

kubectl get secret kube-prom-stack-grafana -n monitoring -o jsonpath="{.data.admin-password}" | base64 -d && echo

You will get output like this:

prom-operator

Login to Grafana:
Go to your Grafana LoadBalancer DNS in the browser.
Default credentials:
Username: admin
Password: prom-operator
👉 Change the password when prompted.

8️⃣ Add Kubernetes Dashboards in Grafana
📊 Go to:
Left menu → Dashboards → + Import → New Dashboard
Paste a Dashboard ID:

In the text box under “Import via Grafana.com”, paste one of the IDs below. Alternatively, you can download the JSON files in this repository and upload them in the dashboard import screen.

Kubernetes Cluster Monitoring (ID: 315)
Kubernetes Pods/Containers (ID: 3662)
Kubernetes Deployments (ID: 1621)
Kubernetes API Server (ID: 12006)
Kubernetes Nodes (ID: 6417)
Kubernetes Namespace Monitoring (ID: 10000)
Kubernetes Persistent Volumes (ID: 13602)
Kubernetes Networking (ID: 15758)
NGINX Ingress Controller (ID: 9614)

Click Load
Grafana expects one dashboard ID at a time in the “Grafana.com dashboard URL or ID” field.
For example:

  1. You can enter 315, click Load, and then import that dashboard.
  2. You must repeat this for each dashboard ID (e.g., 3662, 1621, etc.).

2. Select Data source
On the next screen:
You’ll see a dropdown Prometheus Data Source.
Choose Prometheus (it’s already installed with kube-prometheus-stack).
Then click Import.

3. View the Dashboard
After importing, the dashboard will automatically open.
You’ll now see:
CPU usage per Node (which node is using the most CPU).
Memory usage per Pod (how much RAM each pod is using).
Cluster Uptime.
Requests, errors, etc.

9️⃣ See Your Argo CD App Metrics!
All apps running in your cluster (including ones deployed via Argo CD) are automatically monitored!

You can import the Argo CD dashboard from Grafana.com:

Go to Dashboards → Import.
Use Dashboard ID 14584 (Argo CD Official Dashboard).
Select Prometheus as the data source.

[Optional] Configure Alerts (Email Notifications)
When CPU usage (or any other metric) goes beyond a threshold, you’ll receive an email alert from Alertmanager (part of Prometheus stack).

⓵ Confirm Alertmanager is Running

After installing the kube-prometheus-stack:

kubectl get pods -n monitoring

Look for something like:

alertmanager-kube-prom-stack-kube-prome-alertmanager-0

✅ If it’s running, you’re good to go.
⓶ The Alertmanager dashboard provides:
Active Alerts — A list of alerts currently firing (e.g., high CPU usage).
Silences — You can configure silences to suppress certain alerts.
Status — Displays cluster and configuration status.
Receivers — Configured receivers like email, Slack, etc.
Routes — The routing tree for alert notifications.
⓷ Edit the Alertmanager Configuration

In kube-prometheus-stack, the Alertmanager configuration is stored in a Secret (not a ConfigMap). First confirm the pod is up:

kubectl get pods -n monitoring | grep alertmanager

Ensure:

alertmanager-kube-prom-stack-kube-prome-alertmanager-0 2/2 Running
✅ To Access Alertmanager via Load Balancer (Externally):

You need to change the service type from ClusterIP to LoadBalancer.

🔧 Step 1: Edit the Alertmanager Service

Run:

kubectl edit svc kube-prom-stack-kube-prome-alertmanager -n monitoring

🔄 Step 2: Change This Line:

Find:

type: ClusterIP


Change it to:

type: LoadBalancer

Save and exit (:wq if using vim)

⏳ Step 3: Wait for External IP

Check again with:


kubectl get svc kube-prom-stack-kube-prome-alertmanager -n monitoring

It will show something like:

kube-prom-stack-kube-prome-alertmanager   LoadBalancer   172.20.51.43   abc123456789.elb.amazonaws.com   9093:xxxxx/TCP   ...

✅ Copy the DNS under EXTERNAL-IP

🌐 Step 4: Access Alertmanager in Browser

Use:

http://<external-dns>:9093

Example:

http://abc123456789.elb.amazonaws.com:9093

✅ Optional: Open Port 9093 in Security Group

If it doesn’t load:

Go to AWS EC2 Console → Load Balancers
Find the Alertmanager ELB
Open the Security Group
Edit Inbound Rules:
Add rule for TCP 9093
Source: 0.0.0.0/0 (or your IP)
Then run these commands to extract the current Alertmanager configuration:

kubectl get secret alertmanager-kube-prom-stack-kube-prome-alertmanager -n monitoring -o jsonpath='{.data.alertmanager\.yaml}' | base64 --decode > alertmanager.yaml
ls
vim alertmanager.yaml

⓸ Add Email Configuration

Inside alertmanager.yaml, add your SMTP email settings:
(Replace with your email SMTP provider details.)

global:
  smtp_smarthost: 'smtp.gmail.com:587'      # Your SMTP server
  smtp_from: 'yaswanth.arumulla@gmail.com'         # Sender email
  smtp_auth_username: 'yaswanth.arumulla@gmail.com'
  smtp_auth_password: 'your-app-password'   # Use app password (not your real password!)
route:
  receiver: 'email-alert'
receivers:
  - name: 'email-alert'
    email_configs:
      - to: 'yaswanth.arumulla@gmail.com'      # Where to send alerts
        send_resolved: true

⚠️ For Gmail:

Create an App Password from your Google account security settings (Google no longer supports “Less Secure Apps”).

Then write the updated configuration back to the Secret:

kubectl create secret generic alertmanager-kube-prom-stack-kube-prome-alertmanager \
  --from-file=alertmanager.yaml \
  -n monitoring \
  --dry-run=client -o yaml | kubectl apply -f -

⓹ Save and Restart Alertmanager

After editing, restart the Alertmanager pod to apply changes:

kubectl delete pod alertmanager-kube-prom-stack-kube-prome-alertmanager-0 -n monitoring

Wait for Restart

kubectl get pods -n monitoring -w

Look for:

alertmanager-kube-prom-stack-kube-prome-alertmanager-0   2/2   Running   0   30s

⓺ Create an Alert Rule (CPU Example)

We’ll add an alert rule to trigger an email when CPU > 70%.

Create a new YAML file called cpu-alert-rule.yaml:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cpu-alert
  namespace: monitoring
spec:
  groups:
    - name: cpu.rules
      rules:
        - alert: HighCPUUsage
          expr: sum(rate(container_cpu_usage_seconds_total[1m])) > 0.7
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: "High CPU Usage detected"
            description: "CPU usage is above 70% for 2 minutes."
Apply the rule:

kubectl apply -f cpu-alert-rule.yaml

⓻ Test Your Alert

Run a CPU-heavy process in a pod (simulate load).
Wait 2–3 minutes.
Check your email — you should receive an alert!
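
One hedged way to generate load is a throwaway busy-loop pod (depending on node size, you may need more than one to push the cluster-wide rate above 0.7):

kubectl run cpu-burn --image=busybox --restart=Never -- /bin/sh -c "while true; do :; done"
# ...once you have seen the alert fire, clean up:
kubectl delete pod cpu-burn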
You can also see alerts in the Prometheus UI:

Step 1: Edit the Prometheus Service

Run:

kubectl edit svc kube-prom-stack-kube-prome-prometheus -n monitoring

Change the Service Type:

Find this line:

type: ClusterIP

Change it to:

type: LoadBalancer

Save and exit (:wq for vi).

Get the Prometheus LoadBalancer IP

kubectl get svc kube-prom-stack-kube-prome-prometheus -n monitoring

NAME                                    TYPE           CLUSTER-IP       EXTERNAL-IP                                                                PORT(S)                         AGE
kube-prom-stack-kube-prome-prometheus   LoadBalancer   172.20.146.205   a17530e645b134734ba1cff112072526-1666914053.us-east-1.elb.amazonaws.com   9090:32606/TCP,8080:31797/TCP   3h35m

📌 Copy the EXTERNAL-IP, it will look like: http://a1b2c3d4.us-east-1.elb.amazonaws.com:9090

Done!

Now, you’ll get email alerts whenever CPU usage crosses the limit.

🎉 Final Checklist

✅ Prometheus & Grafana Installed
✅ Grafana Accessible via LoadBalancer
✅ Kubernetes Metrics Visible
✅ Argo CD Deployed App Visible
✅ Dashboards Working
✅ Optional Alerts Configured

Bonus: What You Can Monitor

✅ CPU/RAM of your Argo CD app
✅ Pod crashes/restarts
✅ Node health
✅ Cluster capacity
✅ Response times
✅ Resource usage per container
