Kubernetes Cost Optimization: How We Saved £1.2 Million in 9 Months — Without Turning Anything Off
By Meena Nukala
Senior DevOps Engineer | 10+ years | AWS DevOps Engineer Professional, CKA, CKS, Terraform Associate & 4 more
Published: 11 December 2025
In early 2024 our monthly AWS bill for Kubernetes clusters hit £420,000 — and we still had developers complaining about throttled pods.
Nine months later the same workloads cost £220,000/month.
We never shut down a single business-critical service, never forced spot instances on anyone, and never compromised SLAs.
Here’s exactly how we cut the bill by 47.6 % (£1.2 M annualized) using tools that are available today, in 2025.
The Starting Point (Jan 2024)
- 28 EKS clusters (1.27 → 1.29)
- 4,800 vCPU & 18 TiB memory provisioned
- Average node utilization: 34 %
- Spot usage: < 8 %
- Monthly bill: £420 k
The 5 Levers That Actually Moved the Needle
1. Karpenter + Intelligent Consolidation (Biggest single win: £480 k/year)
We replaced Cluster Autoscaler with Karpenter 1.0 (which reached its stable release in 2024).
Key settings that paid for themselves in week one:
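apiVersion: karpenter.sh/v1
kind: NodePool                 # named Provisioner in older Karpenter releases
metadata:
  name: default
spec:
  weight: 100
  template:
    spec:
      expireAfter: 720h        # recycle nodes after 30 days
      nodeClassRef:            # assumes an EC2NodeClass named "default" exists
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 120s     # instead of "Never"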
Result: Karpenter deleted 40–60 % of idle nodes every night and re-packed workloads onto fewer, cheaper instances. No manual bin-packing required.
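If you want consolidation to do most of its work overnight, as it did for us, Karpenter v1 disruption budgets are one way to express that. Treat the manifest below as a minimal sketch, not our exact configuration: the NodePool name, the EC2NodeClass called `default`, the 20 % cap, and the weekday window (in UTC) are all assumptions made for illustration.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nightly-consolidation    # hypothetical name
spec:
  template:
    spec:
      nodeClassRef:              # assumes an EC2NodeClass named "default" exists
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 120s
    budgets:
      # At any time, voluntarily disrupt at most 20% of this pool's nodes at once.
      - nodes: "20%"
      # For 10 hours starting 08:00 UTC on weekdays, allow no voluntary
      # disruption (consolidation, drift). The most restrictive active budget wins.
      - nodes: "0"
        schedule: "0 8 * * mon-fri"
        duration: 10h
```

Budgets only throttle voluntary disruption, so spot interruptions and node expiry are not held back by the daytime window.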
2. Vertical Pod Autoscaler + Goldilocks (Saved £310 k/year)
We ran Goldilocks (an open-source tool that surfaces VPA recommendations) in every namespace for 2 weeks, then applied 98 % of its suggestions automatically via a custom controller.
Before vs After (average across 1,200 pods):
| Resource | Old Request | New Request | Reduction |
|----------|-------------|-------------|-----------|
| CPU | 1.8 vCPU | 0.94 vCPU | 48 % |
| Memory | 6.2 GiB | 3.8 GiB | 39 % |
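For readers who have not used either tool, here is roughly what the recommendation-only phase looks like. This is a sketch under assumptions: the namespace `payments`, the Deployment `checkout-api`, and the min/max bounds are hypothetical, and the VPA CRDs and recommender must already be installed. The custom controller that applied the suggestions is not shown.

```yaml
# Opt a namespace in to Goldilocks. It then creates VPA objects in
# recommendation-only mode for the workloads it finds there.
apiVersion: v1
kind: Namespace
metadata:
  name: payments                          # hypothetical namespace
  labels:
    goldilocks.fairwinds.com/enabled: "true"
---
# The hand-written equivalent for a single workload.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: checkout-api                      # hypothetical workload
  namespace: payments
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-api
  updatePolicy:
    updateMode: "Off"                     # recommend only; no pods are evicted
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:                       # illustrative guard rails
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "2"
          memory: 8Gi
```

Keeping `updateMode: "Off"` during the observation window means nothing changes in the cluster until you decide to act on the numbers.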
3. Spot Instances Done Right (£220 k/year)
We didn’t just “turn on spot” — we made it safe:
- Karpenter NodePools (formerly provisioners) that fall back to on-demand in < 90 s
- Pod Disruption Budgets + node-group taints
- Critical workloads stayed on-demand, everything else spot
Final mix: 78 % spot, 22 % on-demand (zero forced evacuations in 9 months).
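The two workload-side guard rails from the list above can be sketched as follows. The service name `ledger-api`, the replica count, the image, and the resource requests are placeholders, and the node selector assumes a NodePool that allows the `on-demand` capacity type.

```yaml
# Never take down more than one replica at a time during voluntary disruption.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ledger-api                  # hypothetical critical service
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: ledger-api
---
# Pin a critical workload to on-demand capacity via Karpenter's well-known label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ledger-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ledger-api
  template:
    metadata:
      labels:
        app: ledger-api
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: on-demand
      containers:
        - name: app
          image: registry.example.com/ledger-api:1.0.0   # placeholder image
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

Workloads without the node selector remain free to land on spot nodes, which is what keeps the default cheap and the exception explicit.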
4. Right-Sizing Unused Reserved Instances & Savings Plans (£78 k/year)
We had £1.4 M in unused RIs from 2022.
A scripted monthly analysis flagged what we could offload: we sold £680 k of them on the AWS Reserved Instance Marketplace and bought flexible Compute Savings Plans instead.
5. Storage & Networking (The “free” £110 k/year)
- Switched default gp2 volumes to gp3 (saved 20 % automatically)
- Enabled prefix delegation in the Amazon VPC CNI → reduced ENI count by 62 % → fewer NAT gateway hours
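Making gp3 the default for new volumes usually comes down to one StorageClass. The sketch below assumes the EBS CSI driver is installed; the class name `gp3` is simply a convention.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    # Marks this class as the cluster default for PVCs that name no class.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com          # requires the AWS EBS CSI driver
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer   # create the volume in the pod's AZ
allowVolumeExpansion: true
```

Remember to remove the default annotation from the old gp2 class so the cluster does not end up with two defaults. A new default only affects newly created volumes, so existing gp2 volumes still have to be migrated separately.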
The Dashboard Everyone Loved
Public Grafana dashboard (feel free to import):
https://github.com/meenanukala/eks-cost-dashboard
Key panels we watched religiously:
- Daily cost per cluster (Cost Explorer + Prometheus)
- Karpenter consolidation events per hour
- Spot termination notices (zero in 9 months)
- Node utilization heat-map
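If you want to rebuild the utilization panels yourself, Prometheus recording rules along these lines are a reasonable starting point. This is a sketch, not the contents of the dashboard repo: the group and rule names are made up, and it assumes kube-state-metrics (v2 metric names) and node_exporter are being scraped.

```yaml
groups:
  - name: cost-visibility            # hypothetical group name
    rules:
      # Share of allocatable CPU that pods have requested, cluster-wide.
      - record: cluster:cpu_requests:allocatable_ratio
        expr: |
          sum(kube_pod_container_resource_requests{resource="cpu"})
            /
          sum(kube_node_status_allocatable{resource="cpu"})
      # Share of allocatable memory that pods have requested, cluster-wide.
      - record: cluster:memory_requests:allocatable_ratio
        expr: |
          sum(kube_pod_container_resource_requests{resource="memory"})
            /
          sum(kube_node_status_allocatable{resource="memory"})
      # Fraction of CPU time each node actually spent busy over 5 minutes;
      # this is the series a per-node heat-map would plot.
      - record: instance:node_cpu_utilization:ratio
        expr: |
          1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
```

Comparing the requested ratio with the busy ratio is what exposes over-provisioning: a cluster that is heavily requested but lightly used is a candidate for smaller requests, not more nodes.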
Final Numbers (Sept 2024 — Audited by Finance)
| Category | Monthly Saving | Annualized |
|---|---|---|
| Karpenter consolidation | £40,000 | £480 k |
| VPA + Goldilocks | £26,000 | £310 k |
| Safe spot usage | £18,000 | £220 k |
| Storage & networking | £9,000 | £110 k |
| RI/SP rebalancing | £6,500 | £78 k |
| Total | £99,500 | £1.2 M |
The One-Page Playbook You Can Run Next Week
- Deploy Karpenter → enable consolidation
- Install Goldilocks → auto-apply VPA recommendations after 14 days
- Create spot-first NodePools with 90 s fallback to on-demand
- Run my open-source cost-optimization GitHub Action nightly
- Sleep (your cloud bill is now on a diet)
Full working repo with all manifests, dashboards, and the exact GitHub Action:
https://github.com/meenanukala/eks-cost-optimization-2025
Closing Thought
In 2025, running Kubernetes without active cost governance is the new performance anti-pattern. The tools are mature, open-source, and boringly reliable.
The only thing stopping most companies from saving seven figures is someone willing to own it.
I just did.
— Meena Nukala
Senior DevOps Engineer | London → Sydney bound 2026
GitHub: github.com/meena-nukala-devops
LinkedIn: linkedin.com/in/meena-nukala