published: João Vitor Nascimento de mendonça
description: How to move from "scaling at all costs" to "resource stewardship" using VPA, Graviton, and Spot Instances.
tags: aws, kubernetes, finops, architecture
cover_image: https://dev-to-uploads.s3.amazonaws.com/uploads/articles/example-cloud-cost.png
series: Modern Infrastructure Series
1. The Fallacy of "Infinite" Cloud

In the early days of cloud adoption, the mantra was "scale at all costs." In 2026, the industry has shifted. The new gold standard is Resource Stewardship.

I recently audited a microservices environment where the cloud bill was growing faster than the user base. The culprit wasn't traffic; it was over-provisioning and a lack of cost-awareness in the development cycle. Here is how we re-engineered the platform for efficiency.

2. Data-Driven Right-Sizing

We stopped relying on "gut feelings" for Kubernetes resource requests. Instead of guessing how much CPU and memory a service needed, we ran Vertical Pod Autoscaler (VPA) in recommendation mode (updateMode: "Off"), where it reports suggested values without evicting pods.

The Insight: We discovered that 60% of our services were using less than 20% of their allocated CPU.

The Action: We automated the adjustment of requests and limits to match real-world P95 usage.

The Result: A 35% reduction in wasted cluster capacity overnight.

3. Embracing Spot Instances and ARM64

We re-architected non-critical workloads to run on Amazon EC2 Spot Instances paired with AWS Graviton (ARM64) processors.

Handling Interruptions

To use Spot Instances safely, we implemented graceful shutdown handlers so that workers can save their state within the 2-minute interruption notice:

```go
// Simplified logic for Spot interruption handling.
import (
	"log"
	"os"
	"os/signal"
	"syscall"
)

// handleTermination blocks until SIGTERM, persists in-flight work, then exits.
// cache and currentJobs are application-specific and defined elsewhere.
func handleTermination() {
	termChan := make(chan os.Signal, 1) // buffered so the signal is not missed
	signal.Notify(termChan, syscall.SIGTERM)
	<-termChan // SIGTERM received: the node is being drained
	log.Println("Spot instance terminating. Shifting state to Redis...")
	// Drain connections and save state before the instance is reclaimed
	cache.SaveWorkerState(currentJobs)
	os.Exit(0)
}
```
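One detail worth sketching: the handler above only fires if something on the node turns the Spot notice into a SIGTERM (for example a node termination handler draining the node, which is an assumption about the cluster setup). The notice itself is published in the instance metadata at `/latest/meta-data/spot/instance-action` as a small JSON document with an `action` and a `time`. The snippet below is a minimal sketch of how a worker could parse that document and work out how long it has left to drain. The names `instanceAction`, `parseInstanceAction`, and `drainBudget`, the sample payload, and the 15-second safety margin are illustrative and not part of our production code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// instanceAction mirrors the JSON served by the instance metadata service
// once a Spot interruption has been scheduled.
type instanceAction struct {
	Action string    `json:"action"` // e.g. "terminate", "stop", "hibernate"
	Time   time.Time `json:"time"`   // RFC 3339 timestamp of the interruption
}

// parseInstanceAction decodes the metadata payload and rejects documents
// that carry no action.
func parseInstanceAction(body []byte) (instanceAction, error) {
	var ia instanceAction
	if err := json.Unmarshal(body, &ia); err != nil {
		return instanceAction{}, fmt.Errorf("decode instance-action: %w", err)
	}
	if ia.Action == "" {
		return instanceAction{}, errors.New("instance-action has no action field")
	}
	return ia, nil
}

// drainBudget returns how long the worker may spend draining before the
// interruption time, minus a safety margin. It never returns a negative value.
func drainBudget(ia instanceAction, now time.Time, safety time.Duration) time.Duration {
	budget := ia.Time.Sub(now) - safety
	if budget < 0 {
		return 0
	}
	return budget
}

func main() {
	// Hypothetical payload in the documented shape; on a real instance this
	// would be the body of the metadata response.
	sample := []byte(`{"action":"terminate","time":"2026-01-01T00:02:00Z"}`)

	ia, err := parseInstanceAction(sample)
	if err != nil {
		fmt.Println("no usable interruption notice:", err)
		return
	}

	// Pretend the notice was observed exactly two minutes before the deadline.
	now := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(ia.Action, drainBudget(ia, now, 15*time.Second))
}
```

A budget computed this way can bound the state-saving step with a `context.WithTimeout`, so a slow Redis write does not run past the point where the instance disappears.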
Outcome: Moving these workloads to Spot and Graviton cut costs for our CI/CD pipelines and data processing workers by 60%.

4. "FinOps as Code"

Engineering excellence is no longer just about uptime; it's about financial visibility. We integrated cost-estimation tools (like Infracost) directly into our Terraform pipelines.

Every Pull Request now displays an estimated monthly cost change. If a developer tries to provision a massively oversized RDS instance, the system flags it during the code review phase, not when the bill arrives.

5. Conclusion

A great architect builds systems that are as lean as they are powerful. By treating cost as a first-class metric, right alongside latency and availability, we build more sustainable technology.

How is your team handling cloud costs this year? Let’s discuss below!