Last month I analyzed 249 EC2 instances across 3 AWS accounts for a mid-market fintech company. The result? $256,000/year in potential savings. That's $21,000 every month going to waste.
The scary part: this isn't unusual. According to Flexera's 2024 State of the Cloud Report, 32% of cloud spend is wasted. Most companies have no idea they're overpaying.
Here are the 5 most common culprits I find in almost every AWS account—and how to fix them.
1. Oversized EC2 Instances
The problem: You launched a t3.xlarge because you weren't sure what you'd need. Now it's been running at 5% CPU for 6 months.
How common: In my analysis, 70-80% of instances were oversized by at least one size class.
How to detect:
# Check average CPU over last 14 days
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-xxxxx \
--start-time $(date -d '14 days ago' +%Y-%m-%dT%H:%M:%S) \
--end-time $(date +%Y-%m-%dT%H:%M:%S) \
--period 86400 \
--statistics Average
The fix: If average CPU is below 20% for 2+ weeks, downsize. A t3.medium at $30/month vs t3.xlarge at $120/month adds up fast.
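If you're resizing by hand, the sequence looks roughly like this (a sketch; i-xxxxx and t3.medium are placeholders, and the stop/start means a couple of minutes of downtime for that instance):
# Resize requires a stop/start; pick the target size from your own CPU data
aws ec2 stop-instances --instance-ids i-xxxxx
aws ec2 wait instance-stopped --instance-ids i-xxxxx
aws ec2 modify-instance-attribute \
  --instance-id i-xxxxx \
  --attribute instanceType \
  --value t3.medium
aws ec2 start-instances --instance-ids i-xxxxx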
Potential savings: 50-75% per instance
2. Unattached EBS Volumes
The problem: You terminated an instance but forgot to delete its volumes. Now you're paying for storage nobody uses.
How common: I typically find 5-15 orphan volumes per account.
How to detect:
aws ec2 describe-volumes \
--filters Name=status,Values=available \
--query 'Volumes[*].[VolumeId,Size,CreateTime]' \
--output table
If a volume shows status: available, it's not attached to anything.
The fix: Delete them. But first, create a snapshot if you're paranoid (snapshots are ~$0.05/GB/month vs $0.10/GB/month for volumes).
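If you want that safety net, the two commands look like this (a sketch; vol-xxxxx is a placeholder):
# Optional: snapshot first, then delete the orphan volume
aws ec2 create-snapshot \
  --volume-id vol-xxxxx \
  --description "Backup before deleting orphan volume"
aws ec2 delete-volume --volume-id vol-xxxxx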
Potential savings: $0.10/GB/month per volume
3. Old EBS Snapshots
The problem: Automated backups create snapshots daily. Nobody deletes them. A year later you have 365 snapshots of the same volume.
How to detect:
aws ec2 describe-snapshots \
--owner-ids self \
--query 'Snapshots[?StartTime<=`2024-01-01`].[SnapshotId,VolumeSize,StartTime]' \
--output table
The fix:
- Keep the last 7-30 days (depending on your compliance needs)
- Delete everything older
- Set up a lifecycle policy with AWS Data Lifecycle Manager
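For a one-off cleanup before you wire up a lifecycle policy, you can reuse the detection query above in a loop (a sketch; adjust the cutoff date to your retention window, and note that snapshots still backing a registered AMI will simply fail to delete):
# Delete every snapshot older than the cutoff date
for snap in $(aws ec2 describe-snapshots \
  --owner-ids self \
  --query 'Snapshots[?StartTime<=`2024-01-01`].SnapshotId' \
  --output text); do
  echo "Deleting $snap"
  aws ec2 delete-snapshot --snapshot-id "$snap"
done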
Potential savings: Can be thousands/month for active accounts
4. EBS gp2 Volumes (Instead of gp3)
The problem: gp3 launched in 2020 with 20% lower base cost AND better performance. Yet most accounts still have gp2 volumes because "it works, why change it?"
How common: In my analysis, 60%+ of volumes were still gp2.
How to detect:
aws ec2 describe-volumes \
--query 'Volumes[?VolumeType==`gp2`].[VolumeId,Size,VolumeType]' \
--output table
The fix: Migrate to gp3. It's a live operation, no downtime required:
aws ec2 modify-volume \
--volume-id vol-xxxxx \
--volume-type gp3
Potential savings: 20% per volume, zero effort
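If you have dozens of gp2 volumes, a loop over the detection query handles them all (a sketch; volumes larger than about 1 TB can have a gp2 baseline above gp3's default 3,000 IOPS, so check IOPS needs before converting those):
# Convert every gp2 volume in the current region to gp3
for vol in $(aws ec2 describe-volumes \
  --query 'Volumes[?VolumeType==`gp2`].VolumeId' \
  --output text); do
  echo "Converting $vol to gp3"
  aws ec2 modify-volume --volume-id "$vol" --volume-type gp3
done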
5. Idle RDS Instances
The problem: A dev database someone created for testing 8 months ago. A staging DB that nobody uses anymore. They're running 24/7.
How common: 1-3 per account, often in non-prod environments.
How to detect: Check connections over the last 14 days:
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name DatabaseConnections \
--dimensions Name=DBInstanceIdentifier,Value=my-db \
--start-time $(date -d '14 days ago' +%Y-%m-%dT%H:%M:%S) \
--end-time $(date +%Y-%m-%dT%H:%M:%S) \
--period 86400 \
--statistics Maximum
If max connections = 0 for 2 weeks, it's idle.
The fix:
- For dev/test: Stop it (you can start it when needed)
- For truly unused: Take a final snapshot and delete it
- Consider Aurora Serverless for dev workloads (scales to zero)
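For the first two options, the commands are straightforward (a sketch; my-db and the snapshot name are placeholders, and keep in mind AWS automatically restarts a stopped RDS instance after 7 days):
# Stop a dev/test database (AWS auto-starts it again after 7 days)
aws rds stop-db-instance --db-instance-identifier my-db
# Or retire it for good: take a final snapshot, then delete
aws rds delete-db-instance \
  --db-instance-identifier my-db \
  --final-db-snapshot-identifier my-db-final-snapshot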
Potential savings: $50-500/month per idle instance
Real-World Example: Mid-Market Fintech
Here's a real analysis I ran recently for a fintech company with 3 AWS accounts:
| Metric | Value |
|---|---|
| EC2 instances analyzed | 249 |
| Recommendations generated | 72 |
| Monthly savings potential | $21,340 |
| Annual savings potential | $256,076 |
The breakdown:
- Oversized instances: 60%+ were running at <20% average CPU
- gp2 → gp3 migrations: Immediate 20% savings, zero downtime required
- Orphan resources: Multiple unattached volumes and outdated snapshots
This aligns with industry benchmarks—Flexera reports 32% of cloud spend is typically wasted, and Gartner finds 2-3x overprovisioning is common.
The Real Problem
Running these checks manually is tedious. And even if you do it once, waste accumulates again within weeks.
That's why I built CloudPruneAI—it scans your AWS accounts automatically and generates Infrastructure as Code (CDK) to implement the fixes. Instead of a report you'll forget about, you get deployable code.
But even if you never use a tool, run these 5 checks today. You might be surprised what you find.
What's the biggest AWS cost surprise you've discovered? I'd love to hear your stories in the comments.
Top comments (4)
I love how you not only presented the problems but showed examples of how to fix them. We're not running on AWS, but like you mention, this is a common problem across all providers.
I've worked at other places before, and a personal one for me was when I left an m5.6xlarge running over a long weekend...
Hi @travis, thanks for your feedback! Happy to hear you enjoyed the post. You mentioned you're not running on AWS, so where are you running? Do you think this kind of tool could be useful cross-platform? I mean Azure, GCP, Oracle. And maybe add Terraform as output code?
Hey! We're on GCP right now, running Cloud Run (cause lazy), but if things start taking off for us we'll look to move into GKE and start breaking the app into smaller deployable modules. In regards to cross-platform, you better believe people are wasting money, and with many people tightening the spending belt they're going to be looking for ways to save, and infra costs are an easy place to cut. Do y'all monitor logging costs too? I know at a former company we once got up to $40k a month in logs. Another thought though: what edge do you have over just going to the billing dashboard?
Hey Travis! Great questions.
About logging costs: Yes! CloudWatch Logs is actually one of our main targets. We detect log groups without retention policies (the silent killer: logs accumulating forever). Your $40k/month story is exactly what we help prevent.
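If you want a quick way to spot those yourself, listing log groups with no retention set is a decent start (illustrative):
# Log groups with no retentionInDays keep data forever
aws logs describe-log-groups \
  --query 'logGroups[?retentionInDays==`null`].[logGroupName,storedBytes]' \
  --output table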
On the edge over the billing dashboard: the billing dashboard tells you what you're spending, but not why or how to fix it. CloudPruneAI goes deeper:
- We analyze resource utilization (CPU, memory, connections), not just cost
- We identify the root cause (oversized instance vs zombie vs missing lifecycle policy)
- Most importantly: we generate deployable CDK code to fix it
So instead of "you spent $5K on EC2" → you get "instance i-abc123 is using 8% CPU, here's the code to downsize it to t3.medium and save $840/year"
About multi-cloud/Terraform, it's definitely on our radar. AWS + CDK first to nail the experience, then expand. Would GCP + Terraform be interesting for your team when you scale to GKE?