Mumtaz Jahan

Why Your AWS Bill Doubled Overnight (And How to Plug the Leaks)

We've all been there.

You open the AWS Billing Dashboard, expecting the usual $50–$100, only to see a vertical spike that looks like a mountain range. The immediate reaction is:

"We must have massive traffic!"

But let's be real: traffic rarely doubles overnight. A misconfiguration, however, can double your bill that fast.

If you're staring down a bill that's spiraling out of control, here is your emergency checklist to find the invisible drains on your budget.


1. The NAT Gateway "Processing" Trap

NAT Gateways are the silent killers of AWS budgets. You aren't just paying an hourly charge for uptime; you're also paying a data processing charge for every gigabyte that passes through.

The Leak:
Sending high-bandwidth traffic to AWS services (like S3 uploads) through a NAT Gateway instead of using a VPC Endpoint.

The Fix:
Use VPC Endpoints for S3 and DynamoDB to keep that traffic off the expensive NAT "highway." Gateway endpoints for these two services carry no additional charge.

# List your NAT Gateways and their current state
aws ec2 describe-nat-gateways --query 'NatGateways[*].{ID:NatGatewayId,State:State}'

A single misconfigured service pushing gigabytes through NAT can silently add hundreds of dollars to your bill.
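
To put a rough number on that and to apply the fix, here is a minimal sketch. The function names, VPC ID, and route table ID are hypothetical placeholders, and the $0.045 per GB rate in the usage note is an assumption based on us-east-1 list pricing; check the current pricing page for your region.

```shell
# Estimate the NAT Gateway data processing charge.
# Usage: nat_processing_cost <GB processed> <rate per GB>
nat_processing_cost() {
  awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# Create a gateway VPC endpoint for S3 so S3 traffic bypasses the NAT Gateway.
# Usage: create_s3_gateway_endpoint <vpc-id> <route-table-id> <region>
create_s3_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --vpc-id "$1" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$3.s3" \
    --route-table-ids "$2"
}
```

For example, `nat_processing_cost 5000 0.045` prints `225.00`: about 5 TB of S3 traffic through NAT would cost roughly $225 at that assumed rate, before the hourly charge.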


2. Cross-AZ Data Transfer — The Invisible Tax

High availability is great, but cross-Availability Zone (AZ) traffic is billed per gigabyte in each direction, and that charge is easy to overlook.

The Leak:
Your app server in us-east-1a is constantly chatting with a database in us-east-1b.

The Fix:
Keep your "chatty" services within the same AZ where possible, or use Service Discovery to prioritize local traffic.

# Check which AZ your instances are running in
aws ec2 describe-instances --query 'Reservations[*].Instances[*].{ID:InstanceId,AZ:Placement.AvailabilityZone}'
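
To see at a glance how instances are spread across zones, the AZ list can be tallied. This is a sketch only; `count_per_az` is a hypothetical helper name, and the pipeline in the comment assumes configured AWS credentials.

```shell
# Count instances per Availability Zone.
# Reads whitespace-separated AZ names on stdin, prints "<az> <count>" per line.
count_per_az() {
  awk '{ for (i = 1; i <= NF; i++) n[$i]++ }
       END { for (az in n) print az, n[az] }' | sort
}

# Example pipeline (requires AWS credentials):
# aws ec2 describe-instances \
#   --query 'Reservations[].Instances[].Placement.AvailabilityZone' \
#   --output text | count_per_az
```

A lopsided count is not a problem by itself; it only tells you where to look for chatty services that talk across zones.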

3. Ghost EBS Volumes

When you terminate an EC2 instance, its Elastic Block Store (EBS) volumes don't always go away with it. Any volume whose DeleteOnTermination flag is false stays behind.

The Leak:
"Unattached" volumes sitting in your console, doing absolutely nothing except costing you monthly rent.

The Fix:
Go to EC2 Console → Volumes → Filter by State = Available

# Find all unattached EBS volumes via CLI
aws ec2 describe-volumes --filters Name=status,Values=available \
--query 'Volumes[*].{ID:VolumeId,Size:Size,State:State}'

If a volume is not in use, snapshot it first if you might need the data, then delete it and move on.
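
To estimate what those idle volumes cost, their sizes can be summed. This is a rough sketch; `unattached_volume_cost` is a hypothetical helper, and the $0.08 per GiB-month rate in the usage note is an assumption (roughly gp3 list pricing in us-east-1) that you should replace with your own.

```shell
# Sum the size of unattached volumes and estimate their monthly cost.
# Reads "<volume-id> <size-in-GiB>" lines on stdin.
# Usage: unattached_volume_cost <rate per GiB-month>
unattached_volume_cost() {
  awk -v rate="$1" '{ gib += $2 }
    END { printf "%d GiB unattached, about $%.2f/month\n", gib, gib * rate }'
}

# Example pipeline (requires AWS credentials):
# aws ec2 describe-volumes --filters Name=status,Values=available \
#   --query 'Volumes[].[VolumeId,Size]' --output text \
#   | unattached_volume_cost 0.08
```

The estimate ignores volume type and provisioned IOPS, so treat it as a lower bound for io1/io2 volumes.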


4. Broken Auto Scaling

Auto Scaling is designed to save you money, but only if it scales down as reliably as it scales up.

The Leak:
Your "Scale Up" policy works perfectly during peak hours, but your "Scale Down" policy is either missing or blocked by a single stuck process.

The Fix:
Audit your CloudWatch alarms. Ensure your cooldown periods aren't too long and that your scale-down policies are actually firing.

# List recent scaling activities for one Auto Scaling group
aws autoscaling describe-scaling-activities --auto-scaling-group-name your-group-name
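
One quick way to spot a missing scale-down is to compare launches against terminations in the activity history. The sketch below assumes activity descriptions begin with "Launching" or "Terminating", which matches what the service typically reports but is an assumption here; `scaling_activity_tally` is a hypothetical helper name.

```shell
# Tally launches versus terminations.
# Reads one scaling activity description per line on stdin.
scaling_activity_tally() {
  awk '/^Launching/   { launched++ }
       /^Terminating/ { terminated++ }
       END { printf "launched=%d terminated=%d\n", launched, terminated }'
}

# Example pipeline (requires AWS credentials):
# aws autoscaling describe-scaling-activities \
#   --auto-scaling-group-name your-group-name \
#   --query 'Activities[].[Description]' --output text \
#   | scaling_activity_tally
```

Many launches with few or no terminations suggests the group grows during peaks and never shrinks back.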

5. The CloudWatch Ingestion Spike

Logs are vital — until they cost more than the app they're monitoring.

The Leak:
You left a service in Debug mode, and now you're paying for terabytes of CloudWatch log ingestion.

The Fix:
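Switch the service back to a normal log level to stop the ingestion charges, then set a retention policy so stored logs expire. Log groups keep data forever ("Never expire") by default.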

# Set a 30-day retention policy on a log group
aws logs put-retention-policy \
  --log-group-name /your/log/group \
  --retention-in-days 30

14 to 30 days is usually plenty for dev environments.
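
To apply a retention policy across every log group that lacks one, a dry-run helper can print the commands for review before anything changes. This is a sketch; `retention_commands` is a hypothetical name, and the JMESPath filter in the comment assumes that log groups without a policy simply omit `retentionInDays`.

```shell
# Print (do not run) a put-retention-policy command for each log group.
# Reads tab- or newline-separated log group names on stdin.
# Usage: retention_commands <days>
retention_commands() {
  days="$1"
  tr '\t' '\n' | while IFS= read -r group; do
    if [ -n "$group" ]; then
      printf 'aws logs put-retention-policy --log-group-name "%s" --retention-in-days %s\n' "$group" "$days"
    fi
  done
}

# Example pipeline (requires AWS credentials):
# aws logs describe-log-groups \
#   --query 'logGroups[?!not_null(retentionInDays)].logGroupName' \
#   --output text | retention_commands 30
```

Printing the commands first keeps production log groups from being shortened by accident; pipe the output to `sh` once the list looks right.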


6. S3 Without a Lifecycle Policy

Storage is cheap — but it's not free.

The Leak:
Storing every version of every file in Standard Storage for years with no cleanup plan.

The Fix:
Implement S3 Lifecycle Policies to move old data automatically.

{
  "Rules": [{
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Transitions": [{
      "Days": 30,
      "StorageClass": "STANDARD_IA"
    }, {
      "Days": 90,
      "StorageClass": "GLACIER"
    }]
  }]
}

Move to Infrequent Access after 30 days. Move to Glacier after 90. Your future self will thank you.
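
Applying the rules to a bucket could look like the sketch below. It assumes the JSON is saved locally (the file name `lifecycle.json` and the bucket name are placeholders) and that each rule includes the `Filter` or `Prefix` element the API expects; `apply_lifecycle` is a hypothetical wrapper.

```shell
# Apply a lifecycle configuration file to a bucket.
# Usage: apply_lifecycle <bucket-name> <path-to-json>
apply_lifecycle() {
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$1" \
    --lifecycle-configuration "file://$2"
}

# Example (requires AWS credentials):
# apply_lifecycle my-bucket lifecycle.json
```

Note that this call replaces the bucket's existing lifecycle configuration, so include any rules you want to keep in the same file.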


7. Idle Load Balancers

An ALB (Application Load Balancer) costs roughly $16–$20/month just to exist — even if nothing is using it.

The Leak:
Leftover load balancers from a project or staging environment you forgot to tear down.

The Fix:

# List your load balancers (then check each one for registered targets)
aws elbv2 describe-load-balancers \
--query 'LoadBalancers[*].{Name:LoadBalancerName,DNS:DNSName}'

If it has zero targets and zero requests — delete it immediately.
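
Checking for registered targets takes two more calls per load balancer. The sketch below counts them; `count_lb_targets` is a hypothetical helper, and it does not look at request metrics, so confirm traffic in CloudWatch before deleting anything.

```shell
# Count registered targets across all target groups of one load balancer.
# Usage: count_lb_targets <load-balancer-arn>
count_lb_targets() {
  lb_arn="$1"
  total=0
  for tg in $(aws elbv2 describe-target-groups --load-balancer-arn "$lb_arn" \
      --query 'TargetGroups[].TargetGroupArn' --output text); do
    n=$(aws elbv2 describe-target-health --target-group-arn "$tg" \
      --query 'length(TargetHealthDescriptions)' --output text)
    total=$((total + n))
  done
  echo "$total"
}
```

A result of `0` marks the load balancer as a candidate for cleanup.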


8. Snapshot Hoarding

Backups are important — but do you really need a snapshot of a test server from 2022?

The Leak:
Automated backups that never expire.

The Fix:
Use AWS Backup to centralize management and set hard expiration dates on snapshots.

# List all your snapshots with their size and creation time
aws ec2 describe-snapshots --owner-ids self \
--query 'Snapshots[*].{ID:SnapshotId,Size:VolumeSize,Date:StartTime}'
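
To narrow that list down to old snapshots, the timestamps can be compared as text, since ISO 8601 dates sort lexicographically. This is a sketch; `snapshots_older_than` is a hypothetical helper and the cutoff date is only an example.

```shell
# Print the IDs of snapshots created before a cutoff date.
# Reads "<snapshot-id> <size> <start-time>" lines on stdin.
# Usage: snapshots_older_than <YYYY-MM-DD>
snapshots_older_than() {
  # Appending "" forces a string comparison in awk.
  awk -v cutoff="$1" '($3 "") < (cutoff "") { print $1 }'
}

# Example pipeline (requires AWS credentials):
# aws ec2 describe-snapshots --owner-ids self \
#   --query 'Snapshots[].[SnapshotId,VolumeSize,StartTime]' --output text \
#   | snapshots_older_than 2023-01-01
```

Review the IDs before deleting; snapshots that back a registered AMI cannot be removed until the AMI is deregistered.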

Quick Emergency Checklist

| # | Leak | Quick Fix |
|---|------|-----------|
| 1 | NAT Gateway traffic | Use VPC Endpoints |
| 2 | Cross-AZ traffic | Keep services in same AZ |
| 3 | Ghost EBS volumes | Filter Available → Delete |
| 4 | Broken Auto Scaling | Audit CloudWatch alarms |
| 5 | CloudWatch debug logs | Set 14-30 day retention |
| 6 | S3 no lifecycle | Add Lifecycle Policy |
| 7 | Idle Load Balancers | Zero targets → Delete |
| 8 | Old snapshots | Set expiration in AWS Backup |

The Bottom Line

AWS uses a "pay-for-what-you-use" model, but if you aren't careful, you also end up "paying-for-what-you-forgot-to-turn-off."

Run through this checklist every month. Set a calendar reminder. Your AWS bill will thank you.


What's the biggest hidden cost you've ever found in your AWS bill? Drop it in the comments.