Over the last two quarters, I focused heavily on optimizing my AWS infrastructure by applying cloud cost optimization best practices. Through systematic analysis of our spend and billing reports, I was able to reduce our AWS infrastructure cost by nearly 40% without impacting system reliability or performance.
In this post, I am going to share the approach I followed and the optimizations that delivered the biggest savings.
The Challenge
Before October 2025, our monthly AWS bill was around $60K, and it had spiked by nearly 15% in a single month. While the architecture was stable and scalable, there was significant room for cost optimization.
The goal was clear: reduce infrastructure costs without compromising performance, reliability or scalability.
Step 1: Gaining Cost Visibility
The very first step was understanding where the money was actually going.
With the help of the AWS Billing console and Cost Explorer, I identified the major culprits behind the high cost: S3 and CloudFront.
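The same per-service breakdown can also be pulled programmatically. Below is a minimal sketch using boto3's Cost Explorer client; the date range and the top-N helper are purely illustrative, and it assumes AWS credentials are already configured in your environment:

```python
def top_services(response, n=5):
    """Aggregate a Cost Explorer response into (service, cost) pairs,
    highest spend first."""
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + cost
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

def fetch_monthly_costs(start, end):
    """Fetch monthly unblended cost grouped by service, e.g.
    fetch_monthly_costs('2025-09-01', '2025-10-01')."""
    import boto3  # imported lazily; requires AWS credentials to actually call
    ce = boto3.client("ce")
    return ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
```

The `top_services` helper is a pure function, so you can sanity-check it against a saved response before pointing it at a live account.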
Step 2: Determining the Underlying Issue
After carefully observing the S3 spend, I found a spike in S3 storage as well as a steadily rising curve in S3 GET requests.
With these details in hand, I did a deeper analysis of the application and found a bug that was generating duplicate images. That was the reason the S3 cost had spiked.
In figures, S3 storage went from 100TB to 600TB, and because the buckets were replicated across multiple regions, the data-transfer cost was at its peak as well.
This shows that a small bug in the application can cost a huge amount!
Step 3: Bringing the Cost Back Down
Since it was an application bug, the first thing was to fix it, which the developers did. The next big challenge was deleting the duplicate images without increasing cost.
The image count was in the quadrillions because of the duplication bug and versioning. Removing that many images has its own cost: a naive delete would first have to list every object, and AWS charges for listing objects, so simply deleting everything blindly was not an option. What else could I do? 🤔
The good thing was that the database already contained the details of which keys were required, i.e. which image was the original and which were duplicates. I wrote one PostgreSQL query that produced a CSV file containing the keys of all the duplicate images.
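I don't have the actual schema, but a sketch of that export might look like this, assuming a hypothetical `images` table with `content_hash`, `s3_key`, and `created_at` columns: the query keeps the oldest object per hash and streams every other key into a CSV. The pure-Python `duplicate_keys` helper mirrors the same logic for reasoning about it offline.

```python
# Window-function query: rank each image within its content-hash group by age;
# everything ranked after the first is a duplicate. Table and column names
# are assumptions for illustration.
DUP_QUERY = """
COPY (
    SELECT s3_key FROM (
        SELECT s3_key,
               row_number() OVER (
                   PARTITION BY content_hash ORDER BY created_at
               ) AS rn
        FROM images
    ) ranked
    WHERE rn > 1
) TO STDOUT WITH CSV
"""

def export_duplicates(dsn, out_path):
    """Stream the duplicate keys straight into a CSV file."""
    import psycopg2  # imported lazily; requires a reachable PostgreSQL server
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur, open(out_path, "w") as out:
            cur.copy_expert(DUP_QUERY, out)

def duplicate_keys(rows):
    """Pure-Python equivalent of DUP_QUERY for sanity checks.
    `rows` is an iterable of (content_hash, s3_key, created_at) tuples."""
    kept, dups = {}, []
    for h, key, created in sorted(rows, key=lambda r: (r[0], r[2])):
        if h in kept:
            dups.append(key)
        else:
            kept[h] = key
    return dups
```

Streaming with `COPY ... TO STDOUT` avoids materializing quadrillions of rows in application memory.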
Then I wrote a Python script to delete every image listed in the CSV file. The images numbered in the trillions, so I wrote the script to delete at least 50,000 images per second. It worked: the script marked trillions of images as deleted in just a few days, and S3 lifecycle policies then removed them for good.
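My script isn't public, but the core of such a cleanup can be sketched as follows. The bucket name and the one-column CSV layout are assumptions; the key point is that S3's `DeleteObjects` API accepts up to 1,000 keys per request, so batching (and running many of these workers in parallel) is what makes tens of thousands of deletes per second achievable:

```python
import csv

BATCH_SIZE = 1000  # S3 DeleteObjects accepts at most 1,000 keys per request

def load_keys(csv_path):
    """Yield object keys from a one-column CSV of duplicate image keys."""
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if row:
                yield row[0]

def batched(keys, size=BATCH_SIZE):
    """Group an iterable of keys into lists of at most `size` items."""
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def delete_batches(bucket, csv_path):
    """Delete all keys in the CSV from `bucket`, 1,000 per API call."""
    import boto3  # imported lazily so the pure helpers run without AWS deps
    s3 = boto3.client("s3")
    deleted = 0
    for batch in batched(load_keys(csv_path)):
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
        deleted += len(batch)
    return deleted
```

In a versioned bucket these calls create delete markers rather than freeing storage immediately; a lifecycle rule that expires noncurrent versions is what actually reclaims the space, which is why the lifecycle policies mattered.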
There were still some grey areas in the application, which we kept working on to fix and continuously save cost. After the hard work in November and the clean-up process in December, we finally saw a good result in January, when the cost came in below $45K for the month.
The Result
- AWS Cost reduced by ~40%
- No degradation in performance or reliability
- Fine-tuned the application and infrastructure without increasing cost
Key Lessons Learned
- Monthly cost analysis helps you keep track of spend
- Add maximum visibility into your costs; it is a magic wand for reducing them
- Small steps compound over time


