Anushka B

Originally published at aicloudstrategist.com

AWS cost audit for Indian SaaS: a 24-hour methodology

Every Indian mid-market company with a monthly AWS bill above ₹5 lakh has the same conversation with itself once a quarter. Engineering says the bill is "probably fine". Finance says it feels high but can't prove it. A founder asks "what would a real audit find?" — and usually, nobody answers, because nobody has done one.

We have. Across a dozen pattern studies in ap-south-1 over the last year, the same seven leaks show up in almost every account. Not all seven every time — but rarely fewer than four. On a ₹5 lakh/month bill, the total recoverable spend across these seven patterns sits between 18% and 34%. That's ₹90,000 to ₹1.7 lakh a month, compounding.

This post walks through each of the seven, with the numbers we see in real audits and, where a single command will do it, the detection command you can run yourself before you call anyone.

Leak 1 — NAT Gateway processing bytes (the silent compounder)

AWS NAT Gateway is one of the most expensive per-GB services in the platform. Beyond the ₹4,000/month per-gateway fee, you pay ₹3.71 per GB processed. For a Kubernetes cluster routing all egress through a single NAT Gateway, 200 GB/day adds up to ₹22,000/month just in processing.

What we find: multi-AZ production clusters with three NAT Gateways pushing 500+ GB/day combined, because nobody noticed that a Prometheus remote-write was pointing at an internet endpoint instead of a VPC endpoint. Typical recovery on a ₹5L bill: ₹15,000–₹28,000/month.

Detection:

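# Run once per NAT Gateway: without the NatGatewayId dimension CloudWatch returns no datapoints
aws cloudwatch get-metric-statistics \
  --namespace AWS/NATGateway --metric-name BytesOutToDestination \
  --dimensions Name=NatGatewayId,Value=<your-nat-gateway-id> \
  --start-time $(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time   $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 86400 --statistics Sum --region ap-south-1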
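To turn the daily byte totals from the command above into rupees, a small helper can be piped onto the CLI output. This is a minimal sketch, not part of any AWS tooling: `nat_cost_projection` is a hypothetical name, the ₹3.71/GB default is the processing rate quoted earlier, and treating 1 GB as 10^9 bytes and a month as 30 days are simplifying assumptions.

```shell
# Hypothetical helper: reads daily byte totals (whitespace-separated) on stdin
# and prints the average GB/day plus a projected monthly processing charge.
# Assumptions: 1 GB = 10^9 bytes, 30-day month, rate defaults to INR 3.71/GB.
nat_cost_projection() {
  awk -v rate="${1:-3.71}" '
    { for (i = 1; i <= NF; i++) { total += $i; days++ } }
    END {
      if (days == 0) { print "no datapoints"; exit 1 }
      gb_per_day = total / days / 1e9
      printf "%.1f GB/day -> INR %.0f/month\n", gb_per_day, gb_per_day * 30 * rate
    }'
}

# Example with two made-up days of 200 GB each:
printf '200000000000\t200000000000\n' | nat_cost_projection
# -> 200.0 GB/day -> INR 22260/month
```

Against a live account, appending `--query 'Datapoints[].Sum' --output text` to the detection command and piping the result into the function should give the same kind of summary per gateway.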

Fix: add an S3 Gateway Endpoint (free) and an ECR Interface Endpoint, and route CloudWatch Logs, SSM, and Secrets Manager through VPC endpoints. Most NAT Gateway traffic in Indian mid-market clusters is AWS-to-AWS that should never have left the VPC.

Leak 2 — Orphaned EBS volumes (the forgotten ones)

Detailed in our earlier post on orphaned EBS volumes, this is the most consistently underestimated leak we see. EBS volumes in available state — detached but still billing at full provisioned size — accumulate every time a staging environment is torn down without DeleteOnTermination set.

Pattern-study median: 38 orphaned volumes, 4.2 TB, ₹4.2 lakh/year on a ₹18L/month account. Scaled to ₹5L/month: ₹8,000–₹15,000/month.

aws ec2 describe-volumes --region ap-south-1 \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,VolumeType,CreateTime]' \
  --output table
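The sizes returned by the command above can be totalled into a monthly figure. A minimal sketch follows: `orphaned_ebs_cost` is a hypothetical helper, and the per-GB-month rate is a required argument because this post does not quote one. Look it up for your volume types; applying a single rate to every volume is a simplification.

```shell
# Hypothetical helper: reads volume sizes in GiB on stdin and prints the count,
# the total size, and the monthly charge at the rate given as the first argument.
# The rate is NOT taken from the article; supply the one for your volume type.
orphaned_ebs_cost() {
  rate="${1:-}"
  if [ -z "$rate" ]; then
    echo "usage: orphaned_ebs_cost RATE_INR_PER_GB_MONTH" >&2
    return 2
  fi
  awk -v rate="$rate" '
    { for (i = 1; i <= NF; i++) { gib += $i; n++ } }
    END { printf "%d volumes, %d GiB, INR %.0f/month\n", n, gib, gib * rate }'
}

# Example with made-up sizes and a placeholder rate of INR 8 per GB-month:
printf '100\t500\t50\n' | orphaned_ebs_cost 8
# -> 3 volumes, 650 GiB, INR 5200/month
```

To feed it real data, the describe-volumes call can be run with `--query 'Volumes[].Size' --output text` and piped into the function.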

Leak 3 — Idle RDS instances (the ghost databases)

Every mid-market SaaS account we've audited has at least one RDS instance whose DatabaseConnections metric has been zero for 30+ days but which is still running, still billing, still taking automated backups. Most commonly: a staging copy someone forgot, or a db.m5.large that was replaced by an Aurora cluster but never terminated.

Typical finding: 2–4 idle RDS instances, combined monthly cost ₹12,000–₹24,000. On a ₹5L bill, that's 2.4–4.8% recovered immediately by shutting down what nobody uses.

# Daily peak connections for one RDS instance over the last 30 days; all zeros means idle
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS --metric-name DatabaseConnections \
  --dimensions Name=DBInstanceIdentifier,Value=<your-instance> \
  --start-time $(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time   $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 86400 --statistics Maximum --region ap-south-1

Fix: a one-line governance rule — any RDS instance with zero connections for 14 consecutive days gets tagged candidate-for-termination. A human decides whether to kill or snapshot-and-kill. Nothing deletes automatically.
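The 14-day rule can be checked mechanically from the daily maxima the detection command returns. The sketch below is one possible way to do it: `rds_is_idle` is a hypothetical helper, and it deliberately treats "fewer datapoints than the threshold" as not idle, since a stopped instance reports no data at all.

```shell
# Hypothetical helper: reads daily DatabaseConnections maxima on stdin.
# Succeeds (exit 0) only if there are at least MIN_DAYS datapoints and every
# one of them is zero. MIN_DAYS defaults to the 14 days in the rule above.
rds_is_idle() {
  awk -v min_days="${1:-14}" '
    { for (i = 1; i <= NF; i++) { days++; if ($i + 0 > 0) busy++ } }
    END { exit !(days >= min_days && busy == 0) }'
}

# Example with made-up datapoints: 14 quiet days, then the same plus one busy day.
quiet="0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
echo "$quiet" | rds_is_idle && echo "candidate-for-termination"
echo "$quiet 3.0" | rds_is_idle || echo "still in use"
```

The function only answers yes or no. Applying the tag, and the decision to kill or snapshot-and-kill, stays with a human as described above.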

Leak 4 — Inter-region data egress (the architecture tax)

Our earlier post on cross-region egress covers this in depth. Short version: any data leaving ap-south-1 for us-east-1 (analytics pipelines, cold storage, ML training) costs ₹7 per GB. A 2 TB/month cross-region replication from production to an analytics data lake is ₹14,000/month. We've seen accounts where this hits ₹55,000/month because a poorly designed Kinesis Firehose is flushing micro-batches cross-region.

On a ₹5L/month bill, cross-region egress typically sits at ₹6,000–₹18,000/month — and 80% of it is architecturally unnecessary.

Fix: compress before egress (Parquet plus gzip typically shrinks payloads 4–7x), batch more aggressively, and if the destination is archival, use S3 Cross-Region Replication rather than application-layer pipelines. CRR still pays the inter-region transfer rate, but it removes the pipeline's own compute and AWS handles the retry logic.
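The effect of compression on the transfer charge is simple arithmetic, sketched below. `egress_cost` is a hypothetical helper. The ₹7/GB default is the rate quoted above, and the compression ratio is whatever you measure on your own payloads.

```shell
# Hypothetical helper: monthly inter-region transfer charge in INR.
# Arguments: GB per month, optional compression ratio (default 1 = uncompressed),
# optional rate (default INR 7/GB, the figure quoted in this section).
egress_cost() {
  awk -v gb="$1" -v ratio="${2:-1}" -v rate="${3:-7}" \
    'BEGIN { printf "INR %.0f/month\n", gb / ratio * rate }'
}

egress_cost 2000      # 2 TB uncompressed   -> INR 14000/month
egress_cost 2000 4    # same data at 4x     -> INR 3500/month
egress_cost 2000 7    # same data at 7x     -> INR 2000/month
```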

Leak 5 — Missing Savings Plans / Reserved Instance coverage

The single largest recoverable amount in almost every Indian mid-market audit. Against on-demand list pricing, 1-year no-upfront Compute Savings Plans on EC2 and Fargate come out 28–34% cheaper; RDS is not covered by Compute Savings Plans and is usually handled with Reserved Instances. On stable workloads, 3-year all-upfront commitments hit 52–60% savings.

What we find: coverage sitting at 18–32% when it should be 70–85% of baseline compute. That alone represents ₹50,000–₹1,10,000/month on a ₹5L bill.

The fix isn't just "buy more Savings Plans" — it's a governance pattern. We cover that in detail in our RI coverage governance writeup. The short form: treat SP purchases like quarterly Treasury decisions, not ad-hoc procurement events, and separate "baseline coverage" from "growth elasticity".
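A rough way to size the coverage gap is to multiply the eligible spend by the coverage increase and the discount. The sketch below assumes exactly that simplified model: `coverage_gap_saving` is a hypothetical helper, and the inputs in the example are made up for illustration rather than taken from any audit.

```shell
# Hypothetical helper: monthly saving from raising commitment coverage.
# Arguments: on-demand-equivalent eligible spend (INR/month), current coverage %,
# target coverage %, discount as a fraction (for example 0.30 for 30%).
# Simplified model: saving = spend * (target - current) / 100 * discount.
coverage_gap_saving() {
  awk -v spend="$1" -v cur="$2" -v target="$3" -v disc="$4" \
    'BEGIN { printf "INR %.0f/month\n", spend * (target - cur) / 100 * disc }'
}

# Made-up inputs: INR 3,00,000 eligible spend, 22% -> 78% coverage, 30% discount.
coverage_gap_saving 300000 22 78 0.30
# -> INR 50400/month
```

The model ignores utilisation risk: a commitment sized above the true baseline turns part of this saving into waste, which is why the baseline and growth portions are worth separating.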

Leak 6 — Oversized EC2 / RDS instances (the sticky defaults)

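The m5.2xlarge someone chose 18 months ago because "it's what we had in dev" now sits at 11% CPU and 28% memory utilisation in production. Rightsizing from m5.2xlarge to m5.large, a 4x reduction, saves about ₹18,000/month per instance. Check memory before taking the full step: 28% of the m5.2xlarge's 32 GiB is roughly 9 GiB, which does not fit in an m5.large's 8 GiB, so an instance at that level would stop at m5.xlarge. Mid-market clusters have 4–9 candidates for this move.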

AWS Compute Optimizer gives you free, account-wide rightsizing recommendations, but it is opt-in: it only starts analysing once the account or organisation is enrolled, and it is a separate service from Trusted Advisor. Most teams have never looked at it. When we pull the report, typical findings on a ₹5L bill: ₹35,000–₹75,000/month across EC2, RDS, and ASG members.

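# Filter server-side; the unquoted JMESPath comparison finding==Overprovisioned matches nothing
aws compute-optimizer get-ec2-instance-recommendations \
  --region ap-south-1 \
  --filters name=Finding,values=Overprovisioned \
  --query 'instanceRecommendations[*].[instanceArn,currentInstanceType,finding]' \
  --output table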

Leak 7 — Stale snapshots (the archive that never ends)

The least glamorous leak. EBS snapshots older than their retention policy, AMIs nobody has deployed in 2 years, and RDS manual snapshots from debugging sessions that finished 8 months ago. EBS snapshot storage costs ₹4.85/GB-month in ap-south-1. A terabyte of stale snapshots is roughly ₹60,000/year.

Typical finding: 1.5–4 TB of snapshots older than 180 days that have no associated retention policy. Recovery on ₹5L bill: ₹4,000–₹12,000/month.

Fix: Data Lifecycle Manager (DLM) policies for automated snapshot retention, and a one-time cleanup pass for anything created before DLM existed.
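For the one-time cleanup pass, the candidates can be listed by filtering snapshot rows on their start date. This is a minimal sketch under stated assumptions: `stale_snapshots` is a hypothetical helper, the input is three whitespace-separated columns (snapshot ID, ISO 8601 start time, source volume size in GiB), and the size total is only an upper bound because snapshots are stored incrementally.

```shell
# Hypothetical helper: reads "SnapshotId StartTime VolumeSize" rows on stdin and
# prints the rows started before CUTOFF (an ISO date such as 2025-01-01),
# followed by a summary line. ISO 8601 timestamps sort lexicographically, so a
# plain string comparison is enough. VolumeSize is the source volume size, which
# overstates billed snapshot storage because snapshots are incremental.
stale_snapshots() {
  awk -v cutoff="$1" '
    ($2 "") < (cutoff "") { print $1, $2, $3; gib += $3; n++ }
    END { printf "stale snapshots: %d, up to %d GiB\n", n, gib }'
}

# Example with made-up rows and a fixed cutoff:
printf 'snap-aaa\t2024-03-01T10:00:00.000Z\t500\nsnap-bbb\t2025-06-01T10:00:00.000Z\t200\n' \
  | stale_snapshots 2025-01-01
# -> snap-aaa 2024-03-01T10:00:00.000Z 500
# -> stale snapshots: 1, up to 500 GiB
```

Real rows can come from `aws ec2 describe-snapshots --owner-ids self --query 'Snapshots[].[SnapshotId,StartTime,VolumeSize]' --output text`, with the cutoff computed as `$(date -u -d '180 days ago' +%Y-%m-%d)` on GNU date. The output is a review list, not a delete list: snapshots backing registered AMIs need the AMI deregistered first.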

The consolidated ₹ impact table

| Leak | Monthly recovery (₹5L bill) | % of bill |
|---|---|---|
| 1. NAT Gateway processing | ₹15,000–₹28,000 | 3–5.6% |
| 2. Orphaned EBS | ₹8,000–₹15,000 | 1.6–3% |
| 3. Idle RDS | ₹12,000–₹24,000 | 2.4–4.8% |
| 4. Inter-region egress | ₹6,000–₹18,000 | 1.2–3.6% |
| 5. Missing Savings Plans | ₹50,000–₹1,10,000 | 10–22% |
| 6. Oversized compute | ₹35,000–₹75,000 | 7–15% |
| 7. Stale snapshots | ₹4,000–₹12,000 | 0.8–2.4% |
| Total recoverable | ₹1,30,000–₹2,82,000 | 26–56% |

The totals above assume all seven leaks are present in one account. Real audits usually find four to six of them, and rarely every one at its maximum simultaneously, which is why we quote "18–34% recoverable" as the blended real-world figure.

A real pattern study (names anonymised)

A ₹6.4 lakh/month AWS bill for a Series B Indian SaaS with a 55-person engineering team. Primary region ap-south-1, secondary us-east-1 for analytics and ML training. When we ran the diagnostic in March 2026, the leak breakdown came back as follows:

  • NAT Gateway processing: ₹26,400/month — three NAT Gateways, median 420 GB/day, with CloudWatch Logs traffic routed internet-bound instead of through VPC endpoints.
  • Orphaned EBS: ₹11,800/month — 29 volumes totalling 3.1 TB in available state, oldest 14 months.
  • Idle RDS: ₹18,200/month — two db.r5.large instances from a discontinued reporting service, still running in the prod account.
  • Inter-region egress: ₹41,600/month — unbatched Kinesis Firehose pushing per-event records cross-region. Switching to 15-minute batches + gzip compression cut this to ₹6,800 post-remediation.
  • Missing Savings Plans: ₹88,000/month — Savings Plan coverage at 22%; remediation targeted 78% coverage with a 3-year no-upfront commitment sized against a 12-month baseline, not current consumption.
  • Oversized compute: ₹47,500/month — four m5.2xlarge workers at 9% CPU rightsized to m5.large; one RDS r5.xlarge to r5.large.
  • Stale snapshots: ₹7,900/month — 2.4 TB of EBS snapshots predating the team's DLM adoption.

Total identified: ₹2,41,400/month — 37.7% of the monthly bill. Verified recovery in the first 90 days after implementation: ₹1,96,000/month (81% of the identified total; the rest was held back because two workloads were slated for architectural refactor in Q2). The customer paid ₹75,000 for the FinOps QuickStart and is now on a ₹50,000/month gain-share retainer. Payback on the engagement was 12 days.
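The headline figures in this pattern study can be recomputed from the line items above. The sketch below does only that: `case_study_check` is a hypothetical name, and the payback calculation assumes a 30-day month.

```shell
# Recompute the case-study totals from the figures listed above.
# Assumption: payback is measured against verified savings over a 30-day month.
case_study_check() {
  awk 'BEGIN {
    bill = 640000; fee = 75000; verified = 196000
    identified = 26400 + 11800 + 18200 + 41600 + 88000 + 47500 + 7900
    printf "identified: INR %d/month (%.1f%% of bill)\n", identified, identified / bill * 100
    printf "verified share: %.0f%%\n", verified / identified * 100
    printf "payback: %.1f days\n", fee / (verified / 30)
  }'
}

case_study_check
# -> identified: INR 241400/month (37.7% of bill)
# -> verified share: 81%
# -> payback: 11.5 days
```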

This is not an edge case. It is the modal pattern for Indian mid-market AWS accounts that have never had a formal FinOps function, which is most of them.

What a proper AWS cost audit actually produces

A credible audit is not a Cost Explorer screenshot with highlighter marks on it. Our Cost module and free 24-hour audit deliver three artefacts:

  1. A leak-by-leak ₹ table for your specific account — not generic ranges, real numbers pulled from your Cost and Usage Report.
  2. A prioritised remediation plan — what to fix this week (low risk, high ₹), this quarter (governance), and this year (architecture-level).
  3. A measurable savings commitment — if you engage us for FinOps QuickStart after the audit, we commit to a verified savings figure, measured against a frozen baseline.

Free 24-hour AWS cost audit

We run this audit for Indian mid-market AWS accounts (₹3L–₹50L/month spend) in 24 hours, with read-only access, for free. No call required. You get a written report by email within one business day of submitting your billing export.

Start your free AWS cost audit → aicloudstrategist.com/audit.html

Or if you'd rather talk first, book a 30-minute Cloud Cost Health Check call.


Founder-led by Anushka B. AICloudStrategist is a founding-cohort FinOps consultancy for Indian mid-market companies (₹5L–₹50L/month cloud spend). First three customers at ₹40,000 for a full FinOps QuickStart. We publish our numbers honestly — including the ones that don't yet exist. See how we prove what we claim.

AICloudStrategist · Founder-led. Enterprise-reviewed. · Written by Anushka B, Founder.
