DEV Community

I Built a Go Tool That Found $4,200/yr of Wasted AWS Spend (Cost Explorer API), Part 1

Our AWS bill went up about 6% a quarter for a year and nobody could say why. No new product, no traffic spike, only a number drifting up and to the right. When I sat down with Cost Explorer to find the leak, I hit what makes AWS waste so sticky: the console shows you one resource at a time and never adds it up. A $9/month volume here, an idle instance there, a forgotten Elastic IP. Each one is too small to bother with, and no screen tells you they sum to $350 a month of nothing.

I built that screen. It's a small Go CLI called costsweep that walks the account, prices every dead resource it finds, and prints one number: the annual total. On our account that number came back at $4,204/year, and most of it was four categories I could have killed in an afternoon. The code is on GitHub, and this is the first of three posts building it from scratch.

This post is the foundation: the project layout, talking to the Cost Explorer API from Go, and the piece I care about most, an interface seam that lets me test the whole thing without touching a real AWS account.

The Problem, Stated Precisely

"Find wasted spend" is too vague to build. Let me narrow it to things that are (a) recurring charges, (b) safe to call waste with high confidence, and (c) cheap to detect through an API:

  1. EBS volumes in the available state: detached from any instance, still billing per GiB.
  2. Elastic IPs associated with nothing: pure rent on an address you reserved and forgot.
  3. Snapshots older than some threshold: not always waste, but worth a look.
  4. Idle and oversized EC2 instances: the expensive case, and the one AWS already computes for you.

That last category does the heavy lifting. AWS watches your instances' CloudWatch metrics for two weeks and produces rightsizing recommendations ("this box averaged 1% CPU, terminate it"), each with its own dollar estimate. Cost Explorer shows them in the console but never totals them. I started there, because it's the biggest line and the one place I can trust AWS's own math instead of pricing anything myself.

What I Looked At First

Before writing Go I checked what already exists. Trusted Advisor has cost checks, but the useful ones sit behind a Business support plan. Compute Optimizer overlaps with Cost Explorer's rightsizing. There are good commercial tools (CloudZero, Vantage, nOps), but they're SaaS you point at your account, not a 13MB binary you can read end to end and run in CI. The one Go example I found for the Cost Explorer API was a "teaching myself Go" repo that posted daily spend to Slack: close, but not a waste finder.

The gap was a small, auditable, self-hosted CLI. That's what this is. The Cost Explorer API is in the AWS SDK for Go v2 as service/costexplorer, and the call I want is GetRightsizingRecommendation.

The Shape of the Tool

Every scanner reduces its results to one type, whether it talks to Cost Explorer, EC2, or something I add later. That type is the contract between the half of the program that talks to AWS and the half that formats numbers:

// Finding is one piece of priced waste. Monthly stored, annual derived.
type Finding struct {
    Type       string   `json:"type"`        // e.g. "unattached-ebs"
    Resource   string   `json:"resource"`    // ARN or ID
    Region     string   `json:"region"`
    MonthlyUSD float64  `json:"monthly_usd"`
    Detail     string   `json:"detail"`
    Action     string   `json:"action"`
    Severity   Severity `json:"severity"`
}

I store the estimate monthly, not annual, because that's how AWS bills and how the console reports. The number people react to is the annual one, so it's a derived method, not a stored field:

func (f Finding) AnnualUSD() float64 { return f.MonthlyUSD * 12 }

That one line carries the tool's whole argument. A $9/month charge gets ignored. The same charge written as $108/year gets a cleanup ticket.
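
To make the arithmetic concrete, here is a minimal, self-contained sketch that totals a few findings. The Finding type is trimmed to the fields used here, totals is a hypothetical helper (the real totalling lives in internal/report and is covered in Part 3), and the sample resources, type names, and amounts are invented for illustration:

```go
package main

import "fmt"

// Finding is a trimmed copy of the type above: only the fields this sketch needs.
type Finding struct {
	Type       string
	Resource   string
	MonthlyUSD float64
}

// AnnualUSD derives the yearly figure from the stored monthly one.
func (f Finding) AnnualUSD() float64 { return f.MonthlyUSD * 12 }

// totals is a hypothetical helper, not part of costsweep's published code.
// It sums the monthly and annual estimates across all findings.
func totals(fs []Finding) (monthly, annual float64) {
	for _, f := range fs {
		monthly += f.MonthlyUSD
		annual += f.AnnualUSD()
	}
	return monthly, annual
}

func main() {
	// Invented sample data: three charges that each look ignorable alone.
	fs := []Finding{
		{Type: "unattached-ebs", Resource: "vol-example1", MonthlyUSD: 9},
		{Type: "unattached-ebs", Resource: "vol-example2", MonthlyUSD: 4},
		{Type: "idle-eip", Resource: "eipalloc-example", MonthlyUSD: 3},
	}
	m, a := totals(fs)
	fmt.Printf("monthly $%.2f, annual $%.2f\n", m, a)
}
```

With these made-up inputs the sketch prints monthly $16.00, annual $192.00: three line items nobody would open a ticket for, and one annual figure somebody might.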

The packages follow from the Finding contract:

  • internal/finding: the type above and nothing else.
  • internal/scan: the scanners and the narrow interfaces they talk through.
  • internal/pricing: turns "100 GiB gp3 in us-east-1" into dollars (Part 2).
  • internal/report: sorts, totals, and renders (Part 3).

The Seam That Makes It Testable

One design decision holds the project together: the scanners don't take a *costexplorer.Client. They take an interface that lists only the calls they make:

// CostExplorerAPI is the CE client subset we use. CE is global (us-east-1).
type CostExplorerAPI interface {
    GetRightsizingRecommendation(context.Context, *costexplorer.GetRightsizingRecommendationInput, ...func(*costexplorer.Options)) (*costexplorer.GetRightsizingRecommendationOutput, error)
    GetCostAndUsage(context.Context, *costexplorer.GetCostAndUsageInput, ...func(*costexplorer.Options)) (*costexplorer.GetCostAndUsageOutput, error)
}

The real *costexplorer.Client satisfies this for free; its methods already have these signatures. In a test, a fake with a single field satisfies it too. So I can feed the scanner a canned recommendation and assert on the Finding it produces, with no credentials, no network, and no spend. For a tool whose whole job is calling billed AWS APIs, offline tests separate code I trust from code I poke at and hope.
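
Here is a minimal sketch of that pattern, using simplified stand-in types so it runs without the AWS SDK. Recommendation, RecsOutput, RecsAPI, fakeRecs, and countRecs are illustrative names, not the SDK's or costsweep's; the real interface uses the SDK-typed signatures shown above, and the real fake may be built differently:

```go
package main

import (
	"context"
	"fmt"
)

// Recommendation and RecsOutput are simplified stand-ins for the SDK's
// response types, so this sketch compiles with the standard library alone.
type Recommendation struct {
	ResourceID     string
	MonthlySavings string // money as a string, the way Cost Explorer returns it
}

type RecsOutput struct {
	Recommendations []Recommendation
}

// RecsAPI lists only the call the consumer makes (the narrow-interface idea).
type RecsAPI interface {
	GetRecommendations(ctx context.Context) (*RecsOutput, error)
}

// fakeRecs is the whole test double: one field holding the canned response.
type fakeRecs struct {
	recs []Recommendation
}

// GetRecommendations returns the canned data without touching the network.
func (f *fakeRecs) GetRecommendations(ctx context.Context) (*RecsOutput, error) {
	return &RecsOutput{Recommendations: f.recs}, nil
}

// countRecs depends on the interface, so it accepts a real client or the fake.
func countRecs(ctx context.Context, api RecsAPI) (int, error) {
	out, err := api.GetRecommendations(ctx)
	if err != nil {
		return 0, err
	}
	return len(out.Recommendations), nil
}

func main() {
	fake := &fakeRecs{recs: []Recommendation{
		{ResourceID: "i-1234idle", MonthlySavings: "30.00"},
	}}
	n, err := countRecs(context.Background(), fake)
	fmt.Println(n, err)
}
```

The consumer never learns which implementation it got, which is the point: the same function body runs against canned data in a test and against AWS in production.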

Every scanner implements the same tiny interface:

// Scanner is one source of waste. Returns errors (e.g. missing IAM) instead of
// panicking, so one failure degrades the run rather than killing it.
type Scanner interface {
    Name() string
    Scan(ctx context.Context) ([]finding.Finding, error)
}
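
One way a runner could honor that contract is sketched below. The names runAll, staticScanner, and errDenied, along with the trimmed Finding, are hypothetical and exist only for illustration; costsweep's actual runner may differ. The sketch keeps findings from scanners that succeed and joins the errors from those that fail (errors.Join needs Go 1.20 or later):

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Finding is trimmed to the fields this sketch needs.
type Finding struct {
	Type       string
	MonthlyUSD float64
}

// Scanner mirrors the interface above, using the local Finding type.
type Scanner interface {
	Name() string
	Scan(ctx context.Context) ([]Finding, error)
}

// errDenied simulates a scanner failing on a missing IAM permission.
var errDenied = errors.New("access denied")

// runAll is a hypothetical runner: a failing scanner contributes an error,
// labeled with its name, but does not discard the other scanners' findings.
func runAll(ctx context.Context, scanners []Scanner) ([]Finding, error) {
	var all []Finding
	var errs []error
	for _, s := range scanners {
		fs, err := s.Scan(ctx)
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.Name(), err))
			continue
		}
		all = append(all, fs...)
	}
	// errors.Join returns nil when errs is empty.
	return all, errors.Join(errs...)
}

// staticScanner is a stand-in scanner that returns canned results.
type staticScanner struct {
	name     string
	findings []Finding
	err      error
}

func (s staticScanner) Name() string { return s.name }

func (s staticScanner) Scan(ctx context.Context) ([]Finding, error) {
	return s.findings, s.err
}

func main() {
	scanners := []Scanner{
		staticScanner{name: "rightsizing", findings: []Finding{{Type: "rightsizing-terminate", MonthlyUSD: 30}}},
		staticScanner{name: "ebs", err: errDenied},
	}
	fs, err := runAll(context.Background(), scanners)
	fmt.Println(len(fs), err)
}
```

Here the second scanner fails, yet the caller still receives the one finding from the first, plus an error naming the scanner that broke.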

The Rightsizing Scanner

Now the Cost Explorer call. GetRightsizingRecommendation returns a list of recommendations, each tagged TERMINATE (kill this idle box) or MODIFY (drop it a size). The one wrinkle: the savings estimate lives in a different field depending on the type. The scanner pages through the results and hands each to a flattener:

func (s *RightsizingScanner) Scan(ctx context.Context) ([]finding.Finding, error) {
    var out []finding.Finding
    var token *string
    for {
        resp, err := s.Client.GetRightsizingRecommendation(ctx, &costexplorer.GetRightsizingRecommendationInput{
            Service:       aws.String("AmazonEC2"),
            NextPageToken: token,
        })
        if err != nil {
            return nil, err
        }
        for _, rec := range resp.RightsizingRecommendations {
            out = append(out, recToFinding(rec))
        }
        if resp.NextPageToken == nil || *resp.NextPageToken == "" {
            break
        }
        token = resp.NextPageToken
    }
    return out, nil
}

Service must be "AmazonEC2"; rightsizing covers only EC2 today. The flattener normalizes the two recommendation shapes. For rightsizing I price nothing myself: I pass through AWS's own monthly savings number, computed from the same data that generates your invoice:

func recToFinding(rec cetypes.RightsizingRecommendation) finding.Finding {
    resource := "unknown-instance"
    if rec.CurrentInstance != nil && rec.CurrentInstance.ResourceId != nil {
        resource = *rec.CurrentInstance.ResourceId
    }

    switch rec.RightsizingType {
    case cetypes.RightsizingTypeTerminate:
        var savings float64
        if d := rec.TerminateRecommendationDetail; d != nil {
            savings = parseUSD(d.EstimatedMonthlySavings)
        }
        return finding.Finding{
            Type:       "rightsizing-terminate",
            Resource:   resource,
            MonthlyUSD: savings,
            Detail:     "Cost Explorer flags this instance as idle (terminate candidate)",
            Action:     "confirm it's unused, then terminate",
            Severity:   finding.High,
        }
    default: // MODIFY
        var savings float64
        if d := rec.ModifyRecommendationDetail; d != nil && len(d.TargetInstances) > 0 {
            savings = parseUSD(d.TargetInstances[0].EstimatedMonthlySavings)
        }
        return finding.Finding{
            Type:       "rightsizing-modify",
            Resource:   resource,
            MonthlyUSD: savings,
            Detail:     "Cost Explorer recommends a smaller instance type",
            Action:     "downsize to the recommended type after a load check",
            Severity:   finding.Medium,
        }
    }
}

One AWS quirk: Cost Explorer returns money as strings, like "138.7000000000", not floats. So a tiny parser turns a blank or unparseable value into 0 rather than an error. A recommendation with no savings estimate still surfaces; it shouldn't crash the run or inflate the total:

func parseUSD(p *string) float64 {
    if p == nil {
        return 0
    }
    v, err := strconv.ParseFloat(*p, 64)
    if err != nil {
        return 0
    }
    return v
}
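
A quick check of that behavior, runnable on its own. parseUSD is copied from above; the inputs are illustrative, and strPtr is a local helper standing in for aws.String so the sketch needs no SDK import:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseUSD is the parser from the post: nil or unparseable input becomes 0.
func parseUSD(p *string) float64 {
	if p == nil {
		return 0
	}
	v, err := strconv.ParseFloat(*p, 64)
	if err != nil {
		return 0
	}
	return v
}

// strPtr is a local stand-in for aws.String.
func strPtr(s string) *string { return &s }

func main() {
	fmt.Println(parseUSD(strPtr("138.7000000000"))) // 138.7
	fmt.Println(parseUSD(strPtr("")))               // 0
	fmt.Println(parseUSD(strPtr("n/a")))            // 0
	fmt.Println(parseUSD(nil))                      // 0
}
```

One caveat: strconv.ParseFloat also accepts strings such as "NaN" and "Inf" without returning an error. Cost Explorer is not expected to send those, but a stricter version could reject them with math.IsNaN and math.IsInf before the value reaches the total.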

Wiring the Client

I build the real client once, in main. Remember that Cost Explorer is a global service that answers only in us-east-1, wherever your instances live. Build it against eu-west-1 and you get an endpoint error, so I pin the region:

cfg, err := awsconfig.LoadDefaultConfig(ctx, awsconfig.WithRegion(region))
// ...
cec := costexplorer.NewFromConfig(cfg, func(o *costexplorer.Options) {
    o.Region = "us-east-1"
})
scanners := []scan.Scanner{
    &scan.RightsizingScanner{Client: cec},
    // resource scanners join here in Part 2
}

LoadDefaultConfig picks up credentials the way the AWS CLI does (environment variables, ~/.aws/credentials, SSO, instance roles), so there's no credential handling code to get wrong.

Test Output

The interface seam pays off in the test. The terminate-path test builds a fake Cost Explorer client from a struct literal, hands the scanner one recommendation with "30.00" of savings, and asserts on the Finding:

func TestRightsizingScannerTerminate(t *testing.T) {
    cec := &fakeCE{recs: []cetypes.RightsizingRecommendation{{
        RightsizingType: cetypes.RightsizingTypeTerminate,
        CurrentInstance: &cetypes.CurrentInstance{ResourceId: aws.String("i-1234idle")},
        TerminateRecommendationDetail: &cetypes.TerminateRecommendationDetail{
            EstimatedMonthlySavings: aws.String("30.00"),
        },
    }}}
    s := &RightsizingScanner{Client: cec}
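    got, err := s.Scan(context.Background())
    if err != nil {
        t.Fatalf("Scan: %v", err)
    }
    // Guard the index below: an empty result should fail the test, not panic.
    if len(got) != 1 {
        t.Fatalf("got %d findings, want 1", len(got))
    }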
    if got[0].MonthlyUSD != 30.0 {
        t.Errorf("monthly = %.2f, want 30.00", got[0].MonthlyUSD)
    }
}

Running the suite, no AWS account in sight:

$ go test ./internal/scan/ -v
=== RUN   TestRightsizingScannerTerminate
--- PASS: TestRightsizingScannerTerminate (0.00s)
=== RUN   TestRunCollectsAcrossScanners
--- PASS: TestRunCollectsAcrossScanners (0.00s)
PASS
ok      github.com/rezmoss/costsweep/internal/scan  0.247s

To see the end-to-end shape before the other scanners exist, I bundled a sample dataset into the binary behind a -demo flag (more on that in Part 3). It reads the same Finding type the live path produces, so the output has the same format as a real run:

$ costsweep -demo
TYPE                   RESOURCE             REGION          MONTHLY     ANNUAL
------------------------------------------------------------------------------
rightsizing-terminate  i-04e1f7a9c2b3d5e60  us-east-1       138.70      1664
rightsizing-modify     i-09a2bc4d6e8f1a2b3  us-east-1        69.35       832
...
------------------------------------------------------------------------------
TOTAL (13 findings)                                         350.30      4204

That's $4,204/year, the number that started this, reproducible from one command.

Next Step

The rightsizing scanner is the biggest line, but it covers only EC2, and only instances AWS has two weeks of metrics on. The cheap, certain waste (detached volumes, idle IPs, forgotten snapshots) lives in the EC2 API, and AWS won't price it for me. Next I'll write those three resource scanners against the same interface seam, build the pricing table that turns "200 GiB gp2" into a dollar figure, and push the total up.

The full tool is on GitHub. Clone it and run costsweep -demo to see the whole thing before we've finished building it.
