Matt

Posted on Jun 19 • Edited on Jun 30 • Originally published at fortem.dev

How to Optimize AWS ECS Costs Beyond Reserved Instances

#aws #ecs #fargate #finops

AWS ECS Cost Optimization Beyond Spot and Savings Plans

Originally published at https://fortem.dev/blog/aws-cost-optimization-ecs
Spot and Savings Plans cover the first 30%. Five more levers most ECS teams miss: Graviton, VPC endpoints, Container Insights scoping, shared ALBs, Compute Optimizer.


TL;DR

Fargate compute is only part of your ECS bill. ALBs, NAT Gateway, and Container Insights add 30–52% on top of compute-only estimates in the fleet benchmarks cited below.
The S3 gateway endpoint is free — add it today. Every container image pull currently routes through NAT at $0.045/GB unless you have one.
ARM64/Graviton Fargate is $0.03238 vs $0.04048/vCPU-hr — a flat 20% reduction on all compute, no architectural change required.
AWS Compute Optimizer covers Fargate (free since Dec 2022). One CLI command returns right-sizing recommendations for every service in your fleet.

Ready to use — copy this today

Three commands you can run right now — before changing a single task definition.

1. Add the free S3 gateway endpoint (eliminates NAT charges on ECR image pulls)

aws ec2 create-vpc-endpoint \
  --vpc-id $VPC_ID \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids $RTB_ID \
  --vpc-endpoint-type Gateway

2. Pull Compute Optimizer recommendations fleet-wide

aws compute-optimizer get-ecs-service-recommendations \
  --query 'ecsServiceRecommendations[*].{
    Service:serviceArn,
    Finding:finding,
    CurrentCPU:currentServiceConfiguration.cpu,
    RecommendedCPU:serviceRecommendationOptions[0].cpu
  }' \
  --output table

3. Check Container Insights status per cluster

aws ecs list-clusters | jq -r '.clusterArns[]' | \
  xargs -I{} aws ecs describe-clusters \
    --clusters {} \
    --include SETTINGS \
    --query 'clusters[0].{
      Name:clusterName,
      Insights:settings[?name==`containerInsights`].value|[0]
    }' \
  --output table

You've done Spot and scheduling. Here's where the next 30% hides.

Fargate compute is only part of the ECS bill. Teams that stop at Spot and scheduling still pay $0.045/GB in NAT data processing, $0.07/metric/month for Container Insights they forgot was on, and 20% more per vCPU than Graviton would cost. In the benchmarks cited here, the real bill runs 30–52% above compute-only estimates.

CloudBurn ran the numbers on a real Fargate fleet and found compute-only estimates of $181.77 against an actual bill of $276.27 — a 52% gap driven entirely by ALB base charges, NAT data processing, and CloudWatch metrics. The gap compounds across environments: 10 environments that look like a $1,800/month Fargate bill are actually closer to $2,700. The full Fargate pricing breakdown itemizes every one of these charges.

This article picks up where Spot and scheduling leave off. If you haven't covered those yet, start with how to cut Fargate compute costs with Spot and scheduling — those two moves alone cut 60–70% before touching anything else. Come back here for the second layer.

The five levers below — NAT/VPC endpoints, Graviton, Container Insights, ALB consolidation, and Compute Optimizer — are independent. You can apply any one of them this week without touching the others. Each section includes the dollar math for a 10-service fleet so you can rank them by impact before you start.

NAT Gateway is quietly billing $0.045/GB on every image pull

Every container image pull and AWS API call from a private subnet runs through NAT at $0.045/GB data processing. A 403 MB image pulled 32,000 times (a crash-looping task with a misconfigured health check) costs ~$567 via NAT, versus $0.35 with an S3 gateway endpoint. The gateway endpoint is free. Add it before anything else.

The crash-loop story is instructive. One team deployed a container with a health check misconfiguration: the task started, failed, restarted, and repeated 32,000 times over several days. Each restart pulled the 403 MB image through NAT at $0.045/GB: 403 MB × 32,000 pulls ÷ 1,024 ≈ 12,594 GB, and 12,594 GB × $0.045 ≈ $567. With the free S3 gateway endpoint routing ECR layer downloads over the AWS backbone instead of NAT, the same traffic costs $0.35. The endpoint takes 90 seconds to add.
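The arithmetic above can be reproduced with a small helper. This is a minimal sketch: the function name `nat_pull_cost` is hypothetical, and the default rate is the $0.045/GB NAT data-processing figure quoted in this article.

```shell
# Estimate NAT data-processing cost for repeated image pulls.
# Args: image size in MB, number of pulls, NAT rate in $/GB (default 0.045).
nat_pull_cost() {
  awk -v mb="$1" -v pulls="$2" -v rate="${3:-0.045}" \
    'BEGIN { printf "%.2f\n", mb * pulls / 1024 * rate }'
}

# The crash-loop case from the text: 403 MB image, 32,000 pulls
nat_pull_cost 403 32000
```

Plugging in your own image size and a month of task starts gives a quick upper bound on what image pulls contribute to the NAT line item.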

Option	Hourly	Data processing
NAT Gateway	$0.045/hr	$0.045/GB
Interface endpoint (ECR, CW, etc.)	$0.01/hr/AZ	$0.01/GB
Gateway endpoint (S3, DynamoDB)	Free	Free

The nuance that trips teams up: interface endpoints are not always cheaper than NAT. One interface endpoint costs $0.01/hr/AZ, or ~$7.20/month per AZ over a 720-hour month. Add the five endpoints typically required for private ECS tasks (ECR API, ECR DKR, CloudWatch Logs, Secrets Manager, STS) across 3 AZs: that's 5 × 3 × $7.20 = $108/month in endpoint hourly charges alone, before counting data processing. The fourtheorem team documented exactly this: a setup that looked cheaper with endpoints ($43.84) flipped to $197/month once all required endpoints were counted, versus $100/month with NAT.

“Each service that is not deployable to a VPC requires a new VPC Endpoint… the bills stack up quickly!”

— fourtheorem, Amazon ECS Hidden Costs

KEY INSIGHT: The S3 gateway endpoint is always free and takes 90 seconds to add, so do it unconditionally. Interface endpoints require explicit break-even math: compare the NAT data charges you would avoid ($0.045/GB, less the endpoint's own $0.01/GB) against the fixed cost (number of endpoints × AZs × $7.20). At low data volumes, NAT is cheaper.
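The break-even can be sketched as a one-line calculation. The helper below is hypothetical and rests on two assumptions: the NAT Gateway stays in place for other internet traffic (so its hourly charge is not part of the comparison), and a month is 720 hours, matching the $7.20 figure. The rates are the ones in the table above.

```shell
# Monthly GB at which interface endpoints start beating NAT on data charges.
# Args: number of interface endpoints, number of AZs.
endpoint_break_even_gb() {
  awk -v n="$1" -v az="$2" 'BEGIN {
    fixed = n * az * 0.01 * 720     # endpoint hourly charges per month
    saved_per_gb = 0.045 - 0.01     # NAT rate minus endpoint data rate
    printf "%.0f\n", fixed / saved_per_gb
  }'
}

endpoint_break_even_gb 5 3   # five endpoints across three AZs
endpoint_break_even_gb 1 1   # a single endpoint in a single AZ
```

If the traffic you would move off NAT each month is below the printed figure, the endpoints cost more than they save.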

Check whether you already have the S3 gateway endpoint in place:

aws ec2 describe-vpc-endpoints \
  --filters \
    "Name=service-name,Values=com.amazonaws.us-east-1.s3" \
    "Name=vpc-endpoint-type,Values=Gateway" \
  --query 'VpcEndpoints[*].{State:State,VPC:VpcId}'

Empty output means you don't have one. The create command is in the Ready-to-use block above.

Switch to Graviton and take 20% off every Fargate task

ARM64 Fargate costs $0.03238/vCPU-hr vs $0.04048 for x86, about 20% less. For 10 services × 3 tasks × 2 vCPU running 730 hours, that's $355/month saved on vCPU charges with no infrastructure change: just rebuild images for linux/arm64.

The math is clean. For a 10-service fleet where each service runs 3 tasks at 2 vCPU:

Graviton savings: 10-service fleet

x86: 10 × 3 × 2 vCPU × $0.04048 × 730 hr = $1,773/mo

ARM64: 10 × 3 × 2 vCPU × $0.03238 × 730 hr = $1,418/mo

Savings: $355/mo · $4,257/yr

That's at 2 vCPU per task. At 0.5 vCPU (smaller tasks), the same formula yields ~$89/month, still worth it for zero architectural work. These figures count vCPU charges only. ARM64 memory is also priced about 20% lower ($0.00356 vs $0.004445 per GB-hr in us-east-1), so the total saving is somewhat larger.
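To rerun the fleet math with your own numbers, a sketch along these lines may help. The function name is hypothetical; the rates are the per-vCPU-hour prices quoted in this section, and the calculation covers vCPU charges only over a 730-hour month.

```shell
# Monthly vCPU cost on x86, on ARM64, and the difference.
# Args: services, tasks per service, vCPU per task.
graviton_vcpu_savings() {
  awk -v s="$1" -v t="$2" -v v="$3" 'BEGIN {
    hours = 730
    x86 = s * t * v * 0.04048 * hours
    arm = s * t * v * 0.03238 * hours
    printf "%.0f %.0f %.0f\n", x86, arm, x86 - arm
  }'
}

graviton_vcpu_savings 10 3 2     # the 2 vCPU fleet from the text
graviton_vcpu_savings 10 3 0.5   # the same fleet at 0.5 vCPU per task
```

The three printed columns are x86 cost, ARM64 cost, and monthly savings in dollars.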

KEY INSIGHT: Graviton + Spot is the highest-impact combination on Fargate. The 20% Graviton discount stacks with the ~70% Spot discount for a combined ~76% reduction versus x86 on-demand. For dev environments that already run on Spot, switching to ARM64 pulls another 20% out of the remaining compute spend.
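The ~76% figure comes from multiplying the remaining fractions, not from adding the discounts. A minimal sketch, with a hypothetical helper name and the two discount percentages taken from the paragraph above:

```shell
# Combined reduction when two percentage discounts stack multiplicatively.
# Args: first discount in %, second discount in %.
combined_discount() {
  awk -v a="$1" -v b="$2" \
    'BEGIN { printf "%.0f\n", (1 - (1 - a / 100) * (1 - b / 100)) * 100 }'
}

combined_discount 20 70   # Graviton 20% on top of Spot ~70%
```

Adding the percentages (20 + 70) would overstate the saving; the second discount applies only to what the first one leaves.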

What needs to change: rebuild Docker images with docker buildx build --platform linux/arm64, then update the ECS task definition's runtimePlatform.cpuArchitecture to ARM64. Most Python, Node.js, Go, Java, and Ruby stacks rebuild without any code changes. Test before flipping production — native code compiled for x86, some C-extension Python packages, and kernel-module-dependent workloads need verification.

For the full Spot setup and capacity provider strategy, see how to cut ECS Fargate costs by 65%. Graviton and Spot configure independently — you can add Graviton to an existing Spot setup today.

Container Insights Enhanced: $0.07/metric/month, and it multiplies with fleet size

Enhanced Container Insights for ECS charges $0.07 per metric per month. AWS's own example: 1 cluster, 5 services, 20 tasks, 50 containers = 2,264 metrics = $158.48/month. At 10 environments that's over $1,500/month for observability you may not be actively using.

The metric count compounds with fleet size because each container generates multiple metrics: CPU utilization, memory utilization, network bytes in/out, storage read/write, task count. AWS's CloudWatch pricing page gives the worked example directly: 1 cluster with 5 services, 20 tasks, 50 containers generates 2,264 metrics. At $0.07 each: $158.48/month per cluster.

Container Insights fleet math

1 cluster (50 containers): 2,264 metrics × $0.07 = $158.48/mo

5 clusters (dev + staging + prod × regions): ~$792/mo

10 clusters (multi-environment fleet): ~$1,585/mo

Fix: enable Enhanced on prod clusters only → $158/mo vs $1,585/mo
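The fleet math above scales linearly with cluster count, which a short helper makes easy to replay. This is a sketch under stated assumptions: the function name is hypothetical, and the default of 2,264 metrics per cluster is the AWS worked example cited in this section, so your own clusters will differ.

```shell
# Monthly Enhanced Container Insights cost.
# Args: number of clusters, metrics per cluster (default 2264).
insights_monthly_cost() {
  awk -v clusters="$1" -v metrics="${2:-2264}" \
    'BEGIN { printf "%.2f\n", clusters * metrics * 0.07 }'
}

insights_monthly_cost 1    # prod only
insights_monthly_cost 10   # Enhanced left on everywhere
```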

Enhanced Container Insights is opt-in per cluster — it is not on by default for all clusters. The problem is that teams often enable it when debugging a production issue and never disable it on the dev/staging clusters where they also turned it on. Standard CloudWatch metrics (CPU, memory at task level) still work without Enhanced — they're the base tier and are billed differently. Enhanced adds per-container-level metrics, ECS-specific dimensions, and storage metrics.

Check which clusters have Enhanced enabled, then disable it on non-production clusters:

# Check current status per cluster
aws ecs describe-clusters \
  --clusters YOUR_CLUSTER_NAME \
  --include SETTINGS \
  --query 'clusters[0].settings'

# Turn Container Insights off on a cluster
# (value=enabled keeps the standard tier; value=enhanced is the Enhanced tier)
aws ecs update-cluster-settings \
  --cluster YOUR_CLUSTER_NAME \
  --settings name=containerInsights,value=disabled
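For more than a handful of clusters, it may be safer to generate the commands first and review them before running anything. The sketch below is a dry run: it only prints commands. The function name is hypothetical, and the "prod" substring match is an assumed naming convention that you would adjust to your own cluster names (note that it would also keep a cluster named "nonprod").

```shell
# Read cluster names on stdin, one per line, and print the update command
# for every cluster whose name does not contain "prod". Nothing is executed.
plan_insights_changes() {
  while IFS= read -r cluster; do
    case "$cluster" in
      "") continue ;;
      *prod*) echo "# keep: $cluster" ;;
      *) echo "aws ecs update-cluster-settings --cluster $cluster --settings name=containerInsights,value=disabled" ;;
    esac
  done
}

# Example input; in practice, feed it the names from aws ecs list-clusters
printf '%s\n' dev-api staging-api prod-api | plan_insights_changes
```

Once the printed plan looks right, the lines can be pasted into a shell or piped to `sh`.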

For controlling CloudWatch log costs across ECS at fleet scale, see controlling CloudWatch log costs for ECS, which covers log group retention, FireLens vs awslogs, and the log-volume math by service type.

One ALB per environment, not one per service

Each ALB costs $16–20/month in base hourly charges before LCUs. Teams that provision one ALB per microservice per environment run 50–100+ ALBs. One team reduced from 270 ALBs to 9 by switching to host-based routing — one ALB per environment, listener rules route to services.

The Signiant engineering team documented their ALB consolidation in detail: they had 270 ALBs across their infrastructure, running at roughly $16–20 each per month. Switching to a shared ALB model with host-based routing (*.service.env.internal → listener rules → target groups) reduced that to 9 ALBs, eliminating 261 base charges. At $18/month average: $4,698/month removed from the bill.

The anti-pattern is common: a Terraform module creates one ECS service and one ALB together, so teams end up with an ALB per service per environment by default. Shared ALBs require slightly more routing configuration but the cost argument is clear past 3–4 services per environment.

ALB consolidation also reduces IPv4 address charges. Since February 2024, AWS charges $0.005/hr per public IPv4 address, and each internet-facing ALB holds at least one per enabled Availability Zone (two minimum). Even counting a single address per ALB, 270 ALBs cost 270 × $0.005 × 730 hr = ~$985/month in IPv4 charges alone.
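The two ALB savings, base charges and public IPv4 charges, can be combined in one estimate. This sketch uses a hypothetical function name, the $18/month average base charge from the Signiant example, and a conservative assumption of one public IPv4 address per removed ALB.

```shell
# Monthly savings from removing ALBs: base hourly charges and IPv4 charges.
# Args: ALB count before, ALB count after, base $/month per ALB (default 18).
alb_consolidation_savings() {
  awk -v before="$1" -v after="$2" -v base="${3:-18}" 'BEGIN {
    removed = before - after
    base_saved = removed * base
    ipv4_saved = removed * 0.005 * 730   # one public IPv4 per removed ALB
    printf "%.0f %.2f\n", base_saved, ipv4_saved
  }'
}

alb_consolidation_savings 270 9   # the 270-to-9 consolidation from the text
```

The first column is the base-charge saving and the second is the IPv4 saving; LCU charges are left out because they follow traffic, not ALB count.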

A shared ALB setup in Terraform:

# Shared ALB — one per environment
resource "aws_lb" "env" {
  name               = "${var.env_name}-alb"
  internal           = false
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids
  security_groups    = [aws_security_group.alb.id]
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.env.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.acm_cert_arn

  default_action {
    type = "fixed-response"
    fixed_response {
      content_type = "text/plain"
      message_body = "no route"
      status_code  = "404"
    }
  }
}

# Per-service listener rule — host-based routing
resource "aws_lb_listener_rule" "api" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }

  condition {
    host_header {
      values = ["api.${var.env_name}.example.com"]
    }
  }
}

resource "aws_lb_listener_rule" "worker" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 110

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.worker.arn
  }

  condition {
    host_header {
      values = ["worker.${var.env_name}.example.com"]
    }
  }
}

Caveat: WebSocket services and services that require conflicting port bindings may need their own ALB. Otherwise, one ALB per environment handles up to 100 listener rules (the default soft limit, extendable via quota request).

Compute Optimizer runs a free fleet right-sizing pass — and it's scriptable

AWS Compute Optimizer has supported Fargate since December 2022 at no charge. get-ecs-service-recommendations returns CPU and memory recommendations at both task and container level. Script it across all services and diff against current task definitions to find over-provisioned tasks fleet-wide.

Compute Optimizer launched ECS Fargate support on December 23, 2022 — it's free, and most teams haven't used it. The tool analyzes CloudWatch utilization metrics from the trailing 14 days and returns recommendations at two levels: the task definition (overall CPU/memory) and the individual container (container-level CPU/memory shares within the task). For over-provisioned long-running services, the claimed savings are 30–70% of compute spend.

The gotcha that wastes time: Compute Optimizer won't generate recommendations for a service if a target-tracking Auto Scaling policy is attached to CPU or memory for that service. If you check a service and get no recommendations, verify whether it has an Application Auto Scaling target-tracking policy. Recommendations require at least 24 hours of CloudWatch and ECS utilization data in the trailing 14-day window.

Fleet script — loop over all clusters and return recommendations for every service:

#!/bin/bash
# Get Compute Optimizer recommendations for all ECS services
# Requires: AWS CLI v2, jq

REGION="${AWS_DEFAULT_REGION:-us-east-1}"

echo "Fetching all clusters..."
CLUSTERS=$(aws ecs list-clusters --region "$REGION" --query 'clusterArns[]' --output text)

for CLUSTER in $CLUSTERS; do
  CLUSTER_NAME=$(basename "$CLUSTER")
  echo ""
  echo "=== $CLUSTER_NAME ==="

  SERVICES=$(aws ecs list-services \
    --cluster "$CLUSTER" \
    --region "$REGION" \
    --query 'serviceArns[]' \
    --output text)

  for SERVICE_ARN in $SERVICES; do
    # Service-level CPU/memory recommendations live under serviceRecommendationOptions
    RESULT=$(aws compute-optimizer get-ecs-service-recommendations \
      --service-arns "$SERVICE_ARN" \
      --region "$REGION" \
      --query 'ecsServiceRecommendations[0].{
        Finding: finding,
        CurrentCPU: currentServiceConfiguration.cpu,
        CurrentMem: currentServiceConfiguration.memory,
        RecommendedCPU: serviceRecommendationOptions[0].cpu,
        RecommendedMem: serviceRecommendationOptions[0].memory
      }' \
      --output json 2>/dev/null)

    if [ -n "$RESULT" ] && [ "$RESULT" != "null" ]; then
      SERVICE_NAME=$(basename "$SERVICE_ARN")
      # Merge the service name into the object so every output line is valid JSON
      echo "$RESULT" | jq -c --arg service "$SERVICE_NAME" '{Service: $service} + .'
    fi
  done
done

To use this in CI: run the script, parse the JSON output, and compare recommended vs current CPU/memory in each task definition. Flag as a PR comment or Slack alert when drift exceeds 20%. Teams that do this catch over-provisioning at deploy time before it accumulates.
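The drift check itself can be a few lines of shell. This is a sketch, not a prescribed implementation: the function names are hypothetical, and "drift" is assumed here to mean how far the current allocation sits above the recommendation, as a percentage of the current allocation.

```shell
# Percentage by which the current allocation exceeds the recommendation.
# Args: current value, recommended value (same unit, e.g. CPU units or MiB).
drift_pct() {
  awk -v cur="$1" -v rec="$2" \
    'BEGIN { printf "%.0f\n", (cur - rec) / cur * 100 }'
}

# Succeeds (exit 0) when drift is above the threshold (default 20%).
exceeds_drift() {
  [ "$(drift_pct "$1" "$2")" -gt "${3:-20}" ]
}

drift_pct 1024 512
exceeds_drift 1024 512 && echo "flag: over-provisioned"
```

In a pipeline, the `exceeds_drift` exit status can gate whatever posts the PR comment or Slack alert.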

“If you've estimated your ECS costs based only on Fargate compute pricing, you're probably underestimating by 30–50%.”

— CloudBurn, AWS Fargate Pricing: Real Costs

If you read this, you might also want to know

How do I know if my ECS tasks are right-sized without Compute Optimizer?

Check the service's CPUUtilization and MemoryUtilization metrics in CloudWatch. Look at p95 over a 2-week window. If p95 CPU stays below 30% of the task allocation, drop to the next Fargate size (e.g. 1 vCPU → 0.5 vCPU). If p95 memory stays below 40%, halve the memory allocation. The service-level metrics in the AWS/ECS namespace are published at no extra charge, even without Container Insights; Container Insights adds task- and container-level detail on top.
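The rule of thumb above is simple enough to encode. A minimal sketch, assuming you have already read the two p95 percentages from CloudWatch; the function name and the output labels are hypothetical.

```shell
# Apply the 30% CPU / 40% memory thresholds from the answer above.
# Args: p95 CPU as % of allocation, p95 memory as % of allocation.
rightsize_hint() {
  awk -v cpu="$1" -v mem="$2" 'BEGIN {
    hint = ""
    if (cpu < 30) hint = hint " cpu:step-down"
    if (mem < 40) hint = hint " mem:halve"
    if (hint == "") hint = " ok"
    print substr(hint, 2)   # drop the leading space
  }'
}

rightsize_hint 22 35   # both below threshold
rightsize_hint 55 70   # neither below threshold
```

Treat the output as a prompt to look closer, not an instruction: the new CPU and memory values still have to form a valid Fargate combination.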

Can I mix x86 and ARM64 tasks in the same ECS cluster?

Yes — ECS clusters are architecture-agnostic. Each task definition specifies its own runtimePlatform.cpuArchitecture. You can migrate services one at a time: flip a dev task definition to ARM64, test for a week, then move staging and production. No cluster changes required.

How do I set up AZ-aware task placement to avoid cross-AZ charges?

On the EC2 launch type, use the spread placement strategy with field=attribute:ecs.availability-zone to distribute tasks across AZs. Fargate does not support placement strategies; it spreads tasks across the AZs of the subnets you give the service. In both cases, ensure your target groups use the same AZ as the tasks routing through them. Cross-AZ traffic in the same region costs $0.01/GB each direction. For high-throughput services, co-locating the ALB target group and the task in the same AZ eliminates this charge.

What happens to in-flight requests when a Fargate Spot task is interrupted?

ECS sends SIGTERM to the container and fires an EventBridge event with stopCode SpotInterruption. You have up to 2 minutes before the task is forcibly stopped. Set stopTimeout to 120 seconds in your task definition to use the full window. Configure your ALB to deregister the target before SIGTERM (the connection draining period handles this). In-flight requests that complete within the draining window are served normally.

Common questions

Are VPC endpoints always cheaper than NAT Gateway for ECS?

Not always. The free S3 gateway endpoint is always worth adding: it cuts NAT costs on ECR image pulls at zero cost. Interface endpoints for ECR API, CloudWatch Logs, and Secrets Manager each cost ~$7.20/month per AZ. If you need all five common endpoints in 3 AZs, that's ~$108/month, which can exceed your NAT costs at low data volumes. Run the math: monthly NAT data charges vs. (number of endpoints × AZs × $7.20). Each GB moved from NAT ($0.045/GB) to an interface endpoint ($0.01/GB) saves $0.035, so the break-even is roughly 206 GB/month per endpoint per AZ, or about 3,100 GB/month of relevant traffic for five endpoints across 3 AZs.

Does AWS Compute Optimizer work with ECS Fargate?

Yes — Compute Optimizer has covered Fargate since December 23, 2022 at no charge. It recommends CPU and memory at both task and container level. Requirements: at least 24 hours of CloudWatch and ECS utilization data in the trailing 14 days. One gotcha: it won't generate recommendations if a target-tracking Auto Scaling policy is attached to CPU or memory for the service.

How much does Container Insights cost for ECS?

Enhanced Container Insights charges $0.07 per metric per month, prorated hourly. AWS's own example for a single cluster with 5 services, 20 tasks, and 50 containers: 2,264 metrics = $158.48/month. Standard CloudWatch custom metrics are a separate charge. Enhanced is opt-in per cluster — check which clusters have it enabled with: aws ecs describe-clusters --clusters CLUSTER --include SETTINGS

How do I migrate ECS Fargate tasks to Graviton/ARM64?

Rebuild your container images for linux/arm64 using Docker Buildx (docker buildx build --platform linux/arm64 -t myimage:arm64 .), then update your ECS task definition to set runtimePlatform.cpuArchitecture to ARM64 and runtimePlatform.operatingSystemFamily to LINUX. Most Python, Node.js, Go, Java, and Ruby stacks rebuild without code changes. Test binaries compiled for x86: they won't run on ARM64 without recompilation.

What is the Fargate Spot interruption rate?

AWS does not publish a specific Fargate Spot interruption rate. In practice, Spot interruptions are infrequent for most workloads, typically a single-digit percentage of tasks per month in stable capacity regions, though this varies by AZ and capacity demand. ECS gives a 2-minute warning via both a SIGTERM to the container and an EventBridge event (source aws.ecs, detail-type ECS Task State Change, stopCode SpotInterruption). Set stopTimeout to 120 seconds or less in your task definition to handle the full window.

Worth reading

Tool · AWS ECS Fargate Cost Calculator: Calculate your real Fargate bill before and after optimization. Enter fleet size, see the dollar impact of scheduling.

Guide · How to Cut AWS ECS Fargate Costs by 65%: Scheduling, right-sizing, Spot, and orphaned environments, the four methods that take a 12-environment fleet from $1,730 to $380/month.

See your real per-env cost: fortem.dev/ecs-cost-calculator
