DEV Community

Suhas Mallesh


Your Docker Registry Has 500 Unused Images Costing You Every Month 🐳

ECR bills per GB-month for every Docker image left in your registry. Here's how to auto-delete old images with ECR lifecycle policies managed in Terraform and reclaim 80% of that storage.

tags: aws, terraform, docker, devops

Quick question: How many Docker images are in your ECR registry right now?

If you've been pushing images from CI/CD for the past year, the answer is probably hundreds or thousands.

Here's the problem:

ECR storage: $0.10/GB-month
Your images: 500 images × 500MB avg = 250GB
Monthly cost: $25
Annual cost: $300

And it grows every single day.

You're literally paying for every failed build, every test branch, every "quick-fix" from 6 months ago.

Most teams only use the last 5-10 images. The rest? Dead weight costing you money.

Let me show you how to automatically clean up old images with Terraform and recover 70-90% of that storage cost.

💸 The ECR Storage Problem

ECR pricing is simple: $0.10/GB per month

But Docker images add up fast:

Typical Node.js app:

  • Base image: ~200MB
  • With dependencies: ~500MB
  • Builds per day: 10 (main + PR branches)
  • Days per year: 365
  • Total images/year: 3,650 images
  • Storage after one year: 1,825GB (3,650 × 500MB)
  • Annual cost at that size: $2,190 (1,825GB × $0.10 × 12) 💰

And that's just one application.

What you actually need:

  • Latest 10 production images: 5GB
  • Latest 5 staging images: 2.5GB
  • Total needed: 7.5GB
  • Should cost: $0.75/month

Measured against the 250GB example above, you're paying about 33x more than necessary.
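
The arithmetic above is easy to rerun with your own numbers. Here is a minimal sketch: the helper name `ecr_monthly_cost` is made up for illustration, and it assumes the flat $0.10/GB-month rate quoted above, 1GB = 1000MB (as the article's figures do), and no layer sharing between images, so treat the result as an upper bound.

```shell
#!/usr/bin/env bash
# Rough ECR storage cost estimate.
# Assumptions: flat $0.10 per GB-month, 1GB = 1000MB, no shared layers.

ecr_monthly_cost() {
  local image_count="$1" avg_mb="$2"
  # awk handles the floating-point math that plain shell arithmetic cannot
  awk -v n="$image_count" -v mb="$avg_mb" \
    'BEGIN { printf "%.2f\n", n * mb / 1000 * 0.10 }'
}

ecr_monthly_cost 500 500   # the 250GB example, prints 25.00
ecr_monthly_cost 15 500    # 10 prod + 5 staging images, prints 0.75
```

Swap in your own image count and average size to see where your registry sits between those two figures.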

🎯 The Solution: ECR Lifecycle Policies

ECR has built-in lifecycle policies that auto-delete images based on:

  • Image age - Delete images older than X days
  • Image count - Keep only last N images
  • Tag status - Different rules for tagged vs untagged

One Terraform resource. Set it once. Forget it forever.
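
Whatever tool submits it, a lifecycle policy is a JSON document whose rules combine those three selectors. As a sketch, the script below writes a one-rule policy that expires untagged images 14 days after push, and defines a helper that attaches it with the AWS CLI. The file name `policy.json`, the 14-day window, and the helper name `attach_policy` are placeholders chosen for illustration; `aws ecr put-lifecycle-policy` is the real CLI command, but the helper is only defined here, not run.

```shell
#!/usr/bin/env bash
# Write a minimal lifecycle policy: expire untagged images 14 days after push.
# policy.json is an illustrative file name.
cat > policy.json <<'EOF'
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Expire untagged images after 14 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 14
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF

# Attach the policy to a repository (requires AWS credentials).
# Usage: attach_policy my-app policy.json
attach_policy() {
  aws ecr put-lifecycle-policy \
    --repository-name "$1" \
    --lifecycle-policy-text "file://$2"
}
```

The Terraform examples that follow build the same structure with `jsonencode`, so the field names carry over one to one.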

🛠️ Terraform Implementation

Basic Lifecycle Policy (Keep Last 10 Images)

# ecr-lifecycle.tf

resource "aws_ecr_repository" "app" {
  name                 = "my-app"
  image_tag_mutability = "MUTABLE"

  image_scanning_configuration {
    scan_on_push = true
  }

  tags = {
    Name = "my-app"
  }
}

resource "aws_ecr_lifecycle_policy" "app" {
  repository = aws_ecr_repository.app.name

  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Keep last 10 images"
        selection = {
          tagStatus   = "any"
          countType   = "imageCountMoreThan"
          countNumber = 10
        }
        action = {
          type = "expire"
        }
      }
    ]
  })
}

Deploy it:

terraform apply

# ECR expires images beyond the 10 most recent, typically within 24 hours
# No manual intervention needed after that

Savings from this alone: 70-90% of storage costs! 🎉

Advanced: Multi-Environment Lifecycle Policy

# ecr-lifecycle-advanced.tf

resource "aws_ecr_lifecycle_policy" "multi_env" {
  repository = aws_ecr_repository.app.name

  policy = jsonencode({
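    # ECR treats multiple entries in tagPrefixList as AND (an image must carry
    # a tag for every listed prefix), so each prefix gets its own rule.
    rules = [
      # Rule 1: Keep last 20 production images tagged prod-*
      {
        rulePriority = 1
        description  = "Keep last 20 prod- images"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["prod-"]
          countType     = "imageCountMoreThan"
          countNumber   = 20
        }
        action = { type = "expire" }
      },
      # Rule 2: Keep last 20 release images tagged v*
      {
        rulePriority = 2
        description  = "Keep last 20 v* release images"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["v"]
          countType     = "imageCountMoreThan"
          countNumber   = 20
        }
        action = { type = "expire" }
      },
      # Rule 3: Keep last 10 staging images
      {
        rulePriority = 3
        description  = "Keep last 10 staging images"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["staging-"]
          countType     = "imageCountMoreThan"
          countNumber   = 10
        }
        action = { type = "expire" }
      },
      # Rule 4: Delete untagged images older than 7 days
      {
        rulePriority = 4
        description  = "Delete untagged images after 7 days"
        selection = {
          tagStatus   = "untagged"
          countType   = "sinceImagePushed"
          countUnit   = "days"
          countNumber = 7
        }
        action = { type = "expire" }
      },
      # Rule 5: Keep only last 5 dev images
      {
        rulePriority = 5
        description  = "Keep last 5 dev- images"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["dev-"]
          countType     = "imageCountMoreThan"
          countNumber   = 5
        }
        action = { type = "expire" }
      },
      # Rule 6: Keep only last 5 feature branch images
      {
        rulePriority = 6
        description  = "Keep last 5 feature- images"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["feature-"]
          countType     = "imageCountMoreThan"
          countNumber   = 5
        }
        action = { type = "expire" }
      },
      # Rule 7: Catch-all - delete any remaining images older than 30 days
      # (a tagStatus = "any" rule must have the highest rulePriority)
      {
        rulePriority = 7
        description  = "Delete any remaining images older than 30 days"
        selection = {
          tagStatus   = "any"
          countType   = "sinceImagePushed"
          countUnit   = "days"
          countNumber = 30
        }
        action = { type = "expire" }
      }
    ]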
  })
}

Production-Ready Module

# modules/ecr-with-lifecycle/main.tf

variable "repository_name" {
  description = "ECR repository name"
  type        = string
}

variable "prod_image_count" {
  description = "Number of production images to keep"
  type        = number
  default     = 20
}

variable "staging_image_count" {
  description = "Number of staging images to keep"
  type        = number
  default     = 10
}

variable "dev_image_count" {
  description = "Number of dev/feature images to keep"
  type        = number
  default     = 5
}

variable "untagged_days" {
  description = "Days to keep untagged images"
  type        = number
  default     = 7
}

resource "aws_ecr_repository" "this" {
  name                 = var.repository_name
  image_tag_mutability = "MUTABLE"

  encryption_configuration {
    encryption_type = "AES256"
  }

  image_scanning_configuration {
    scan_on_push = true
  }

  tags = {
    Name       = var.repository_name
    ManagedBy  = "terraform"
  }
}

locals {
  # ECR treats multiple entries in tagPrefixList as AND, so each prefix gets
  # its own rule. Retention counts therefore apply per prefix.
  prefix_retention = [
    { prefix = "prod-", keep = var.prod_image_count },
    { prefix = "release-", keep = var.prod_image_count },
    { prefix = "v", keep = var.prod_image_count },
    { prefix = "staging-", keep = var.staging_image_count },
    { prefix = "stage-", keep = var.staging_image_count },
    { prefix = "dev-", keep = var.dev_image_count },
    { prefix = "feature-", keep = var.dev_image_count },
  ]

  # One count-based rule per prefix, priorities 1..N in list order
  tagged_rules = [
    for i, r in local.prefix_retention : {
      rulePriority = i + 1
      description  = "Keep last ${r.keep} ${r.prefix}* images"
      selection = {
        tagStatus     = "tagged"
        tagPrefixList = [r.prefix]
        countType     = "imageCountMoreThan"
        countNumber   = r.keep
      }
      action = { type = "expire" }
    }
  ]

  # Age-based cleanup for untagged images, evaluated after the tagged rules
  untagged_rule = {
    rulePriority = length(local.prefix_retention) + 1
    description  = "Delete untagged images after ${var.untagged_days} days"
    selection = {
      tagStatus   = "untagged"
      countType   = "sinceImagePushed"
      countUnit   = "days"
      countNumber = var.untagged_days
    }
    action = { type = "expire" }
  }
}

resource "aws_ecr_lifecycle_policy" "this" {
  repository = aws_ecr_repository.this.name

  policy = jsonencode({
    rules = concat(local.tagged_rules, [local.untagged_rule])
  })
}

output "repository_url" {
  value = aws_ecr_repository.this.repository_url
}

output "repository_arn" {
  value = aws_ecr_repository.this.arn
}

Usage

# main.tf

module "api_ecr" {
  source = "./modules/ecr-with-lifecycle"

  repository_name     = "api-service"
  prod_image_count    = 30
  staging_image_count = 15
  dev_image_count     = 5
  untagged_days       = 7
}

module "web_ecr" {
  source = "./modules/ecr-with-lifecycle"

  repository_name     = "web-frontend"
  prod_image_count    = 20
  staging_image_count = 10
  dev_image_count     = 3
  untagged_days       = 3
}

output "api_url" {
  value = module.api_ecr.repository_url
}

📊 Before/After Comparison

Before Lifecycle Policies

Repository: api-service
Images: 847 total
  - prod-* : 234 images
  - staging-*: 198 images
  - feature-*: 312 images
  - untagged: 103 images

Total size: 423GB
Monthly cost: $42.30
Annual cost: $507.60

After Lifecycle Policies

Repository: api-service
Images: 45 total
  - prod-* : 20 images (kept last 20)
  - staging-*: 10 images (kept last 10)
  - feature-*: 15 images (kept last 5 per branch)
  - untagged: 0 images (deleted after 7 days)

Total size: 22.5GB
Monthly cost: $2.25
Annual cost: $27

Savings: $480.60/year (95% reduction!) 🎉

💡 Pro Tips

1. Test Lifecycle Policies First

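ECR can preview a policy before it expires anything:

# Start a preview run (uses the repository's current policy unless
# --lifecycle-policy-text is given), then fetch what WOULD be expired
aws ecr start-lifecycle-policy-preview \
  --repository-name api-service
aws ecr get-lifecycle-policy-preview \
  --repository-name api-service

The Terraform AWS provider has no resource for previews, so either run the preview from the CLI or start with generous retention: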

prod_image_count = 50  # Start high
dev_image_count  = 20  # Start high

# Gradually decrease over time

2. Tag Your Images Properly

Lifecycle policies work best with consistent tagging:

# In your CI/CD pipeline
docker tag myapp:latest $ECR_URL:prod-${GIT_SHA}
docker tag myapp:latest $ECR_URL:staging-${GIT_SHA}
docker tag myapp:latest $ECR_URL:feature-${BRANCH_NAME}-${GIT_SHA}
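
Branch names often contain characters that are not valid in an image tag (a `/` in `feature/login`, for example), which would break the `feature-` pattern above. A small sketch of one way to handle that in CI follows. `make_tag` is a hypothetical helper, not part of Docker or the AWS CLI; it assumes the tag layout `<env>-<branch>-<sha>` and relies on Docker's tag rules (letters, digits, `_`, `.` and `-`, at most 128 characters).

```shell
#!/usr/bin/env bash
# Build an image tag of the form <env>-<branch>-<sha> that Docker will accept.
# make_tag is a hypothetical helper for illustration.

make_tag() {
  local env="$1" branch="$2" sha="$3"
  local safe_branch
  # Replace every character that is not allowed in a tag with "-"
  safe_branch=$(printf '%s' "$branch" | tr -c 'A-Za-z0-9_.-' '-')
  # Tags are limited to 128 characters; truncating from the right keeps
  # the environment prefix that the lifecycle rules match on
  printf '%s-%s-%s\n' "$env" "$safe_branch" "$sha" | cut -c1-128
}

make_tag feature "login/oauth fix" abc1234   # prints feature-login-oauth-fix-abc1234
```

In a pipeline this could be used as `docker tag myapp:latest "$ECR_URL:$(make_tag feature "$BRANCH_NAME" "$GIT_SHA")"`. Note that a very long branch name would push the SHA past the 128-character cut, so keep branch names short if you depend on the SHA for uniqueness.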

3. Monitor Deletions

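ECR doesn't publish a CloudWatch metric for lifecycle expirations. Each expiration is recorded in CloudTrail as a PolicyExecutionEvent, so audit deletions there. The metric ECR does publish is RepositoryPullCount, which the alarm below uses to flag unusual pull volume (a usage signal, not a deletion count):

resource "aws_cloudwatch_metric_alarm" "ecr_pull_volume" {
  alarm_name          = "ecr-high-pull-count"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "RepositoryPullCount" # pulls, not deletions
  namespace           = "AWS/ECR"
  period              = 86400 # one day, in seconds
  statistic           = "Sum"
  threshold           = 100

  dimensions = {
    RepositoryName = "api-service"
  }
}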

4. Exclude Critical Images

If you need to keep specific images forever:

# Option 1: Use a tag prefix that no lifecycle rule matches
# (only works if the policy has no tagStatus = "any" rule, which matches every image)
docker tag myapp:latest $ECR_URL:keep-forever-v1.0.0

# Option 2: Create separate repository for long-term images
module "releases_ecr" {
  source = "./modules/ecr-with-lifecycle"

  repository_name  = "api-service-releases"
  prod_image_count = 100  # Keep many more
}

🎓 Common Tagging Strategies

Strategy 1: Environment-Based

prod-abc123
staging-abc123
dev-feature-xyz

Works well with lifecycle rules based on tag prefix.

Strategy 2: Semantic Versioning

v1.2.3
v1.2.4-rc1
latest

Good for release management, harder for lifecycle automation.

Strategy 3: Combined

prod-v1.2.3-abc123
staging-v1.2.3-abc123

Best of both worlds - clear environment + version.

⚠️ Gotchas to Watch Out For

1. Rule Priority Matters

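Rules are evaluated in ascending rulePriority order, and an image matched by one rule can't be expired by a later one. Give more specific rules lower priority numbers; a tagStatus = "any" rule must carry the highest number: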

rulePriority = 1  # Most specific (prod images)
rulePriority = 2  # Less specific (staging)
rulePriority = 5  # Catch-all (everything else)

2. Untagged Images Pile Up Fast

Every time CI re-pushes a mutable tag such as latest, the image that previously held the tag becomes untagged. Always have a cleanup rule:

{
  rulePriority = 99
  description  = "Cleanup untagged"
  selection = {
    tagStatus   = "untagged"
    countType   = "sinceImagePushed"
    countUnit   = "days"
    countNumber = 1  # Delete after 1 day
  }
  action = { type = "expire" }
}

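3. Images In Use Are Not Protected

Lifecycle policies don't check whether an image is referenced by a running ECS task or EKS pod. If a rule matches, the image expires, and the next task launch or node pull of that tag fails. Size your retention counts to cover every version you might still deploy or roll back to.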
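
Lifecycle rules look only at the repository, so it is worth cross-checking a preview against what is actually deployed before tightening retention. A minimal sketch, assuming you have already written the tags a policy would expire to one file and the tags your services currently run to another, one tag per line. Both file names, the sample tags, and the helper `in_use_and_expiring` are hypothetical, and how you collect the in-use list depends on your ECS or EKS setup.

```shell
#!/usr/bin/env bash
# Print every tag that is both scheduled to expire and still deployed.
# Both input files hold one image tag per line.

in_use_and_expiring() {
  local expiring_file="$1" in_use_file="$2"
  # -F fixed strings, -x whole-line match, -f read patterns from a file
  grep -Fxf "$in_use_file" "$expiring_file"
}

# Example data (made up for illustration)
printf '%s\n' prod-a1b2c3 prod-d4e5f6 staging-0a0a0a > expiring.txt
printf '%s\n' prod-d4e5f6 prod-ffffff > in_use.txt

in_use_and_expiring expiring.txt in_use.txt   # prints prod-d4e5f6
```

Any tag this prints is a deployment that would break after the next policy run, which is a signal to raise the relevant retention count first.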

🚀 Quick Start

# 1. Check current storage usage
aws ecr describe-repositories \
  --query 'repositories[*].[repositoryName]' \
  --output table

# For each repository, get image count
aws ecr list-images \
  --repository-name api-service \
  --query 'length(imageIds)'

# 2. Deploy lifecycle policy
terraform apply

# 3. Wait up to 24 hours for ECR to expire matching images

# 4. Verify reduction
aws ecr list-images \
  --repository-name api-service \
  --query 'length(imageIds)'

# 5. Check cost savings next month 💰
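
Image counts alone don't show how much storage a repository uses. As a rough sketch, the helpers below sum the reported image sizes for one repository and turn the byte count into an estimated monthly cost. The names `repo_bytes` and `bytes_to_cost` are made up for illustration; the estimate assumes $0.10 per GB-month and 1GB = 10^9 bytes, and because each image's reported size includes layers it may share with other images, the total tends to overestimate.

```shell
#!/usr/bin/env bash
# Convert a byte count into GB and an estimated monthly ECR cost.
# Assumes $0.10 per GB-month and 1GB = 10^9 bytes.

bytes_to_cost() {
  awk -v b="$1" 'BEGIN {
    gb = b / 1000000000
    printf "%.1fGB $%.2f/month\n", gb, gb * 0.10
  }'
}

# Sum image sizes for one repository (requires AWS credentials).
# JSON output is used so the query runs over all pages of results at once.
repo_bytes() {
  aws ecr describe-images \
    --repository-name "$1" \
    --query 'sum(imageDetails[].imageSizeInBytes)' \
    --output json
}

bytes_to_cost 423000000000   # prints 423.0GB $42.30/month
# With credentials configured:
# bytes_to_cost "$(repo_bytes api-service)"
```

Running it before step 2 and again after step 4 gives a before/after figure without waiting for the next bill.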

📈 Real-World Impact

Startup with 5 microservices:

Before:

  • 5 repositories
  • Average 600 images each = 3,000 total
  • Average 500MB per image = 1,500GB
  • Monthly cost: $150
  • Annual cost: $1,800

After (lifecycle policies):

  • 5 repositories
  • Average 30 images each = 150 total
  • Total: 75GB
  • Monthly cost: $7.50
  • Annual cost: $90

Savings: $1,710/year (95% reduction!)

Implementation time: 30 minutes

Ongoing maintenance: Zero

🎯 Summary

The Problem:

  • ECR charges $0.10/GB per month
  • CI/CD pushes hundreds/thousands of images
  • Old images never get deleted
  • Storage costs grow indefinitely

The Solution:

  • ECR lifecycle policies (built-in feature)
  • Auto-delete based on age or count
  • Different rules per environment
  • Set once, runs forever

The Result:

  • Typical savings: 70-95% of ECR costs
  • Keep only what you need (last 5-20 images)
  • Zero ongoing maintenance
  • One Terraform resource

Stop paying for every Docker image you've ever built. Set lifecycle policies today and reclaim your storage. 🚀


Implemented ECR lifecycle policies? How many images did you delete? Share in the comments! 💬

Follow for more AWS cost optimization with Terraform! ⚡
