You push one line of code.
CodeBuild kicks off. You grab a coffee. Come back. It's still building.
Five minutes later, the same five minutes it took yesterday, and the day before, your Docker image finally finishes. And the frustrating part? You only changed a comment in one file.
I've been there. It drove me crazy until I figured out the fix.
In this tutorial, I'll show you how to use Amazon ECR as a persistent Docker layer cache so CodeBuild only rebuilds what actually changed. One afternoon of setup. 3–4 minutes shaved off every single build after that.
Let's fix this thing.
First: why is my Docker build so slow in CI?
Before we fix it, let me quickly explain why this happens. Because once you understand it, the solution makes total sense.
Think of Docker layers like pancakes
Each instruction in your Dockerfile adds a new "layer", like stacking pancakes. FROM, RUN, COPY: each one puts another layer on top.
# Layer 1
FROM node:20-alpine
# Layer 2 (Alpine images use apk, not apt-get)
RUN apk add --no-cache ...
# Layer 3
COPY package.json ./
# Layer 4
RUN npm install
# Layer 5
COPY . .
# Layer 6
RUN npm run build
On your laptop, Docker is smart about this. If a layer hasn't changed since last time, it reuses the saved version. No need to reinstall 300 npm packages if package.json didn't change. This is called layer caching and it makes local builds super fast.
New to Docker? Layer caching is basically Docker saying "I've done this step before and nothing changed, so I'll skip it." The problem is, in CI, Docker has no memory of what it did last time.
The CI problem: every build starts from zero
Here's what happens every time you push code to AWS CodeBuild:
CodeBuild spins up a brand new, isolated environment for each build. It's like a fresh computer that has never heard of your project. No history. No memory. No cached layers.
So even if you changed one line of code, CodeBuild:
- Re-downloads your base image from Docker Hub
- Reinstalls all your dependencies
- Runs every single build step
- Does it all again on the next push
CodeBuild does have a built-in local caching option, but it only helps if your next build happens to land on the same build host. That's not something you can count on, especially if your team doesn't push code constantly.
The result? Every build takes 5–6 minutes. Every. Single. Time.
The fix: use Amazon ECR as a persistent cache
Here's the idea in plain English:
Instead of throwing away all those built layers after each run, we save them to Amazon ECR in a separate "cache repository". Then, the next time CodeBuild runs, it pulls those saved layers down first and only rebuilds the parts that actually changed.
Wait, what is Amazon ECR?
Amazon ECR (Elastic Container Registry) is basically AWS's version of Docker Hub, a private place to store your Docker images. If you're already using CodeBuild to build and deploy containers, there's a good chance you're already pushing your app image to ECR.
We're just going to create a second ECR repository specifically to hold our build cache. That's the whole trick.
How it works, step by step
First build (cold start):
- No cache exists yet
- Build everything from scratch (same as before)
- After building, push all the layers to your ECR cache repo
- Time: ~5–6 min (same as before, we're paying the full cost once)
Every build after that:
- Pull the cache from ECR before starting
- Docker finds matching layers → skips them entirely
- Only rebuilds what actually changed (usually just the last 1–2 layers)
- Push the updated cache back to ECR
- Time: ~1.5–2 min
Real teams have reported build times going from 6 minutes down to 2 minutes. That's over 60% faster after the first run.
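To put a number on that claim, here is a small sketch of the arithmetic using the 6-minute and 2-minute figures quoted above. The helper name pct_saved is made up for illustration, and it uses integer math, so the result rounds down.

```shell
#!/usr/bin/env bash
# Percentage of build time saved, given before/after durations in seconds.
# pct_saved is a hypothetical helper; integer division rounds the result down.
pct_saved() {
  local before="$1" after="$2"
  echo $(( (before - after) * 100 / before ))
}

# 6 minutes (360 s) down to 2 minutes (120 s)
echo "saved: $(pct_saved 360 120)%"
```

Swap in your own before and after timings from the CodeBuild console to see what the cache is worth for your project.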
What you need before starting
You don't need to be an AWS expert for this. Here's the checklist:
- An AWS account
- An existing CodeBuild project (or you're setting one up)
- A Dockerfile in your repo (even a basic one)
- AWS CLI installed (optional, but handy for the setup commands)
- Basic familiarity with buildspec.yml (knowing it exists is enough)
You do NOT need:
- Deep Docker or Kubernetes knowledge
- ECS or Fargate experience
- A paid Docker subscription
Okay, let's build this.
Step-by-step setup
Step 1: Create a dedicated ECR cache repository
You need two ECR repos:
- One for your app image (you might already have this)
- One just for the cache (this is new)
Why separate? Keeping the cache in its own repo makes it easy to set cleanup rules on it later without touching your actual app images.
# Create your app image repo (skip if you already have one)
aws ecr create-repository --repository-name my-app --region us-east-1
# Create a SEPARATE repo just for the cache
aws ecr create-repository --repository-name my-app-cache --region us-east-1
Naming tip: Give it a -cache suffix (like my-app-cache) so you always know what it's for when you're browsing ECR six months from now.
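If you run the setup more than once (say, from a bootstrap script), create-repository returns an error when the repo already exists. Here is a minimal sketch of an idempotent version, assuming the AWS CLI is installed and configured. ensure_repo is a hypothetical helper name, not an AWS command.

```shell
#!/usr/bin/env bash
# Create an ECR repository only if it does not exist yet.
# ensure_repo is a hypothetical helper. Note that describe-repositories also
# fails when credentials are missing; in that case the create call fails too.
ensure_repo() {
  local name="$1" region="$2"
  if aws ecr describe-repositories --repository-names "$name" --region "$region" >/dev/null 2>&1; then
    echo "exists: $name"
  else
    aws ecr create-repository --repository-name "$name" --region "$region" >/dev/null \
      && echo "created: $name"
  fi
}

# Usage (needs AWS credentials):
#   ensure_repo my-app us-east-1
#   ensure_repo my-app-cache us-east-1
```

The block only defines the function, so nothing touches your account until you call it.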
Step 2: Give CodeBuild permission to use ECR
CodeBuild needs permission to both read from and write to your ECR repos. You do this by updating the IAM policy attached to your CodeBuild service role.
Go to IAM → Roles → your CodeBuild service role → Add permissions and paste this policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ],
      "Resource": "*"
    }
  ]
}
Heads up: If you get "access denied" errors later, this is the #1 cause. Double-check that ecr:GetAuthorizationToken is in the list. It's easy to miss, and CodeBuild needs it to authenticate.
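The policy above uses "Resource": "*" to keep things simple. If you want to tighten it, ecr:GetAuthorizationToken still has to stay on "*", but the other actions can be limited to your two repositories. This small sketch prints the ARNs you would list under Resource. The account ID is the placeholder used in this tutorial, and ecr_repo_arn is a made-up helper.

```shell
#!/usr/bin/env bash
# Build the ARN of an ECR repository from its parts.
# ecr_repo_arn is a hypothetical helper; the account ID below is a placeholder.
ecr_repo_arn() {
  local region="$1" account_id="$2" repo="$3"
  echo "arn:aws:ecr:${region}:${account_id}:repository/${repo}"
}

# Print one ARN per repository used in this tutorial
for repo in my-app my-app-cache; do
  ecr_repo_arn us-east-1 123456789012 "$repo"
done
```

You would then split the policy into two statements: one for ecr:GetAuthorizationToken on "*", and one for the remaining actions on these ARNs.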
Step 3: Enable Docker BuildKit
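Docker BuildKit is the modern Docker build engine. It's what makes the --cache-from and --cache-to flags work with a registry, and those two flags are the whole heart of this setup.
You enable it for plain docker build with a single environment variable in your buildspec.yml (docker buildx build, which we use in Step 4, always runs on BuildKit, so the variable does no harm there):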
env:
  variables:
    DOCKER_BUILDKIT: "1"
    AWS_ACCOUNT_ID: "123456789012" # ← replace with yours
    AWS_DEFAULT_REGION: "us-east-1"
    IMAGE_REPO_NAME: "my-app"
    CACHE_REPO_NAME: "my-app-cache"
Don't want to hardcode your account ID? Leave that variable out and set it from a pre_build command instead, which fetches it dynamically:
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
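Another option, sketched here as an assumption rather than something this setup requires: inside a build, CodeBuild sets CODEBUILD_BUILD_ARN, and the account ID is the fifth colon-separated field of that ARN, so you can read it without an extra API call. The sample ARN and the helper name are invented for the example.

```shell
#!/usr/bin/env bash
# Extract the account ID (5th colon-separated field) from an ARN.
# account_id_from_arn is a hypothetical helper.
account_id_from_arn() {
  echo "$1" | cut -d: -f5
}

# Invented sample value, used only when not running inside CodeBuild
sample_arn="arn:aws:codebuild:us-east-1:123456789012:build/my-app:11111111-2222-3333-4444-555555555555"

AWS_ACCOUNT_ID="$(account_id_from_arn "${CODEBUILD_BUILD_ARN:-$sample_arn}")"
echo "$AWS_ACCOUNT_ID"
```

Either approach works; the sts call has the advantage of also working on your laptop.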
Step 4: Update your buildspec.yml
This is the main event. Here's the full buildspec.yml with every important line explained:
version: 0.2
env:
  variables:
    DOCKER_BUILDKIT: "1"
    AWS_DEFAULT_REGION: "us-east-1"
    IMAGE_REPO_NAME: "my-app"
    CACHE_REPO_NAME: "my-app-cache"
phases:
  pre_build:
    commands:
      # Look up the account ID (or set AWS_ACCOUNT_ID under env.variables as in Step 3)
      - AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
      # Log in to ECR so we can pull and push images
      - echo Logging in to Amazon ECR...
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      # Build the full image URIs we'll use below
      - REPO_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
      - CACHE_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$CACHE_REPO_NAME
      # Grab the short commit hash to tag the image with
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      # The default "docker" driver usually can't export a registry cache,
      # so switch to a docker-container builder (the builder name is arbitrary)
      - docker buildx create --use --name cache-builder --driver docker-container
  build:
    commands:
      - echo Building Docker image...
      # image-manifest and oci-mediatypes store the cache in a format ECR accepts
      # (this assumes a reasonably recent BuildKit in your build image)
      - |
        docker buildx build \
          --cache-from type=registry,ref=$CACHE_URI:cache \
          --cache-to type=registry,ref=$CACHE_URI:cache,mode=max,image-manifest=true,oci-mediatypes=true \
          --tag $REPO_URI:latest \
          --tag $REPO_URI:$COMMIT_HASH \
          --push \
          .
  post_build:
    commands:
      - echo Build complete!
      - echo Image pushed to $REPO_URI:$COMMIT_HASH
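If a build fails at the login or push step, it often helps to check what the URI lines in pre_build actually expand to. Here is a rough sketch you can run locally with no AWS access. The account ID and commit hash are placeholders, and ecr_uri is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Assemble an ECR image URI the same way the buildspec does.
# ecr_uri is a hypothetical helper; all values below are placeholders.
ecr_uri() {
  local account_id="$1" region="$2" repo="$3"
  echo "${account_id}.dkr.ecr.${region}.amazonaws.com/${repo}"
}

# Stand-in for CODEBUILD_RESOLVED_SOURCE_VERSION (a full 40-character commit hash)
full_hash="0123456789abcdef0123456789abcdef01234567"
COMMIT_HASH="$(echo "$full_hash" | cut -c 1-7)"

REPO_URI="$(ecr_uri 123456789012 us-east-1 my-app)"
CACHE_URI="$(ecr_uri 123456789012 us-east-1 my-app-cache)"

echo "image: ${REPO_URI}:${COMMIT_HASH}"
echo "cache: ${CACHE_URI}:cache"
```

If the printed URIs don't match what you see in the ECR console, the mismatch is usually a typo in the region or a repo name.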
Step 5: Add a lifecycle policy so ECR doesn't rack up costs
Here's the thing nobody mentions: cache images will keep accumulating in ECR unless you tell AWS to clean them up. Without a lifecycle policy, you could end up paying for gigabytes of stale cache data.
Create a file called lifecycle-policy.json:
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Delete cache images older than 14 days",
      "selection": {
        "tagStatus": "any",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 14
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}
Apply it to your cache repo:
aws ecr put-lifecycle-policy \
--repository-name my-app-cache \
--lifecycle-policy-text file://lifecycle-policy.json
What does this cost? ECR charges about $0.10 per GB per month. A typical cache image is 200–500 MB. So even with two weeks of cached builds kept around (the window the lifecycle policy above allows), you're looking at under $1/month for ECR storage in most setups. The savings on CodeBuild compute time will be way larger.
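Here is the back-of-the-envelope math as a sketch you can adapt. It assumes the $0.10 per GB-month rate quoted above, treats 1 GB as 1024 MB, and pretends stored versions share no layers, so it gives an upper bound. monthly_cost_usd is a made-up helper name.

```shell
#!/usr/bin/env bash
# Rough ECR storage cost: monthly_cost_usd <size_mb> <versions_kept> [usd_per_gb_month]
# Hypothetical helper; assumes 1 GB = 1024 MB and no layer sharing between versions.
monthly_cost_usd() {
  awk -v mb="$1" -v n="$2" -v rate="${3:-0.10}" \
    'BEGIN { printf "%.2f\n", mb * n / 1024 * rate }'
}

# One 500 MB cache image
monthly_cost_usd 500 1
# Fourteen 500 MB versions, one per day of the 14-day lifecycle window
monthly_cost_usd 500 14
```

In practice ECR stores shared layers once per repository, so the real bill is usually lower than this estimate.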
The secret weapon: write your Dockerfile in the right order
ECR caching helps a lot. But there's one Dockerfile trick that makes it work dramatically better and it takes 30 seconds to apply.
The golden rule: put things that change least at the TOP.
Look at these two Dockerfiles:
# BAD: copies source code BEFORE installing dependencies
FROM node:20-alpine
WORKDIR /app
# ↓ every code change hits here first
COPY . .
# ↓ so this re-runs EVERY time. Ouch.
RUN npm install
CMD ["node", "dist/index.js"]
# GOOD: installs dependencies BEFORE copying source code
FROM node:20-alpine
WORKDIR /app
# ↓ only changes when deps change
COPY package.json package-lock.json ./
# ↓ cached most of the time 🎉
RUN npm install
# ↓ your code goes here, after deps
COPY . .
RUN npm run build
CMD ["node", "dist/index.js"]
In the bad version, COPY . . comes first, so any code change (which is every commit) busts the cache for everything below it, including npm install. You're downloading all your packages from scratch on every single push.
In the good version, npm install runs before the full source copy (COPY . .). So unless you add or remove a dependency, that layer stays cached. Only the "COPY source code" step and below get rebuilt.
This one change alone can cut your build time almost as much as adding ECR caching. Do both for maximum speed.
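Why does the order matter so much? Roughly speaking, Docker decides whether a COPY layer can be reused by checksumming the files that instruction copies. The sketch below is a simplified model of that idea in plain shell, not Docker's actual algorithm: it checksums only package.json, edits a source file, and shows the checksum (and so the dependency layer) is unaffected. The file names and contents are made up.

```shell
#!/usr/bin/env bash
# Simplified model of layer cache keys: a key changes only when its inputs change.
workdir="$(mktemp -d)"
echo '{"name":"my-app","dependencies":{}}' > "$workdir/package.json"
echo 'console.log("v1")' > "$workdir/index.js"

# Stand-in for the cache key of "COPY package.json ./"
deps_key() { cksum < "$workdir/package.json"; }

before="$(deps_key)"

# A code-only change does not touch package.json
echo 'console.log("v2")' > "$workdir/index.js"
after_code_change="$(deps_key)"

# A dependency change does
echo '{"name":"my-app","dependencies":{"left-pad":"1.3.0"}}' > "$workdir/package.json"
after_deps_change="$(deps_key)"

[ "$before" = "$after_code_change" ] && echo "code change: dependency layer reused"
[ "$before" != "$after_deps_change" ] && echo "deps change: dependency layer rebuilt"

rm -rf "$workdir"
```

With COPY . . first, the equivalent key would cover every file in the repo, so any commit would change it.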
Things that might go wrong (and how to fix them)
"No cache found" on the first build
This is completely normal! There's nothing in ECR yet. Run the build once, it'll populate the cache. Every build after that will be faster.
"Access Denied" errors from ECR
Almost always an IAM issue. Go back to Step 2 and make sure your CodeBuild service role has ecr:GetAuthorizationToken. That one is easy to miss, and without it the docker login step in pre_build fails before the build even starts.
"Cache exists but builds aren't getting faster"
Check three things:
- Is DOCKER_BUILDKIT=1 set in your environment variables?
- Does --cache-from point to the exact same URI as --cache-to?
- Is your Dockerfile in the good order from the "write your Dockerfile in the right order" section? (This is the most common culprit.)
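One way to check whether the cache is actually being hit is to count the steps BuildKit reports as CACHED in the build log. Here is a minimal sketch, assuming the build used plain progress output (--progress=plain, which is typically what you get in CI) and the log was saved to a file. The sample log is hand-written for illustration, and count_cached is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Count build steps that BuildKit served from cache, based on plain-progress log lines.
# count_cached is a hypothetical helper.
count_cached() {
  grep -c -E '^#[0-9]+ CACHED' "$1"
}

# Hand-written sample log standing in for real build output
log="$(mktemp)"
cat > "$log" <<'EOF'
#5 [2/6] WORKDIR /app
#5 CACHED
#6 [3/6] COPY package.json package-lock.json ./
#6 CACHED
#7 [4/6] RUN npm install
#7 CACHED
#8 [5/6] COPY . .
#8 DONE 0.1s
#9 [6/6] RUN npm run build
#9 DONE 12.4s
EOF

echo "cached steps: $(count_cached "$log")"
```

If the count is zero on a second build with no dependency changes, work through the three checks above.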
"My ECR cache repo is getting huge"
Go to Step 5 and apply the lifecycle policy if you haven't already. You can also switch from mode=max to mode=min. That caches only the layers of the final image instead of all intermediate stages, which uses less storage (but gives you slightly fewer cache hits).
You did it!
Let's recap what you built:
- A separate ECR repo to store your Docker layer cache
- IAM permissions so CodeBuild can read and write that cache
- A buildspec.yml that pulls cache before building and pushes it after
- A lifecycle policy so the cache auto-cleans itself
- A properly ordered Dockerfile that maximizes cache hits
Your CI/CD pipeline now has a memory. Every build learns from the last one. And instead of waiting 6 minutes every time you push code, you're waiting 90 seconds.
Drop your numbers in the comments!
I'd genuinely love to know how much time this saves for your team. Before and after build times, what stack you're using, anything.
And if something breaks or doesn't work the way I described, drop a comment and let's debug it together. I check comments on all my posts.
If this helped you, consider sharing it with someone on your team who's been staring at a slow CI pipeline. It might just save their afternoon.


