Anthony Uketui

End-to-End CI/CD on AWS: From Bitbucket to ECS in 10 Steps

TL;DR: A complete, step-by-step guide to building a production-grade CI/CD pipeline on AWS — from source code in Bitbucket through CodeBuild and CodePipeline to a running ECS Fargate service. I built this for a fintech platform running 50+ microservices. Here's everything I learned.


What We're Building

[Bitbucket Repo]
       │
       ▼ (CodeStar Connection)
[AWS CodePipeline]
       │
       ├── Stage 1: Source
       │     └── Fetch code, store as ZIP in S3
       │
       ├── Stage 2: Build (CodeBuild)
       │     ├── Maven build → JAR
       │     ├── Docker build → Image
       │     ├── ECR push → Registry
       │     └── Generate imagedefinitions.json
       │
       └── Stage 3: Deploy (ECS)
             └── Rolling update → Zero downtime

Stack: Java/Spring Boot, Maven, Docker, AWS (CodePipeline, CodeBuild, ECR, ECS Fargate, S3, IAM)


Step 1: Create ECR Repository

ECR stores your Docker images. Every service gets its own repository.

aws ecr create-repository \
  --repository-name my-service \
  --region <YOUR_REGION> \
  --image-scanning-configuration scanOnPush=true

Save the repositoryUri — you'll need it throughout.

Pro tip: Enable scanOnPush=true to scan images for known CVEs automatically on every push. ECR basic scanning costs nothing extra.
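
The repositoryUri is printed in the create-repository output, but it can also be looked up later. A minimal sketch, assuming the AWS CLI is configured with credentials that can read ECR. The helper name ecr_repo_uri is illustrative and not part of any AWS tooling:

```shell
# Hypothetical helper: print the URI of an existing ECR repository.
# Arguments: <repository-name> <region>
ecr_repo_uri() {
  aws ecr describe-repositories \
    --repository-names "$1" \
    --region "$2" \
    --query 'repositories[0].repositoryUri' \
    --output text
}

# Example (needs AWS credentials):
#   REPO_URI=$(ecr_repo_uri my-service <YOUR_REGION>)
```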


Step 2: Create S3 Artifact Bucket

CodePipeline needs a bucket to store artifacts between stages.

aws s3 mb s3://codepipeline-<YOUR_REGION>-<YOUR_ACCOUNT_ID> --region <YOUR_REGION>

New buckets block all public access by default. Leave it that way: this bucket should never be public.
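
If the bucket predates that default, or someone has relaxed the setting, the block can be applied explicitly. A sketch assuming the bucket name used above; the function name block_public_access is illustrative:

```shell
# Hypothetical helper: turn on all four public-access blocks for one bucket.
# Argument: <bucket-name>
block_public_access() {
  aws s3api put-public-access-block \
    --bucket "$1" \
    --public-access-block-configuration \
      BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
}

# Example (needs AWS credentials):
#   block_public_access codepipeline-<YOUR_REGION>-<YOUR_ACCOUNT_ID>
```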


Step 3: Create IAM Roles

You need three roles. This is where most people get stuck.

3.1: CodePipeline Service Role

Trust policy:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "codepipeline.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}

Permissions: S3 (read/write artifacts), CodeBuild (start builds), ECS (deploy), CodeStar Connections (Bitbucket access).
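
Those four permission groups could be written as a scoped policy along these lines. This is a starting point rather than a complete policy: the angle-bracket placeholders, the project name my-service-build, and the role name ecsTaskExecutionRole are assumptions. The ECS statement is left at * because not every ECS action accepts a resource ARN, so narrow it where IAM allows. Connections created more recently may carry a codeconnections ARN and action prefix instead, so check the ARN shown in your console.

```shell
# Write a scoped permissions policy for the CodePipeline service role.
cat > codepipeline-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ArtifactBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject"],
      "Resource": "arn:aws:s3:::codepipeline-<YOUR_REGION>-<YOUR_ACCOUNT_ID>/*"
    },
    {
      "Sid": "BitbucketConnection",
      "Effect": "Allow",
      "Action": "codestar-connections:UseConnection",
      "Resource": "arn:aws:codestar-connections:<YOUR_REGION>:<YOUR_ACCOUNT_ID>:connection/<CONNECTION_ID>"
    },
    {
      "Sid": "StartBuilds",
      "Effect": "Allow",
      "Action": ["codebuild:StartBuild", "codebuild:BatchGetBuilds"],
      "Resource": "arn:aws:codebuild:<YOUR_REGION>:<YOUR_ACCOUNT_ID>:project/my-service-build"
    },
    {
      "Sid": "DeployToEcs",
      "Effect": "Allow",
      "Action": [
        "ecs:DescribeServices", "ecs:DescribeTaskDefinition", "ecs:DescribeTasks",
        "ecs:ListTasks", "ecs:RegisterTaskDefinition", "ecs:UpdateService"
      ],
      "Resource": "*"
    },
    {
      "Sid": "PassExecutionRoleToEcs",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::<YOUR_ACCOUNT_ID>:role/ecsTaskExecutionRole",
      "Condition": {"StringEquals": {"iam:PassedToService": "ecs-tasks.amazonaws.com"}}
    }
  ]
}
EOF

# Attach it as an inline policy (role name is a placeholder):
#   aws iam put-role-policy --role-name <CODEPIPELINE_ROLE> \
#     --policy-name codepipeline-scoped \
#     --policy-document file://codepipeline-policy.json
```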

3.2: CodeBuild Service Role

Trust policy:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "codebuild.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}

Permissions: ECR (push images), S3 (read/write artifacts), CloudWatch Logs (build logs), Secrets Manager (if using SonarQube or other secrets).
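
A possible shape for the CodeBuild permissions, scoped to one repository and one artifact bucket. The placeholders are assumptions, and Secrets Manager access is left out because it depends on which secrets the build reads. ecr:GetAuthorizationToken is the one ECR action that has to stay on *, since it is not tied to a repository.

```shell
# Write a scoped permissions policy for the CodeBuild service role.
cat > codebuild-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EcrLogin",
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    },
    {
      "Sid": "EcrPush",
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability", "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart", "ecr:CompleteLayerUpload", "ecr:PutImage"
      ],
      "Resource": "arn:aws:ecr:<YOUR_REGION>:<YOUR_ACCOUNT_ID>:repository/my-service"
    },
    {
      "Sid": "BuildLogs",
      "Effect": "Allow",
      "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:<YOUR_REGION>:<YOUR_ACCOUNT_ID>:log-group:/aws/codebuild/*"
    },
    {
      "Sid": "PipelineArtifacts",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject"],
      "Resource": "arn:aws:s3:::codepipeline-<YOUR_REGION>-<YOUR_ACCOUNT_ID>/*"
    }
  ]
}
EOF

# Attach it as an inline policy (role name is a placeholder):
#   aws iam put-role-policy --role-name <CODEBUILD_ROLE> \
#     --policy-name codebuild-scoped \
#     --policy-document file://codebuild-policy.json
```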

3.3: ECS Task Execution Role

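ECS uses this role to pull images and write container logs. Its trust policy follows the same pattern as the two above, with ecs-tasks.amazonaws.com as the service principal.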

Permissions: ECR (pull images), CloudWatch Logs (write container logs).

Critical lesson: Follow least-privilege. Don't use * resources in production IAM policies. Scope to specific ARNs.
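
One way to create the task execution role from the CLI is sketched below. The helper name and the role name in the example are assumptions. The AWS managed policy AmazonECSTaskExecutionRolePolicy is a convenient starting point, but it is not scoped to specific ARNs, so a custom policy fits the least-privilege advice better once the setup works.

```shell
# Trust policy: lets ECS tasks assume the role.
cat > ecs-tasks-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "ecs-tasks.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}
EOF

# Hypothetical helper: create the role, then attach the AWS managed policy
# that covers ECR pulls and CloudWatch Logs writes.
# Argument: <role-name>
create_execution_role() {
  aws iam create-role \
    --role-name "$1" \
    --assume-role-policy-document file://ecs-tasks-trust.json &&
  aws iam attach-role-policy \
    --role-name "$1" \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
}

# Example (needs AWS credentials):
#   create_execution_role ecsTaskExecutionRole
```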


Step 4: Create the Buildspec

The buildspec.yml lives in your repo root and tells CodeBuild what to do:

version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      # Registry host and full image URI, reused by the later phases
      - ECR_REGISTRY=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - IMAGE_URI=$ECR_REGISTRY/$IMAGE_REPO_NAME:$IMAGE_TAG
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $ECR_REGISTRY

  build:
    commands:
      - echo Build started on `date`
      # Build and tag with the full ECR URI in one step
      - docker build -t $IMAGE_URI .

  post_build:
    commands:
      - echo Build completed on `date`
      - docker push $IMAGE_URI
      # "name" must match the container name in the ECS task definition (Step 6)
      - printf '[{"name":"my-container","imageUri":"%s"}]' $IMAGE_URI > imagedefinitions.json

artifacts:
  files:
    - imagedefinitions.json

Key environment variables (set in CodeBuild project):

  • AWS_ACCOUNT_ID: Your 12-digit account number
  • AWS_DEFAULT_REGION: Target region
  • IMAGE_REPO_NAME: Must match ECR repository name exactly
  • IMAGE_TAG: Usually latest for staging, commit SHA for production
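
For the commit-SHA case, the tag can be derived inside the build instead of being set by hand. A small sketch: CodeBuild exposes the commit ID as CODEBUILD_RESOLVED_SOURCE_VERSION, but that variable may be empty in some pipeline configurations, so the helper falls back to latest. The function name and the 7-character length are my own choices.

```shell
# Hypothetical helper: derive an image tag from a commit ID.
# Prints the first 7 characters, or "latest" when the input is empty.
resolve_image_tag() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "$1" | cut -c1-7
  else
    echo latest
  fi
}

# In a build command:
#   IMAGE_TAG=$(resolve_image_tag "$CODEBUILD_RESOLVED_SOURCE_VERSION")
```

The same logic can be inlined as a single pre_build command if you would rather not define a function in the buildspec.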

Step 5: Write the Dockerfile

For Java/Spring Boot services, use a multi-stage build:

# Stage 1: Build
FROM maven:3.9-amazoncorretto-17 AS build
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn clean package -DskipTests

# Stage 2: Runtime
FROM amazoncorretto:17-alpine
WORKDIR /app
COPY --from=build /app/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

Critical lessons from production:

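  1. Java version must match pom.xml. Spring Boot 3.x requires Java 17 or later, so a Java 8 base image will not work: the build breaks at compile time, or the container exits at startup with an UnsupportedClassVersionError.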

  2. Multi-module builds need the root pom. If your service depends on internal modules, copy ALL module directories and build from root.

  3. Pin your base images. Use amazoncorretto:17-alpine, not amazoncorretto:latest. Unpinned tags cause "works on my machine" bugs.

  4. Platform matters. An image built on an Apple Silicon Mac (M1/M2) is ARM64 by default. Fargate runs tasks on x86_64 unless the task definition requests ARM64, so the container exits with an "exec format error". Pass --platform linux/amd64 to docker build when building locally.
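
For lesson 4, a local build that targets x86_64 might look like this. Both helper names are illustrative, and the sketch assumes a Docker installation with BuildKit, which handles the --platform flag:

```shell
# Hypothetical helper: build an x86_64 image regardless of the host CPU.
# Arguments: <image-name> [build-context, defaults to .]
build_amd64() {
  docker build --platform linux/amd64 -t "$1" "${2:-.}"
}

# Hypothetical helper: print the architecture recorded in a local image.
# Argument: <image-name>
image_arch() {
  docker image inspect --format '{{.Architecture}}' "$1"
}

# Example (needs Docker):
#   build_amd64 my-service:dev
#   image_arch my-service:dev
```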


Step 6: Create ECS Task Definition

aws ecs register-task-definition \
  --family my-service-task \
  --network-mode awsvpc \
  --requires-compatibilities FARGATE \
  --cpu 512 \
  --memory 1024 \
  --execution-role-arn arn:aws:iam::<ACCOUNT_ID>:role/ecsTaskExecutionRole \
  --container-definitions '[{
    "name": "my-container",
    "image": "<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/my-service:latest",
    "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
    "essential": true,
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/my-service",
        "awslogs-region": "<REGION>",
        "awslogs-stream-prefix": "ecs"
      }
    }
  }]'

Step 7: Create ECS Service

aws ecs create-service \
  --cluster my-cluster \
  --service-name my-service \
  --task-definition my-service-task \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[<SUBNET_1>,<SUBNET_2>],securityGroups=[<SG_ID>],assignPublicIp=ENABLED}"

Step 8: Create CodeBuild Project

Configure in the Console or CLI:

  • Source: Bitbucket (via CodeStar Connection)
  • Environment: Amazon Linux 2023, Standard image, privileged mode enabled (required to run Docker commands inside the build)
  • Buildspec: Use the buildspec.yml from the repo
  • Service role: The CodeBuild role from Step 3
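
For the CLI route, the settings above could be captured in a JSON file like the one below. Several things here are assumptions: the project name my-service-build, the compute type, the role placeholder, and the image identifier, which is what I understand the Amazon Linux 2023 standard image to be called (check the current image list). It also assumes the project only ever runs from CodePipeline, which hands the Bitbucket source over as an artifact, hence the CODEPIPELINE source type.

```shell
# Write a CodeBuild project definition for use with --cli-input-json.
cat > codebuild-project.json <<'EOF'
{
  "name": "my-service-build",
  "source": {"type": "CODEPIPELINE", "buildspec": "buildspec.yml"},
  "artifacts": {"type": "CODEPIPELINE"},
  "environment": {
    "type": "LINUX_CONTAINER",
    "image": "aws/codebuild/amazonlinux-x86_64-standard:5.0",
    "computeType": "BUILD_GENERAL1_SMALL",
    "privilegedMode": true,
    "environmentVariables": [
      {"name": "AWS_ACCOUNT_ID", "value": "<YOUR_ACCOUNT_ID>", "type": "PLAINTEXT"},
      {"name": "AWS_DEFAULT_REGION", "value": "<YOUR_REGION>", "type": "PLAINTEXT"},
      {"name": "IMAGE_REPO_NAME", "value": "my-service", "type": "PLAINTEXT"},
      {"name": "IMAGE_TAG", "value": "latest", "type": "PLAINTEXT"}
    ]
  },
  "serviceRole": "arn:aws:iam::<YOUR_ACCOUNT_ID>:role/<CODEBUILD_ROLE>"
}
EOF

# Create the project (needs AWS credentials):
#   aws codebuild create-project --cli-input-json file://codebuild-project.json
```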

Step 9: Create CodePipeline

Wire everything together:

  1. Source stage: Bitbucket → CodeStar Connection → triggers on push to main
  2. Build stage: CodeBuild project from Step 8
  3. Deploy stage: Amazon ECS → select your cluster and service
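
The three stages can also be declared as a pipeline definition and created from the CLI. In this sketch the pipeline, action, and artifact names, the project name my-service-build, and every angle-bracket placeholder are assumptions to replace with your own values:

```shell
# Write a three-stage pipeline definition for use with --cli-input-json.
cat > pipeline.json <<'EOF'
{
  "pipeline": {
    "name": "my-service-pipeline",
    "roleArn": "arn:aws:iam::<YOUR_ACCOUNT_ID>:role/<CODEPIPELINE_ROLE>",
    "artifactStore": {
      "type": "S3",
      "location": "codepipeline-<YOUR_REGION>-<YOUR_ACCOUNT_ID>"
    },
    "stages": [
      {
        "name": "Source",
        "actions": [{
          "name": "Bitbucket",
          "actionTypeId": {"category": "Source", "owner": "AWS", "provider": "CodeStarSourceConnection", "version": "1"},
          "configuration": {
            "ConnectionArn": "<CONNECTION_ARN>",
            "FullRepositoryId": "<WORKSPACE>/<REPO>",
            "BranchName": "main"
          },
          "outputArtifacts": [{"name": "SourceOutput"}]
        }]
      },
      {
        "name": "Build",
        "actions": [{
          "name": "DockerBuild",
          "actionTypeId": {"category": "Build", "owner": "AWS", "provider": "CodeBuild", "version": "1"},
          "configuration": {"ProjectName": "my-service-build"},
          "inputArtifacts": [{"name": "SourceOutput"}],
          "outputArtifacts": [{"name": "BuildOutput"}]
        }]
      },
      {
        "name": "Deploy",
        "actions": [{
          "name": "DeployToEcs",
          "actionTypeId": {"category": "Deploy", "owner": "AWS", "provider": "ECS", "version": "1"},
          "configuration": {
            "ClusterName": "my-cluster",
            "ServiceName": "my-service",
            "FileName": "imagedefinitions.json"
          },
          "inputArtifacts": [{"name": "BuildOutput"}]
        }]
      }
    ]
  }
}
EOF

# Create the pipeline (needs AWS credentials):
#   aws codepipeline create-pipeline --cli-input-json file://pipeline.json
```

The Deploy action reads imagedefinitions.json from the build artifact, which is why the buildspec lists that file under artifacts.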

Step 10: Deployment Strategy — Rolling Updates

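ECS uses rolling updates by default. With the service attached to an Application Load Balancer target group (not shown in the Step 7 command), a deployment proceeds like this: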

  1. ECS spins up new containers with the new image
  2. New containers must pass ALB health checks
  3. Once healthy, the load balancer drains traffic from old containers
  4. Old containers are terminated

Zero downtime. If the new containers fail health checks, the old ones keep running.
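
That behaviour depends on the service's deployment settings. A sketch of making them explicit: minimumHealthyPercent=100 keeps the old tasks running until replacements are healthy, maximumPercent=200 leaves room to start the new tasks alongside them, and the deployment circuit breaker lets ECS roll back on its own when the new tasks never stabilise. The helper name is illustrative.

```shell
# Hypothetical helper: set rolling-update limits and enable automatic rollback.
# Arguments: <cluster> <service>
enable_rollout_guard() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --deployment-configuration \
      "deploymentCircuitBreaker={enable=true,rollback=true},minimumHealthyPercent=100,maximumPercent=200"
}

# Example (needs AWS credentials):
#   enable_rollout_guard my-cluster my-service
```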

How to Rollback

If something goes wrong:

  1. Go to ECS Service → Update Service
  2. Select the previous Task Definition revision
  3. Check "Force new deployment"
  4. ECS reverts to the last known-good version
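
The same rollback can be done from the CLI, which is handy when the console is slow to reach during an incident. The helper names and the revision number in the example are placeholders:

```shell
# Hypothetical helper: list task definition revisions, newest first.
# Argument: <task-definition-family>
list_revisions() {
  aws ecs list-task-definitions --family-prefix "$1" --sort DESC
}

# Hypothetical helper: point the service at an earlier revision and redeploy.
# Arguments: <cluster> <service> <family:revision>
rollback_service() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --task-definition "$3" \
    --force-new-deployment
}

# Example (needs AWS credentials):
#   list_revisions my-service-task
#   rollback_service my-cluster my-service my-service-task:41
```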

Common Failure Checklist

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| Pipeline stuck at Source | Bitbucket connection expired | Re-authorize in Settings → Connections |
| Build fails (Maven) | Dependency or syntax error | Check CodeBuild logs |
| Build fails (Docker) | Java version mismatch or missing modules | Match pom.xml Java version to Dockerfile base image |
| Deploy fails (ECS) | Health check failing | Verify ALB target group health path matches your app's health endpoint |
| Container crash loops | Missing env vars or insufficient memory | Check CloudWatch/New Relic logs, increase CPU/memory |
| Image not found | ECR repo name mismatch | Verify IMAGE_REPO_NAME matches exactly |

Staging vs. Production

| Aspect | Staging | Production |
| --- | --- | --- |
| Region | us-east-2 (Ohio) | eu-west-2 (London) |
| Trigger | Auto-deploy on merge to staging branch | Manual release from console |
| Quality Gate | SonarQube enforced | SonarQube enforced |
| Rollback | Auto (ECS) | Manual + stakeholder coordination |

What I'd Do Differently

  1. Start with the IAM roles. Every pipeline failure I debugged in the first month was a permissions issue. Get IAM right first.

  2. Pin container images by digest, not tag. :latest is convenient but non-deterministic. For production, use @sha256:... to guarantee you're deploying exactly what you tested.

  3. Add an approval stage for production. A manual approval gate between staging and production prevents accidental deploys.

  4. Integrate SAST from day one. Bolting on SonarQube later means retrofitting 30+ buildspecs. Build it into the template from the start.
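
For point 2, the digest of an image that has already been pushed can be read back from ECR and used in place of the tag. A sketch with an illustrative helper name; REPO_URI and IMAGE_TAG in the example are assumed to be set already:

```shell
# Hypothetical helper: print the sha256 digest ECR recorded for a tag.
# Arguments: <repository-name> <image-tag>
image_digest() {
  aws ecr describe-images \
    --repository-name "$1" \
    --image-ids imageTag="$2" \
    --query 'imageDetails[0].imageDigest' \
    --output text
}

# Example (needs AWS credentials):
#   IMAGE_URI="$REPO_URI@$(image_digest my-service "$IMAGE_TAG")"
```

Writing that digest-based URI into imagedefinitions.json means the deploy stage references the exact image that was tested, even if the tag is moved later.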


This pipeline runs 50+ microservices for a fintech payment gateway. If you're building CI/CD on AWS, especially in regulated industries, feel free to connect — always happy to compare approaches.
