Enhancing AWS CodePipeline for Real-Time Batch Processing: Lessons from My Experience

Working with AWS on real-world projects is always a learning experience. Recently, I worked on enhancing an AWS CodePipeline for a real-time batch processing system, and it threw several challenges at me; I was almost stuck on it for two sprints. Here's what happened, the mistakes I made, and how I solved them, all from my personal experience.


The Initial Setup

Initially, I had:

  • 10 transaction batch processes, each in its own folder, with cron jobs to run them automatically (roughly the layout sketched below).
  • Each folder had a separate Dockerfile because of different dependencies.
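
Roughly, the repo looked like this (the folder and file names here are placeholders, not the real batch names):

```
batch-processes/
├── card-transactions/
│   ├── Dockerfile
│   └── index.js
├── upi-transactions/
│   ├── Dockerfile
│   └── index.js
└── ... 8 more folders, each with its own Dockerfile
```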

At first, this seemed clean: everything was isolated and modular. But reality hit when I tried to deploy the pipeline.


Mistake 1: Multiple Dockerfiles → Huge Artifacts

  • What I did: Maintained 10 separate Dockerfiles.
  • Problem: CodePipeline packaged each Dockerfile as a separate artifact. The total size became too large (around 515 MB), and stack creation failed with:
```
Artifact size limit exceeded
```
  • My Fix:

    • Centralized all batch processes into one folder with a single Dockerfile.
    • Moved cron tasks into a separate file.
    • Used Promise.all to run the batch processes as parallel async functions, so they executed efficiently without blocking each other (see the sketch below).

This single Docker image drastically reduced artifact size and simplified the pipeline.
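
Here's a minimal sketch of that parallel execution, assuming each batch process exposes an async run function; the job names and bodies below are hypothetical placeholders for the real batch modules:

```typescript
// A minimal sketch of parallel batch execution; the job names and bodies
// are hypothetical placeholders for the real batch modules.
type BatchJob = { name: string; run: () => Promise<void> };

const batchJobs: BatchJob[] = [
  { name: "card-transactions", run: async () => { /* process card batch */ } },
  { name: "upi-transactions", run: async () => { /* process UPI batch */ } },
  // ...the remaining batch processes
];

async function runAllBatches(): Promise<void> {
  // Promise.all starts every job immediately, so a slow batch never blocks the others.
  await Promise.all(batchJobs.map((job) => job.run()));
}

runAllBatches().catch((err) => {
  console.error("Batch run failed:", err);
  process.exit(1);
});
```

Promise.all kicks off every job at once and waits for all of them to finish; Promise.allSettled is a drop-in alternative if one failure shouldn't reject the whole run.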


Mistake 2: IAM Policy Limits with 10 DynamoDB Tables

  • What I did: Tried to give full access to all 10 DynamoDB tables in one IAM policy.

  • Problem: Hit the IAM policy size quota (6,144 characters), which caused access denied errors.

  • My Fix:

    • Categorized tables using labels.
    • In my code, I used a for loop with a case-check so that only tables allowed under a label were processed (see the sketch after this list).
    • This approach kept IAM policies small and ensured proper access control.
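
Here's a minimal sketch of that label check, assuming the tables are named with a label prefix; all labels and table names below are hypothetical:

```typescript
// A minimal sketch of the label-based table check; the labels and table
// names below are hypothetical placeholders, not the real project's tables.
const ALL_TABLES = [
  "payments-transactions",
  "payments-settlements",
  "refunds-requests",
  "reports-daily", // outside this task's labels, so it gets skipped
];

// Case-check: is this table covered by a label this task is allowed to touch?
function isAllowedTable(table: string): boolean {
  const label = table.split("-")[0];
  switch (label) {
    case "payments":
    case "refunds":
      return true;
    default:
      return false; // unknown or disallowed label: never process
  }
}

// For loop over all configured tables, processing only the allowed ones.
for (const table of ALL_TABLES) {
  if (!isAllowedTable(table)) continue;
  console.log(`Processing ${table}`);
  // ...DynamoDB reads/writes for this table go here
}
```

With a naming scheme like this, the IAM policy can grant access through one wildcard resource per label (for example, arn:aws:dynamodb:*:*:table/payments-*) instead of listing every table, which is one way to keep each policy well under the size quota.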

Mistake 3: S3 Artifacts Management

  • What I did: Stored all pipeline artifacts in S3 without proper cleanup.

  • Problem: Over time, old artifacts accumulated, occasionally causing pipeline failures.

  • My Fix:

    • Enabled versioning in the S3 bucket.
    • Added lifecycle rules to automatically clean up old artifacts.

This ensured the pipeline stayed reliable over time.
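
If you manage the bucket from code, a lifecycle rule like the one below handles the cleanup. This is only a sketch using the AWS SDK for JavaScript v3; the bucket name and retention periods are placeholders, not my actual values:

```typescript
import { S3Client, PutBucketLifecycleConfigurationCommand } from "@aws-sdk/client-s3";

// Expire old pipeline artifacts automatically (placeholder bucket name and day counts).
async function configureArtifactCleanup(): Promise<void> {
  const s3 = new S3Client({});
  await s3.send(
    new PutBucketLifecycleConfigurationCommand({
      Bucket: "my-pipeline-artifact-bucket",
      LifecycleConfiguration: {
        Rules: [
          {
            ID: "expire-old-artifacts",
            Status: "Enabled",
            Filter: { Prefix: "" }, // apply to every object in the bucket
            Expiration: { Days: 30 }, // delete current artifacts after 30 days
            NoncurrentVersionExpiration: { NoncurrentDays: 30 }, // clean up old versions too
            AbortIncompleteMultipartUpload: { DaysAfterInitiation: 7 },
          },
        ],
      },
    })
  );
}

configureArtifactCleanup().catch(console.error);
```

The same rule can just as easily be set once in the S3 console or in the CloudFormation/CDK template instead.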


AWS Services I Worked With

To enhance the pipeline, I used:

  • CodePipeline → CI/CD orchestration
  • CodeBuild → Build Docker image and run batch processes
  • ECR (Elastic Container Registry) → Store Docker image
  • ECS (Fargate) → Run batch tasks
  • EventBridge → Schedule automated execution
  • CloudWatch Metrics & Logs → Monitor execution and errors
  • DynamoDB → Store transaction data
  • S3 → Store pipeline artifacts

Lessons Learned

  1. Centralize Docker builds: Avoid multiple images for similar tasks; it simplifies deployment.
  2. Async execution works wonders: Using Promise.all made multiple batch tasks run efficiently without blocking.
  3. Mind IAM policy size limits: Categorizing resources by label and programmatically checking access avoids quota issues.
  4. Manage S3 artifacts: Versioning and lifecycle policies prevent old artifacts from breaking pipelines.
  5. Mistakes are learning opportunities: Every failure helped me optimize the pipeline and understand AWS limits better.

Enhancing this pipeline taught me practical AWS CI/CD skills, including ECS scheduling, IAM optimization, and real-time batch execution. The mistakes I made (multiple Dockerfiles, oversized IAM policies, and unmanaged S3 artifacts) were frustrating at first, but solving them made the pipeline robust and maintainable.

What interesting challenges have you faced? Comment below.
