My last project was a CI/CD pipeline with blue/green deployments. It taught me CodeDeploy, CodePipeline, and a lot about IAM. But it ran on EC2 instances in a default VPC, no custom networking, no containers, no database tier.
This time I wanted to build what companies actually run in production: a 3-tier architecture with proper network isolation, serverless containers, a managed database, and an in-memory cache. All codified in Terraform modules.
What I Built
A Node.js API running on ECS Fargate that talks to PostgreSQL (RDS) and Redis (ElastiCache), deployed inside a custom VPC with public and private subnets:
```
Internet → ALB (public subnets)
                 ↓ :3000
       ECS Fargate (private subnets)
            Bun + Express API
          ↓ :5432          ↓ :6379
   RDS PostgreSQL      ElastiCache Redis
  (private subnets)    (private subnets)
```
The ALB is the only thing exposed to the internet. ECS, RDS, and Redis all sit in private subnets with no public IP addresses. Each tier's security group only allows traffic from the tier above it. The entire infrastructure is defined in 6 Terraform modules — 37 resources created with one command.
Why This Architecture Matters
If you're interviewing for DevOps or cloud engineering roles, "I deployed an app to EC2" doesn't differentiate you. Interviewers want to know:
- Can you design a VPC from scratch with proper subnet segmentation?
- Do you understand why databases belong in private subnets?
- Can you explain the difference between an Internet Gateway and a NAT Gateway?
- Have you actually worked with ECS Fargate, not just read about it?
This project answers all of those with working code.
The Network Layer
This was the foundation everything else depended on. I created a VPC with 10.0.0.0/16 split across two availability zones:
| Subnet | CIDR | Tier | Internet Access |
|---|---|---|---|
| Public 1a | 10.0.0.0/20 | ALB, NAT Gateway | Direct via IGW |
| Public 1b | 10.0.16.0/20 | ALB (multi-AZ) | Direct via IGW |
| Private 1a | 10.0.32.0/20 | ECS, RDS | Outbound only via NAT |
| Private 1b | 10.0.48.0/20 | ECS, ElastiCache | Outbound only via NAT |
The key design decision: everything except the ALB goes in private subnets. The ECS tasks need outbound internet access (to pull images from ECR), so they route through a NAT Gateway in the public subnet. But nothing on the internet can reach them directly.
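In Terraform, that split comes down to two route tables: one default route to the Internet Gateway for the public subnets, one to the NAT Gateway for the private ones. A sketch with illustrative resource names (not necessarily the ones in the repo):

```hcl
# Public subnets: default route straight to the Internet Gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Private subnets: default route to the NAT Gateway, so ECS tasks can
# pull from ECR but nothing on the internet can initiate a connection in
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}
```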
Each Terraform module is self-contained. The VPC module outputs subnet IDs and the VPC ID. Other modules consume those outputs without knowing anything about how the network is built.
Security Group Boundaries
This is the part that makes this a real 3-tier architecture, not just "three things in the same VPC." Each tier has its own security group, and the rules enforce strict boundaries:
| Security Group | Allows Inbound | From |
|---|---|---|
| alb-sg | TCP 80 | 0.0.0.0/0 (the internet) |
| ecs-sg | TCP 3000 | alb-sg only |
| rds-sg | TCP 5432 | ecs-sg only |
| redis-sg | TCP 6379 | ecs-sg only |
No security group references a CIDR block except the ALB. Everything else references another security group. This means if an ECS task gets compromised, it can only reach the database and cache, not the internet, not other subnets, not other services.
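Expressed in Terraform, a tier boundary like the RDS rule is an ingress rule whose source is another security group rather than a CIDR block (a sketch; resource names are illustrative):

```hcl
# Postgres is reachable only from tasks that carry the ECS security group.
# No CIDR appears anywhere in this rule.
resource "aws_security_group_rule" "rds_from_ecs" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.rds.id
  source_security_group_id = aws_security_group.ecs.id
}
```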
This is how production environments are designed, and explaining it in an interview immediately signals you understand network security beyond "I opened port 22."
ECS Fargate (Containers Without Servers)
I used Fargate instead of EC2 for the compute layer. No instances to patch, no AMIs to maintain, no Auto Scaling Groups to configure. You define a task (CPU, memory, container image, environment variables) and Fargate runs it.
The task definition connects the app to both RDS and Redis through environment variables:
DB_HOST → RDS endpoint (injected by Terraform)
DB_PASSWORD → Secrets Manager ARN (resolved at task launch by ECS)
REDIS_HOST → ElastiCache endpoint (injected by Terraform)
The database password never touches Terraform state as plaintext and never appears in environment variable logs. ECS resolves it from Secrets Manager at runtime using the task execution role's IAM permissions.
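Inside the task definition, that split shows up as two separate lists: plain config goes in environment, the password goes in secrets so ECS resolves it at launch. A sketch of the relevant fragment, using the same variable names as the root module wiring (var.db_host, var.redis_host, var.db_secret_arn):

```hcl
container_definitions = jsonencode([{
  name  = "app"
  image = var.container_image

  # Non-sensitive values are injected as-is by Terraform
  environment = [
    { name = "DB_HOST",    value = var.db_host },
    { name = "REDIS_HOST", value = var.redis_host },
  ]

  # Sensitive values reference a Secrets Manager ARN; ECS fetches the
  # actual value at task launch using the execution role's permissions
  secrets = [
    { name = "DB_PASSWORD", valueFrom = var.db_secret_arn },
  ]
}])
```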
One thing I enabled that's worth mentioning: the deployment circuit breaker with rollback. If a new task definition fails to start (bad image, crash loop, health check failure), ECS automatically stops the deployment and rolls back to the last working version. Same concept as the CodeDeploy auto-rollback from my first project, but built into ECS.
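In the Terraform AWS provider, enabling that behavior is a small block on the ECS service resource (sketch; the rest of the service definition is elided):

```hcl
resource "aws_ecs_service" "app" {
  # ... cluster, task definition, networking ...

  # Stop a failing deployment and roll back to the last steady state
  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }
}
```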
The Application (Proving the Architecture Works)
I built a fresh Express API specifically designed to demonstrate all three tiers working together. The key endpoint is GET /items:
First request: queries PostgreSQL, caches the result in Redis for 30 seconds, returns "source": "database".
Second request (within 30s): returns the cached data from Redis, "source": "cache", with 1ms latency.
Any write operation (POST, PUT, DELETE) invalidates the Redis cache so the next read gets fresh data from PostgreSQL. This is a standard cache-aside pattern used in production systems.
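The cache-aside logic is simple enough to sketch independently of Express. An illustrative version, with Redis and PostgreSQL stubbed behind two small interfaces (the real app talks to the actual clients):

```typescript
// Cache key and TTL used by the read path
const ITEMS_KEY = "items";
const TTL_SECONDS = 30;

interface Store {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

// Read path: try the cache first, fall back to the database, populate the cache
async function getItems(cache: Store, queryDb: () => Promise<object[]>) {
  const cached = await cache.get(ITEMS_KEY);
  if (cached !== null) {
    return { source: "cache", items: JSON.parse(cached) as object[] };
  }
  const items = await queryDb();
  await cache.set(ITEMS_KEY, JSON.stringify(items), TTL_SECONDS);
  return { source: "database", items };
}

// Write path: invalidate, so the next read repopulates from PostgreSQL
async function createItem(cache: Store, insertDb: () => Promise<void>) {
  await insertDb();
  await cache.del(ITEMS_KEY);
}
```

The same shape extends to PUT and DELETE: any mutation ends with a del of the affected key.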
The /health endpoint checks both database and cache connectivity. If either is down, it returns a 503, which the ALB detects and stops routing traffic to that task.
```json
{
  "status": "healthy",
  "services": {
    "database": { "connected": true, "time": "2026-03-09T14:50:37.136Z" },
    "cache": { "connected": true, "latency": 1 }
  }
}
```
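The aggregation behind that response can be sketched independently of Express (an illustrative version; healthStatus and the probe shape are hypothetical names, not the app's actual code). Any failed probe flips the whole endpoint to 503:

```typescript
interface HealthCheck {
  name: string;
  // Resolves with extra detail (latency, server time) or throws if the
  // dependency is unreachable
  probe: () => Promise<Record<string, unknown>>;
}

// Run every probe; a single failure makes the endpoint report 503, which
// the ALB treats as "stop routing traffic to this task"
async function healthStatus(checks: HealthCheck[]) {
  const services: Record<string, unknown> = {};
  let healthy = true;
  for (const check of checks) {
    try {
      services[check.name] = { connected: true, ...(await check.probe()) };
    } catch {
      services[check.name] = { connected: false };
      healthy = false;
    }
  }
  return {
    code: healthy ? 200 : 503,
    body: { status: healthy ? "healthy" : "unhealthy", services },
  };
}
```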
Terraform Modules (Reusable Infrastructure)
Instead of one giant Terraform file, I split everything into 6 modules:
```
modules/
├── vpc/              # Network foundation
├── security-groups/  # Tier boundaries
├── alb/              # Load balancing
├── ecs/              # Container orchestration
├── rds/              # Database
└── elasticache/      # Caching
```
Each module has its own variables.tf, main.tf, and outputs.tf. The root main.tf wires them together:
```hcl
module "ecs" {
  source             = "./modules/ecs"
  private_subnet_ids = module.vpc.private_subnet_ids
  security_group_id  = module.security_groups.ecs_sg_id
  target_group_arn   = module.alb.target_group_arn
  db_host            = module.rds.endpoint
  redis_host         = module.elasticache.endpoint
  db_secret_arn      = module.rds.secret_arn
  container_image    = "${aws_ecr_repository.app.repository_url}:latest"
}
```
The advantage of modules: you can reuse the VPC module for a completely different project, or create dev/staging/prod environments by calling the same modules with different variables.
The Problems I Hit
exec format error
I built the Docker image on my Mac (Apple Silicon = ARM) and pushed it to ECR. Fargate runs x86_64. The container started and immediately crashed with exec format error and no further context.
The fix: docker build --platform linux/amd64. Always specify the platform when building for Fargate.
no pg_hba.conf entry
RDS PostgreSQL requires SSL by default. My app was connecting without it. The error message is a PostgreSQL internals reference that doesn't mention SSL at all.
The fix: add ssl: { rejectUnauthorized: false } to the connection pool config.
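For reference, the relevant slice of a node-postgres pool config might look like the following (a sketch; note that rejectUnauthorized: false accepts the server certificate without verifying its chain, so a stricter production setup would load the RDS CA bundle and verify instead):

```typescript
// Pool settings for node-postgres (pg). ssl.rejectUnauthorized: false
// skips certificate chain verification -- acceptable for a demo, but
// production should pin and verify the RDS CA bundle.
const poolConfig = {
  host: process.env.DB_HOST,
  port: 5432,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
  ssl: { rejectUnauthorized: false },
};
```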
CannotPullContainerError
I deployed the ECS service before pushing the Docker image to ECR. Fargate couldn't find the image, retried 7 times, and tripped the circuit breaker. After pushing the correct image, new deployments still failed because the breaker was already tripped.
The fix: aws ecs update-service --force-new-deployment resets the circuit breaker and triggers a fresh deployment.
Target type: ip vs instance
Fargate requires target_type = "ip" on the ALB target group. EC2-based services use "instance". Using the wrong one causes silent registration failures where ECS reports the task as running but the ALB never sees it.
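The corresponding target group fragment (a sketch with illustrative names):

```hcl
resource "aws_lb_target_group" "app" {
  port        = 3000
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id

  # Fargate tasks register by ENI IP address (awsvpc network mode);
  # "instance" only works for EC2-backed services
  target_type = "ip"
}
```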
Cost Breakdown
For anyone worried about the AWS bill:
| Resource | Monthly Cost |
|---|---|
| NAT Gateway | ~$32 |
| ALB | ~$16 |
| ElastiCache | ~$12 |
| ECS Fargate | ~$9 |
| RDS db.t3.micro | Free tier |
| ECR + Secrets Manager | Minimal |
| Total | ~$70/month |
The NAT Gateway is the biggest surprise: it costs more than the ALB. In production you'd need it, but for learning, running terraform destroy when you're not working saves real money.
What I'd Do Differently
Add HTTPS from the start. ACM + Route53 would make this production-ready. HTTP-only is fine for a demo but wouldn't pass a security review.
Use Terraform workspaces for multi-environment. Right now it's a single environment. The module structure supports dev/staging/prod, just pass different variables. That's the next iteration.
Auto Scaling for ECS. One task is fine for a demo, but production needs scaling policies based on CPU and request count.
CI/CD integration. This project deploys manually with docker push and ecs update-service. Connecting it to CodePipeline (from Project 1) would complete the picture.
What This Proves on a Resume
This project covers territory that most junior/mid-level candidates don't demonstrate:
- Custom VPC design: with proper public/private subnet segmentation
- ECS Fargate: serverless containers, not just EC2
- Multi-tier security: security groups referencing other security groups, not CIDRs
- Managed data services: RDS + ElastiCache with proper secret handling
- Terraform modules: reusable, composable infrastructure, not flat files
- Real debugging: ARM vs x86, SSL requirements, circuit breakers, NAT Gateway necessity
If an interviewer asks "tell me about a complex AWS architecture you've built," this project gives you 20 minutes of material.
Links
- GitHub repo: three-tier-aws (Terraform + application code)
- AWS Pipeline (CI/CD Pipeline): Part 1 | Part 2
- Portfolio: augusthottie.com
Building my DevOps portfolio ahead of the AWS DevOps Professional certification. Connect with me on LinkedIn; I'd love to hear what you're building.
