Infrastructure as Code (IaC) isn't just for enterprise teams. As a startup, you can build production-ready AWS infrastructure in a weekend using Terraform's reusable modules. This pragmatic approach helps you scale fast while maintaining reliability and cost control.
Table of Contents
- Why Startups Need IaC From Day One
- Weekend Roadmap
- Day 1: Building the Foundation
- Day 2: Application Infrastructure
- CloudWatch Monitoring and Alerts
- Deployment Commands
- Cost Optimization Tips
- Security Best Practices
- Production Considerations
- Conclusion
Why Startups Need IaC From Day One
Many startups delay infrastructure automation, thinking it's premature optimization. This is a costly mistake. IaC provides:
- Reproducible environments - Dev, staging, and prod are built from the same modules and differ only in their variables
- Version control - Infrastructure changes are tracked and reviewable
- Cost optimization - Every resource is declared explicitly, so forgotten or hand-made resources show up as drift
- Team scaling - New developers can spin up an environment in minutes instead of following a runbook
- Disaster recovery - Rebuild your infrastructure with one command (data still has to come from backups)
Weekend Roadmap
Day 1: Foundation & VPC
- Set up Terraform workspace
- Build networking foundation
- Configure security groups
Day 2: Application Infrastructure
- Deploy ECS Fargate cluster
- Set up RDS database
- Configure monitoring and alerts
Day 1: Building the Foundation
Project Structure
Start with a clean, modular structure:
terraform/
├── environments/
│ ├── dev/
│ ├── staging/
│ └── prod/
├── modules/
│ ├── vpc/
│ ├── ecs/
│ └── rds/
├── shared/
│ └── variables.tf
└── README.md
VPC Module (modules/vpc/main.tf)
variable "environment" {
description = "Environment name"
type = string
}
variable "vpc_cidr" {
description = "CIDR block for VPC"
type = string
default = "10.0.0.0/16"
}
variable "availability_zones" {
description = "Availability zones"
type = list(string)
default = ["us-east-1a", "us-east-1b"]
}
# VPC
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "${var.environment}-vpc"
Environment = var.environment
}
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.environment}-igw"
Environment = var.environment
}
}
# Public Subnets
resource "aws_subnet" "public" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = {
Name = "${var.environment}-public-${count.index + 1}"
Environment = var.environment
Type = "Public"
}
}
# Private Subnets
resource "aws_subnet" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index + 100)
availability_zone = var.availability_zones[count.index]
tags = {
Name = "${var.environment}-private-${count.index + 1}"
Environment = var.environment
Type = "Private"
}
}
# NAT Gateways
resource "aws_eip" "nat" {
count = length(aws_subnet.public)
domain = "vpc"
depends_on = [aws_internet_gateway.main]
tags = {
Name = "${var.environment}-nat-eip-${count.index + 1}"
Environment = var.environment
}
}
resource "aws_nat_gateway" "main" {
count = length(aws_subnet.public)
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
tags = {
Name = "${var.environment}-nat-${count.index + 1}"
Environment = var.environment
}
depends_on = [aws_internet_gateway.main]
}
# Route Tables
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = {
Name = "${var.environment}-public-rt"
Environment = var.environment
}
}
resource "aws_route_table" "private" {
count = length(aws_nat_gateway.main)
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[count.index].id
}
tags = {
Name = "${var.environment}-private-rt-${count.index + 1}"
Environment = var.environment
}
}
# Route Table Associations
resource "aws_route_table_association" "public" {
count = length(aws_subnet.public)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
resource "aws_route_table_association" "private" {
count = length(aws_subnet.private)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[count.index].id
}
# Security Group for ALB
resource "aws_security_group" "alb" {
name_prefix = "${var.environment}-alb-"
vpc_id = aws_vpc.main.id
ingress {
description = "HTTP"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTPS"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-alb-sg"
Environment = var.environment
}
lifecycle {
create_before_destroy = true
}
}
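# Application port that the ALB forwards to; keep in sync with app_port in the ECS module
variable "app_port" {
description = "Container port that receives traffic from the ALB"
type = number
default = 3000
}
# Security Group for ECS
resource "aws_security_group" "ecs" {
name_prefix = "${var.environment}-ecs-"
vpc_id = aws_vpc.main.id
ingress {
description = "App traffic from ALB"
from_port = var.app_port
to_port = var.app_port
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}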
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-ecs-sg"
Environment = var.environment
}
lifecycle {
create_before_destroy = true
}
}
# Outputs
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.main.id
}
output "public_subnet_ids" {
description = "IDs of the public subnets"
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
description = "IDs of the private subnets"
value = aws_subnet.private[*].id
}
output "alb_security_group_id" {
description = "ID of the ALB security group"
value = aws_security_group.alb.id
}
output "ecs_security_group_id" {
description = "ID of the ECS security group"
value = aws_security_group.ecs.id
}
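To see which address ranges the cidrsubnet calls in the subnet resources produce, here is a small standalone sketch. It is not part of the module; the local names and the az_count value are illustrative assumptions, and it can be applied in an empty directory since it creates no AWS resources.

```hcl
# Standalone illustration of the subnet math; not part of modules/vpc.
locals {
  vpc_cidr = "10.0.0.0/16"
  az_count = 2 # assumed: matches the two default availability zones

  # Adding 8 bits to a /16 yields /24 networks; the third argument selects which one.
  public_cidrs  = [for i in range(local.az_count) : cidrsubnet(local.vpc_cidr, 8, i)]
  private_cidrs = [for i in range(local.az_count) : cidrsubnet(local.vpc_cidr, 8, i + 100)]
}

output "public_cidrs" {
  value = local.public_cidrs # ["10.0.0.0/24", "10.0.1.0/24"]
}

output "private_cidrs" {
  value = local.private_cidrs # ["10.0.100.0/24", "10.0.101.0/24"]
}
```

The offset of 100 keeps the private ranges clear of the public ones as long as you stay below 100 availability zones, which is far more than any region has.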
Development Environment (environments/dev/main.tf)
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = var.aws_region
}
# Variables
variable "aws_region" {
description = "AWS region"
type = string
default = "us-east-1"
}
variable "environment" {
description = "Environment name"
type = string
default = "dev"
}
# VPC Module
module "vpc" {
source = "../../modules/vpc"
environment = var.environment
vpc_cidr = "10.0.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b"]
}
# Outputs
output "vpc_id" {
value = module.vpc.vpc_id
}
Day 2: Application Infrastructure
ECS Fargate Module (modules/ecs/main.tf)
variable "environment" {
description = "Environment name"
type = string
}
variable "vpc_id" {
description = "VPC ID"
type = string
}
variable "private_subnet_ids" {
description = "Private subnet IDs"
type = list(string)
}
variable "public_subnet_ids" {
description = "Public subnet IDs"
type = list(string)
}
variable "ecs_security_group_id" {
description = "ECS security group ID"
type = string
}
variable "alb_security_group_id" {
description = "ALB security group ID"
type = string
}
variable "app_name" {
description = "Application name"
type = string
default = "webapp"
}
variable "app_port" {
description = "Application port"
type = number
default = 3000
}
variable "desired_count" {
description = "Desired number of tasks"
type = number
default = 2
}
variable "cpu" {
description = "CPU units"
type = number
default = 256
}
variable "memory" {
description = "Memory in MB"
type = number
default = 512
}
# ECS Cluster
resource "aws_ecs_cluster" "main" {
name = "${var.environment}-cluster"
setting {
name = "containerInsights"
value = "enabled"
}
tags = {
Name = "${var.environment}-cluster"
Environment = var.environment
}
}
# ECS Task Definition
resource "aws_ecs_task_definition" "app" {
family = "${var.environment}-${var.app_name}"
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
task_role_arn = aws_iam_role.ecs_task_role.arn
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = var.cpu
memory = var.memory
container_definitions = jsonencode([
{
name = var.app_name
image = "nginx:latest" # Placeholder: nginx listens on port 80, so replace it with your app image listening on var.app_port
portMappings = [
{
containerPort = var.app_port
protocol = "tcp"
}
]
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group = aws_cloudwatch_log_group.app.name
awslogs-region = data.aws_region.current.name
awslogs-stream-prefix = "ecs"
}
}
environment = [
{
name = "ENVIRONMENT"
value = var.environment
}
]
}
])
tags = {
Name = "${var.environment}-${var.app_name}"
Environment = var.environment
}
}
# Application Load Balancer
resource "aws_lb" "main" {
name = "${var.environment}-alb"
internal = false
load_balancer_type = "application"
security_groups = [var.alb_security_group_id]
subnets = var.public_subnet_ids
enable_deletion_protection = false
tags = {
Name = "${var.environment}-alb"
Environment = var.environment
}
}
resource "aws_lb_target_group" "app" {
name = "${var.environment}-${var.app_name}-tg"
port = var.app_port
protocol = "HTTP"
vpc_id = var.vpc_id
target_type = "ip"
health_check {
enabled = true
healthy_threshold = 3
interval = 30
matcher = "200"
path = "/"
port = "traffic-port"
protocol = "HTTP"
timeout = 5
unhealthy_threshold = 2
}
tags = {
Name = "${var.environment}-${var.app_name}-tg"
Environment = var.environment
}
}
resource "aws_lb_listener" "front_end" {
load_balancer_arn = aws_lb.main.arn
port = 80
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.app.arn
}
}
# ECS Service
resource "aws_ecs_service" "main" {
name = "${var.environment}-${var.app_name}"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.app.arn
desired_count = var.desired_count
launch_type = "FARGATE"
network_configuration {
security_groups = [var.ecs_security_group_id]
subnets = var.private_subnet_ids
assign_public_ip = false
}
load_balancer {
target_group_arn = aws_lb_target_group.app.arn
container_name = var.app_name
container_port = var.app_port
}
depends_on = [aws_lb_listener.front_end]
tags = {
Name = "${var.environment}-${var.app_name}"
Environment = var.environment
}
}
# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "app" {
name = "/ecs/${var.environment}/${var.app_name}"
retention_in_days = 30
tags = {
Name = "${var.environment}-${var.app_name}-logs"
Environment = var.environment
}
}
# IAM Roles
resource "aws_iam_role" "ecs_task_execution_role" {
name = "${var.environment}-ecsTaskExecutionRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}
]
})
tags = {
Name = "${var.environment}-ecsTaskExecutionRole"
Environment = var.environment
}
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_role" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
resource "aws_iam_role" "ecs_task_role" {
name = "${var.environment}-ecsTaskRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}
]
})
tags = {
Name = "${var.environment}-ecsTaskRole"
Environment = var.environment
}
}
# Data Sources
data "aws_region" "current" {}
# Outputs
output "cluster_id" {
description = "ECS cluster ID"
value = aws_ecs_cluster.main.id
}
output "alb_dns_name" {
description = "ALB DNS name"
value = aws_lb.main.dns_name
}
output "alb_zone_id" {
description = "ALB zone ID"
value = aws_lb.main.zone_id
}
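The ALB security group opens port 443, but the module above only defines an HTTP listener. A minimal sketch of an HTTPS listener follows, assuming you already have an ACM certificate in the same region. The variable name certificate_arn is hypothetical and would have to be passed in from the environment configuration.

```hcl
# Hypothetical input: ARN of an existing ACM certificate for your domain.
variable "certificate_arn" {
  description = "ARN of the ACM certificate used by the HTTPS listener"
  type        = string
}

# HTTPS listener that forwards to the same target group as the HTTP listener.
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```

Once HTTPS works, a common follow-up is to change the HTTP listener's default action to a redirect so that plain-text traffic never reaches the application.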
RDS Module (modules/rds/main.tf)
variable "environment" {
description = "Environment name"
type = string
}
variable "vpc_id" {
description = "VPC ID"
type = string
}
variable "private_subnet_ids" {
description = "Private subnet IDs"
type = list(string)
}
variable "allowed_security_groups" {
description = "Security groups allowed to access RDS"
type = list(string)
default = []
}
variable "db_name" {
description = "Database name"
type = string
default = "appdb"
}
variable "db_username" {
description = "Database username"
type = string
default = "dbadmin"
}
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
variable "instance_class" {
description = "RDS instance class"
type = string
default = "db.t3.micro"
}
variable "allocated_storage" {
description = "Allocated storage in GB"
type = number
default = 20
}
variable "backup_retention_period" {
description = "Backup retention period in days"
type = number
default = 7
}
# Security Group for RDS
resource "aws_security_group" "rds" {
name_prefix = "${var.environment}-rds-"
vpc_id = var.vpc_id
ingress {
description = "MySQL/Aurora"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_groups = var.allowed_security_groups
}
tags = {
Name = "${var.environment}-rds-sg"
Environment = var.environment
}
lifecycle {
create_before_destroy = true
}
}
# DB Subnet Group
resource "aws_db_subnet_group" "default" {
name = "${var.environment}-db-subnet-group"
subnet_ids = var.private_subnet_ids
tags = {
Name = "${var.environment}-db-subnet-group"
Environment = var.environment
}
}
# RDS Instance
resource "aws_db_instance" "default" {
identifier = "${var.environment}-database"
engine = "mysql"
engine_version = "8.0"
instance_class = var.instance_class
allocated_storage = var.allocated_storage
max_allocated_storage = var.allocated_storage * 2
db_name = var.db_name
username = var.db_username
password = var.db_password
vpc_security_group_ids = [aws_security_group.rds.id]
db_subnet_group_name = aws_db_subnet_group.default.name
backup_retention_period = var.backup_retention_period
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
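# Dev-friendly defaults. For prod, set skip_final_snapshot = false (with a final_snapshot_identifier)
# and deletion_protection = true so a stray destroy cannot take the data with it.
skip_final_snapshot = true
deletion_protection = false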
performance_insights_enabled = false
monitoring_interval = 0
tags = {
Name = "${var.environment}-database"
Environment = var.environment
}
}
# Outputs
output "rds_hostname" {
description = "RDS instance hostname"
value = aws_db_instance.default.address
sensitive = true
}
output "rds_port" {
description = "RDS instance port"
value = aws_db_instance.default.port
}
output "rds_username" {
description = "RDS instance root username"
value = aws_db_instance.default.username
sensitive = true
}
Complete Environment Configuration
Update environments/dev/main.tf:
# Add to existing dev environment
module "ecs" {
source = "../../modules/ecs"
environment = var.environment
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
public_subnet_ids = module.vpc.public_subnet_ids
ecs_security_group_id = module.vpc.ecs_security_group_id
alb_security_group_id = module.vpc.alb_security_group_id
app_name = "myapp"
desired_count = 1 # Lower for dev
cpu = 256
memory = 512
}
module "rds" {
source = "../../modules/rds"
environment = var.environment
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
allowed_security_groups = [module.vpc.ecs_security_group_id]
db_password = var.db_password
instance_class = "db.t3.micro"
backup_retention_period = 1 # Minimal for dev
}
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
output "alb_dns_name" {
value = module.ecs.alb_dns_name
}
CloudWatch Monitoring and Alerts
Add monitoring to the ECS module (modules/ecs/main.tf), where the service and cluster it references are defined:
# CloudWatch Alarms for ECS
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
alarm_name = "${var.environment}-high-cpu"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 2
metric_name = "CPUUtilization"
namespace = "AWS/ECS"
period = 120
statistic = "Average"
threshold = 80
alarm_description = "Average ECS service CPU above 80% for two consecutive 2-minute periods"
# Notify the SNS topic when the alarm fires and when it recovers
alarm_actions = [aws_sns_topic.alerts.arn]
ok_actions = [aws_sns_topic.alerts.arn]
dimensions = {
ServiceName = aws_ecs_service.main.name
ClusterName = aws_ecs_cluster.main.name
}
tags = {
Name = "${var.environment}-high-cpu-alarm"
Environment = var.environment
}
}
# SNS Topic for Alerts
resource "aws_sns_topic" "alerts" {
name = "${var.environment}-alerts"
tags = {
Name = "${var.environment}-alerts"
Environment = var.environment
}
}
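An SNS topic delivers nothing until something subscribes to it. As a sketch, an email subscription could look like the following; the alert_email variable is a hypothetical name introduced here for illustration.

```hcl
# Hypothetical input: the address that should receive alert emails.
variable "alert_email" {
  description = "Email address subscribed to the alerts topic"
  type        = string
}

# Email subscription to the alerts topic defined above.
resource "aws_sns_topic_subscription" "alerts_email" {
  topic_arn = aws_sns_topic.alerts.arn
  protocol  = "email"
  endpoint  = var.alert_email
}
```

Email subscriptions stay pending until the recipient clicks the confirmation link AWS sends, so check the inbox after the first apply.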
Deployment Commands
Deploy your infrastructure:
# Development
cd environments/dev
terraform init
# Prompt for the password so it stays out of shell history and process listings (bash)
read -rs TF_VAR_db_password && export TF_VAR_db_password
terraform plan
terraform apply
# Production (copy dev to prod with appropriate sizing)
cd ../prod
terraform init
# Re-run the read line above first so prod gets its own password
terraform plan
terraform apply
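The prod copy differs from dev mainly in the values it passes to the modules. Below is a sketch of what environments/prod/main.tf might override. The sizing numbers are illustrative assumptions, not recommendations, and should be adjusted once you have real metrics.

```hcl
# environments/prod/main.tf (excerpt); all sizing values are illustrative assumptions.
module "ecs" {
  source                = "../../modules/ecs"
  environment           = var.environment # set the default to "prod" in this directory
  vpc_id                = module.vpc.vpc_id
  private_subnet_ids    = module.vpc.private_subnet_ids
  public_subnet_ids     = module.vpc.public_subnet_ids
  ecs_security_group_id = module.vpc.ecs_security_group_id
  alb_security_group_id = module.vpc.alb_security_group_id
  app_name              = "myapp"
  desired_count         = 2    # at least two tasks so losing one AZ does not take the app down
  cpu                   = 512  # 0.5 vCPU, a valid Fargate pairing with 1024 MB
  memory                = 1024
}

module "rds" {
  source                  = "../../modules/rds"
  environment             = var.environment
  vpc_id                  = module.vpc.vpc_id
  private_subnet_ids      = module.vpc.private_subnet_ids
  allowed_security_groups = [module.vpc.ecs_security_group_id]
  db_password             = var.db_password
  instance_class          = "db.t3.small"
  backup_retention_period = 7 # keep a week of automated backups
}
```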
Cost Optimization Tips
Right-Sizing Resources
- Dev: t3.micro instances, minimal RDS
- Prod: Start small and scale based on metrics
- Use Fargate Spot for non-critical workloads
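One way to use Fargate Spot is to register both capacity providers on the cluster and let the default strategy split tasks between them. This is a sketch for the ECS module; the base and weight values are assumptions chosen for illustration.

```hcl
# Register on-demand and Spot capacity on the cluster from the ECS module.
resource "aws_ecs_cluster_capacity_providers" "main" {
  cluster_name       = aws_ecs_cluster.main.name
  capacity_providers = ["FARGATE", "FARGATE_SPOT"]

  # Always keep one task on regular Fargate...
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    base              = 1
    weight            = 1
  }

  # ...and place roughly three of every four additional tasks on Spot.
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 3
  }
}
```

A service that sets launch_type = "FARGATE" ignores this strategy, so the service would need to drop launch_type (or declare its own capacity_provider_strategy) for Spot to take effect. Spot tasks can be interrupted with short notice, which is why this fits non-critical workloads.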
Auto-Scaling
# Auto-scaling for ECS
resource "aws_appautoscaling_target" "ecs_target" {
max_capacity = 10
min_capacity = 2
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.main.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
resource "aws_appautoscaling_policy" "scale_up" {
name = "${var.environment}-scale-up"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs_target.resource_id
scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs_target.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 70.0
}
}
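CPU is not always the bottleneck. As a sketch, a second target-tracking policy on memory can sit alongside the CPU policy; the 75% target and the cooldown values are assumptions to tune against your own workload.

```hcl
# Second target-tracking policy on the same scalable target, keyed on memory.
resource "aws_appautoscaling_policy" "memory" {
  name               = "${var.environment}-memory-target"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageMemoryUtilization"
    }
    target_value       = 75.0
    scale_in_cooldown  = 300 # wait 5 minutes before removing tasks
    scale_out_cooldown = 60  # add tasks quickly when memory climbs
  }
}
```

With auto-scaling active, Terraform and the scaler both want to own the task count. Adding a lifecycle block with ignore_changes = [desired_count] to the ECS service is a common way to stop each apply from resetting it.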
Security Best Practices
Secrets Management
# Use AWS Secrets Manager for sensitive data
resource "aws_secretsmanager_secret" "db_password" {
name = "${var.environment}/database/password"
}
resource "aws_secretsmanager_secret_version" "db_password" {
secret_id = aws_secretsmanager_secret.db_password.id
secret_string = var.db_password
}
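If you would rather not choose or type the password at all, Terraform can generate it. This sketch would take the place of the secret version above. It assumes the hashicorp/random provider, which Terraform installs on the next terraform init, and the resource names are illustrative.

```hcl
# Generate the password instead of passing it in.
# Requires the hashicorp/random provider (installed by `terraform init`).
resource "random_password" "db" {
  length  = 32
  special = false # alphanumeric only, sidestepping characters RDS disallows in master passwords
}

# Store the generated value in the secret defined above.
resource "aws_secretsmanager_secret_version" "db_password_generated" {
  secret_id     = aws_secretsmanager_secret.db_password.id
  secret_string = random_password.db.result
}
```

The generated value still ends up in the Terraform state file, so this only helps if the state itself is stored encrypted and access to it is restricted.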
Network Security
- All databases in private subnets
- Security groups with minimal access
- VPC Flow Logs for network monitoring
- WAF for public-facing applications
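For the VPC Flow Logs item, a minimal sketch that could be added to the VPC module is shown below. It sends logs to an S3 bucket, which avoids the IAM role that CloudWatch Logs delivery needs. The resource names and the bucket prefix are assumptions.

```hcl
# Bucket that receives the flow log files; S3 appends a unique suffix to the prefix.
resource "aws_s3_bucket" "flow_logs" {
  bucket_prefix = "${var.environment}-vpc-flow-logs-"

  tags = {
    Name        = "${var.environment}-vpc-flow-logs"
    Environment = var.environment
  }
}

# Capture accepted and rejected traffic for the whole VPC.
resource "aws_flow_log" "vpc" {
  vpc_id               = aws_vpc.main.id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn
}
```

Flow logs grow quickly on busy networks, so consider adding a lifecycle rule on the bucket to expire old objects.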
Production Considerations
State Management
Use remote state with S3 and DynamoDB locking:
terraform {
backend "s3" {
bucket = "your-terraform-state-bucket"
key = "environments/prod/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-state-lock"
}
}
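The bucket and lock table have to exist before terraform init can use this backend. Below is a sketch of a one-time bootstrap configuration, applied from a separate directory with local state. The names mirror the placeholders in the backend block above, and the resource labels are illustrative.

```hcl
# One-time bootstrap, kept in its own directory with local state.
resource "aws_s3_bucket" "tf_state" {
  bucket = "your-terraform-state-bucket" # same placeholder as in the backend block
}

# Versioning lets you recover an earlier state file after a bad apply.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string hash key named exactly "LockID".
resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```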
Multi-Environment Strategy
- Separate AWS accounts for prod
- Environment-specific variable files
- Automated testing of Terraform changes
- GitOps workflow with pull request reviews
Conclusion
You now have production-ready AWS infrastructure that scales with your startup growth. The modular Terraform approach provides enterprise-grade reliability while remaining startup-friendly in complexity and cost.
This infrastructure foundation delivers:
- High availability for the network and application tier across two AZs (set multi_az = true on the RDS instance if you also need automatic database failover)
- Scalable containerized applications using ECS Fargate
- Managed database services with automated backups and maintenance
- Baseline monitoring with CloudWatch Logs, Container Insights, and a CPU alarm
- Cost optimization through right-sizing and intelligent auto-scaling
The weekend time investment in infrastructure automation creates a solid foundation that supports rapid scaling as your startup grows, while maintaining the operational reliability your customers expect.
This is part 2 of my "DevOps for Startups" series. Part 1 covered automated React deployment pipelines with GitHub Actions and AWS.