AWS VPC design for multi-project multi-account setups: patterns that scale
VPC decisions made on day 1 are the hardest to change. Here's the architecture that scales.
Core principle: non-overlapping CIDRs from day one
```
management-account  10.0.0.0/16
prod-account-1      10.1.0.0/16
staging-account     10.10.0.0/16
dev-account         10.20.0.0/16
```
Within prod-account-1:
project-alpha: 10.1.0.0/20 (4,096 IPs)
project-beta: 10.1.16.0/20
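Terraform's `cidrsubnet` function can derive these project blocks instead of hand-computing them. A minimal sketch (the local names are illustrative, not from the original module):

```hcl
locals {
  account_cidr = "10.1.0.0/16"

  # newbits = 4 turns the /16 into /20 blocks of 4,096 IPs each
  project_alpha_cidr = cidrsubnet(local.account_cidr, 4, 0) # 10.1.0.0/20
  project_beta_cidr  = cidrsubnet(local.account_cidr, 4, 1) # 10.1.16.0/20
}
```

Incrementing the last argument hands out the next non-overlapping /20, so allocation order is explicit and reviewable in code.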
Terraform VPC module
```hcl
locals {
  # Carve the VPC CIDR into thirds per tier: indexes 0-2 private, 4-6 public
  private_cidrs = [for i in [0, 1, 2] : cidrsubnet(var.vpc_cidr, 3, i)]
  public_cidrs  = [for i in [4, 5, 6] : cidrsubnet(var.vpc_cidr, 3, i)]
}

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true

  tags = { Name = "${var.project_name}-${var.environment}" }
}

resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  cidr_block        = local.private_cidrs[count.index]
  availability_zone = var.azs[count.index]
}

resource "aws_subnet" "public" {
  count                   = 3
  vpc_id                  = aws_vpc.main.id
  cidr_block              = local.public_cidrs[count.index]
  availability_zone       = var.azs[count.index]
  map_public_ip_on_launch = false # assign public IPs explicitly, never by default
}
```
Security group pattern (tiered access)
```hcl
# Internet → ALB → App → RDS
resource "aws_security_group" "app" {
  vpc_id = aws_vpc.main.id
  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id] # only the ALB may reach the app tier
  }
}
resource "aws_security_group" "rds" {
  vpc_id = aws_vpc.main.id
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id] # only the app tier may reach Postgres
  }
}
```
Key decisions
- NAT gateway per AZ in prod: roughly $100/month for three, but a single shared NAT gateway is a single point of failure; if its AZ goes down, every private subnet loses internet egress
- /20 per project: 4,096 IPs — not wasteful like /16, not cramped like /24
- Private subnets for everything: ECS, RDS, Lambda all private. Only ALBs and NAT GWs public
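The per-AZ NAT decision can be sketched in Terraform like this, one gateway and one private route table per AZ (resource names here are illustrative assumptions):

```hcl
resource "aws_eip" "nat" {
  count  = 3
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  count         = 3
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id # NAT GWs live in public subnets
}

resource "aws_route_table" "private" {
  count  = 3
  vpc_id = aws_vpc.main.id

  # Each AZ's private subnets egress through their local NAT gateway
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }
}
```

Associating each private subnet with its AZ-local table keeps an AZ failure contained to that AZ instead of blackholing all egress.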
Step2Dev allocates non-overlapping CIDRs across all your accounts automatically.
What VPC decisions have caused you the most pain to undo?