AWS NAT Gateway costs about $32/month each plus data fees. Here's how to slash costs by 90% using fck-nat with Terraform, full HA setup included.
Quick question: Do you know how much your NAT Gateways cost?
Most teams don't realize they're spending about $32/month per NAT Gateway ($0.045/hour) plus $0.045/GB in data processing fees. A typical multi-AZ setup with 3 NAT Gateways processing 1TB/month costs:
- 3 NAT Gateways × $32.40/month = $97.20
- Data processing: 1,000 GB × $0.045 = $45.00
- Total monthly cost: $142.20
- Annual cost: $1,706.40
For what? Giving your private subnet instances internet access.
There's a better way. Let me show you how to get the same functionality for $15/month using Terraform.
💸 Why NAT Gateways Are So Expensive
NAT Gateway pricing has two components:
- Hourly charge: $0.045/hour per gateway = $32.40/month (720-hour month)
- Data processing: $0.045/GB processed
For a production multi-AZ setup (3 availability zones):
- 3 NAT Gateways running 24/7: $97.20/month
- Data processing (1TB): $45/month
- Total: $142.20/month
And that's with only 1TB of traffic. Handle 5TB/month? The data fees alone climb to $225.
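The arithmetic above can be reproduced in Terraform itself. The following is a minimal sketch, not part of the fck-nat setup: the variable, local, and output names are made up for illustration, and the rates are the ones quoted in this article (check current AWS pricing for your region).

```hcl
# nat-cost.tf: standalone estimate, needs no provider or AWS credentials.
# All names here are hypothetical and exist only for this illustration.

variable "nat_gateway_count" {
  type    = number
  default = 3
}

variable "monthly_traffic_gb" {
  type    = number
  default = 1000 # 1 TB, as in the example above
}

locals {
  hourly_rate_usd = 0.045 # per gateway-hour (rate quoted in this article)
  per_gb_rate_usd = 0.045 # per GB processed (rate quoted in this article)
  hours_per_month = 720   # 30-day month

  hourly_cost = var.nat_gateway_count * local.hourly_rate_usd * local.hours_per_month
  data_cost   = var.monthly_traffic_gb * local.per_gb_rate_usd
}

output "nat_gateway_monthly_usd" {
  value = local.hourly_cost + local.data_cost
}
```

With the defaults, `terraform init && terraform apply` should report roughly 142.20. Passing `-var monthly_traffic_gb=5000` shows how quickly the data fee takes over the bill.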
🎯 The Solution: fck-nat
fck-nat is an open-source NAT solution that runs on a tiny EC2 instance. It does the exact same thing as NAT Gateway but costs ~90% less.
Cost comparison:
| Solution | Monthly Cost | Annual Cost |
|---|---|---|
| 3 NAT Gateways + 1TB data | $142 | $1,706 |
| 3 fck-nat instances (t4g.nano) | $15 | $180 |
| Savings | $127 | $1,526 |
And there's no data processing fee. Zero. Nada.
🛠️ Terraform Implementation
Basic Single-AZ Setup (Simplest)
Start simple with one NAT instance:
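# modules/fck-nat/main.tf

variable "vpc_id" {
  description = "VPC that hosts the NAT instance"
  type        = string
}

variable "public_subnet_id" {
  description = "Public subnet for the NAT instance"
  type        = string
}

variable "private_subnet_ids" {
  description = "Private subnets that route through the NAT instance"
  type        = list(string)
}

data "aws_ami" "fck_nat" {
  most_recent = true
  owners      = ["568608671756"] # fck-nat AMI owner

  filter {
    name   = "name"
    values = ["fck-nat-al2023-*"]
  }

  filter {
    name   = "architecture"
    values = ["arm64"] # ARM is cheaper
  }
}

resource "aws_instance" "fck_nat" {
  ami               = data.aws_ami.fck_nat.id
  instance_type     = "t4g.nano" # ~$3/month, plenty of power
  subnet_id         = var.public_subnet_id
  source_dest_check = false # Critical for NAT to work!

  # No security group is set, so the instance lands in the VPC's default group.
  # Attach one that allows inbound traffic from the VPC CIDR (see the HA module
  # below) unless your private instances share the default group.

  tags = {
    Name = "fck-nat-instance"
  }
}

resource "aws_eip" "fck_nat" {
  domain   = "vpc"
  instance = aws_instance.fck_nat.id

  tags = {
    Name = "fck-nat-eip"
  }
}

# Route table for private subnets
resource "aws_route_table" "private" {
  vpc_id = var.vpc_id

  route {
    cidr_block           = "0.0.0.0/0"
    network_interface_id = aws_instance.fck_nat.primary_network_interface_id
  }

  tags = {
    Name = "private-route-table"
  }
}

resource "aws_route_table_association" "private" {
  # Key by list index so the keys are known at plan time even when the
  # subnets are created in the same apply
  for_each       = { for idx, id in var.private_subnet_ids : tostring(idx) => id }
  subnet_id      = each.value
  route_table_id = aws_route_table.private.id
}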
Deploy it:
terraform apply
# Total cost: ~$3/month for the t4g.nano
# (plus AWS's standard public IPv4 charge for the EIP, ~$3.60/month)
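The single-AZ module has no usage example of its own, so here is a hedged sketch of how it might be called from a root configuration. The module path matches the file comment above; the `aws_vpc.main` and `aws_subnet.*` references are borrowed from the HA usage example later in this article and are assumed to exist in your configuration.

```hcl
# main.tf (root module): illustrative only.
# aws_vpc.main and the aws_subnet.* resources are assumed to be defined elsewhere.
module "fck_nat" {
  source = "./modules/fck-nat"

  vpc_id           = aws_vpc.main.id
  public_subnet_id = aws_subnet.public_a.id

  # Every private subnet listed here shares the single NAT instance,
  # including subnets in other AZs (cross-AZ traffic is billed separately).
  private_subnet_ids = [
    aws_subnet.private_a.id,
    aws_subnet.private_b.id,
  ]
}
```

Keep in mind that one instance is one point of failure: if it goes down, every subnet in the list loses outbound access until it comes back.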
High-Availability Multi-AZ Setup (Production-Ready)
For production, you want HA across multiple AZs:
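# modules/fck-nat-ha/main.tf

variable "vpc_id" {
  description = "VPC that hosts the NAT instances"
  type        = string
}

variable "vpc_cidr" {
  description = "VPC CIDR block allowed to send traffic through the NAT instances"
  type        = string
}

variable "aws_region" {
  description = "Region used to build the auto-recovery alarm action ARN"
  type        = string
}

variable "availability_zones" {
  description = "AZs to deploy NAT instances"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

variable "public_subnet_ids" {
  description = "Map of AZ to public subnet ID"
  type        = map(string)
}

variable "private_subnet_ids" {
  description = "Map of AZ to list of private subnet IDs"
  type        = map(list(string))
}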

data "aws_ami" "fck_nat" {
  most_recent = true
  owners      = ["568608671756"]

  filter {
    name   = "name"
    values = ["fck-nat-al2023-*"]
  }

  filter {
    name   = "architecture"
    values = ["arm64"]
  }
}

# Security group for NAT instances
resource "aws_security_group" "fck_nat" {
  name_prefix = "fck-nat-"
  vpc_id      = var.vpc_id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [var.vpc_cidr] # Allow from VPC
  }

  tags = {
    Name = "fck-nat-sg"
  }
}

# One NAT instance per AZ
resource "aws_instance" "fck_nat" {
  for_each = toset(var.availability_zones)

  ami                    = data.aws_ami.fck_nat.id
  instance_type          = "t4g.nano"
  subnet_id              = var.public_subnet_ids[each.key]
  vpc_security_group_ids = [aws_security_group.fck_nat.id]
  source_dest_check      = false

  tags = {
    Name = "fck-nat-${each.key}"
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Elastic IPs for each NAT instance
resource "aws_eip" "fck_nat" {
  for_each = toset(var.availability_zones)

  domain   = "vpc"
  instance = aws_instance.fck_nat[each.key].id

  tags = {
    Name = "fck-nat-eip-${each.key}"
  }
}

# Route tables - one per AZ for fault isolation
resource "aws_route_table" "private" {
  for_each = toset(var.availability_zones)

  vpc_id = var.vpc_id

  route {
    cidr_block           = "0.0.0.0/0"
    network_interface_id = aws_instance.fck_nat[each.key].primary_network_interface_id
  }

  tags = {
    Name = "private-rt-${each.key}"
  }
}

# Associate private subnets with their AZ's route table
resource "aws_route_table_association" "private" {
  # Keys use the AZ and list index (not the subnet ID) so they are known at
  # plan time even when the subnets are created in the same apply
  for_each = {
    for item in flatten([
      for az, subnets in var.private_subnet_ids : [
        for idx, subnet in subnets : {
          key    = "${az}-${idx}"
          az     = az
          subnet = subnet
        }
      ]
    ]) : item.key => item
  }

  subnet_id      = each.value.subnet
  route_table_id = aws_route_table.private[each.value.az].id
}

# Auto-recovery for failed instances
resource "aws_cloudwatch_metric_alarm" "auto_recover" {
  for_each = toset(var.availability_zones)

  alarm_name          = "fck-nat-auto-recover-${each.key}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "StatusCheckFailed_System"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 0
  alarm_description   = "Auto-recover fck-nat instance if system check fails"
  alarm_actions       = ["arn:aws:automate:${var.aws_region}:ec2:recover"]

  dimensions = {
    InstanceId = aws_instance.fck_nat[each.key].id
  }
}

output "nat_instance_ids" {
  value = { for az, instance in aws_instance.fck_nat : az => instance.id }
}

output "nat_public_ips" {
  value = { for az, eip in aws_eip.fck_nat : az => eip.public_ip }
}
Usage Example
# main.tf
module "fck_nat" {
  source = "./modules/fck-nat-ha"

  vpc_id             = aws_vpc.main.id
  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  aws_region         = "us-east-1"

  public_subnet_ids = {
    "us-east-1a" = aws_subnet.public_a.id
    "us-east-1b" = aws_subnet.public_b.id
    "us-east-1c" = aws_subnet.public_c.id
  }

  private_subnet_ids = {
    "us-east-1a" = [aws_subnet.private_a.id]
    "us-east-1b" = [aws_subnet.private_b.id]
    "us-east-1c" = [aws_subnet.private_c.id]
  }
}

output "nat_details" {
  value = {
    instance_ids = module.fck_nat.nat_instance_ids
    public_ips   = module.fck_nat.nat_public_ips
  }
}
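The modules in this article set `domain = "vpc"` on `aws_eip`, an argument that arrived with version 5 of the AWS provider (older releases used `vpc = true`). Pinning versions avoids a confusing error on an older setup. A minimal sketch follows; the Terraform version floor is an assumption, and the region simply matches the example above.

```hcl
# versions.tf: illustrative version pins for the examples in this article.
terraform {
  required_version = ">= 1.3" # assumption; any reasonably recent release should work

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # needed for aws_eip's `domain` argument
    }
  }
}

provider "aws" {
  region = "us-east-1" # same region as the usage example
}
```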
Deploy it:
terraform init
terraform apply
# Cost: 3 × t4g.nano (~$3/mo each) = ~$9/month in instance charges,
# inside the ~$15/month budget used in this article
# vs NAT Gateway: $142/month
# Savings: ~$127/month, or ~$1,526/year
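Cost Breakdown Comparison
NAT Gateway (Traditional AWS Approach)
3 NAT Gateways:
- 3 × $0.045/hour × 720 hours = $97.20
- Data processing: 1TB × $0.045/GB = $45.00
- Total: $142.20/month
Annual cost: $1,706.40
fck-nat (Optimized Approach)
3 t4g.nano instances:
- 3 × $0.0042/hour × 720 hours = $9.07
- Data processing = $0.00
- Public IPv4 addresses: left out on both sides. AWS bills every public IPv4 address at $0.005/hour (about $3.60/month), and that applies to a NAT Gateway's Elastic IP just as it does to an instance's, so it cancels out of the comparison.
- Total: $9.07/month
Annual cost: $108.86
Savings: $1,597.54/year (about a 94% reduction!)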
⚡ Performance Considerations
Q: Can a t4g.nano handle my traffic?
A: Almost certainly yes. Here are the specs:
- t4g.nano baseline: 5% CPU per vCPU, bursts to 100%
- Network performance: Up to 5 Gbps in bursts (the sustained baseline bandwidth is lower)
- Typical NAT load: Very low CPU usage (mostly network I/O)
Real-world test: A single t4g.nano easily handles:
- 100+ Mbps sustained throughput
- 10,000+ concurrent connections
- 1TB+/month traffic
If you need more, upgrade to t4g.micro ($6/month) for 10% baseline and better burst credits.
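One way to find out whether a t4g.nano is actually keeping up is to watch its CPU credit balance, which EC2 publishes as the `CPUCreditBalance` metric for burstable instances. The sketch below is meant to sit inside the HA module; the threshold of 20 credits and the `alert_sns_topic_arn` variable are assumptions for illustration, not values from fck-nat.

```hcl
# Hypothetical variable: an SNS topic you already use for alerts.
variable "alert_sns_topic_arn" {
  description = "SNS topic that receives NAT instance alerts"
  type        = string
}

# Fires when a NAT instance has nearly used up its earned CPU credits,
# a hint that the next size up (t4g.micro) may be the better fit.
resource "aws_cloudwatch_metric_alarm" "cpu_credits_low" {
  for_each = toset(var.availability_zones)

  alarm_name          = "fck-nat-cpu-credits-low-${each.key}"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 3
  metric_name         = "CPUCreditBalance"
  namespace           = "AWS/EC2"
  period              = 300 # this metric is reported every 5 minutes
  statistic           = "Minimum"
  threshold           = 20 # illustrative; tune to your traffic pattern
  alarm_description   = "fck-nat instance is running low on CPU credits"
  alarm_actions       = [var.alert_sns_topic_arn]

  dimensions = {
    InstanceId = aws_instance.fck_nat[each.key].id
  }
}
```

If this alarm never fires over a few weeks of normal traffic, the nano is probably sized correctly for your workload.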
High Availability & Fault Tolerance
The HA setup includes:
- ✅ Per-AZ NAT instances: each AZ has its own NAT (like NAT Gateway)
- ✅ Auto-recovery: CloudWatch alarms automatically recover failed instances
- ✅ Fault isolation: a failure in one AZ doesn't affect the others
- ✅ Elastic IPs: static IPs maintained across instance recovery
What happens if an instance fails?
- CloudWatch detects the system status check failure (~2 minutes)
- EC2 auto-recovery moves the instance onto healthy hardware (~3-5 minutes)
- The recovered instance keeps its instance ID, private IP, network interface, and Elastic IP, so the routes stay valid
- Total downtime: ~5-7 minutes (acceptable for most workloads)
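The auto-recover alarm in the module watches `StatusCheckFailed_System`, which covers problems with the underlying host. A hung operating system or broken network configuration shows up under `StatusCheckFailed_Instance` instead. As a hedged sketch, a second alarm inside the HA module could reboot the instance in that case; the alarm name and evaluation settings are illustrative choices.

```hcl
# Reboots a NAT instance whose instance-level status check keeps failing.
# Complements (does not replace) the auto_recover alarm in the module.
resource "aws_cloudwatch_metric_alarm" "auto_reboot" {
  for_each = toset(var.availability_zones)

  alarm_name          = "fck-nat-auto-reboot-${each.key}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 3 # illustrative: three failed minutes in a row
  metric_name         = "StatusCheckFailed_Instance"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Maximum"
  threshold           = 0
  alarm_description   = "Reboot fck-nat instance if the instance check fails"
  alarm_actions       = ["arn:aws:automate:${var.aws_region}:ec2:reboot"]

  dimensions = {
    InstanceId = aws_instance.fck_nat[each.key].id
  }
}
```

A reboot keeps the instance, its network interface, and its Elastic IP in place, so the private route tables need no changes afterwards.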
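To have a failed instance replaced automatically instead of waiting on recovery, manage it with an Auto Scaling Group:
# Optional: ASG-managed replacement (the ASG itself is free; you pay only for
# the ~$3/month instance it launches)
# Assumes an aws_launch_template.fck_nat resource per AZ, which is not shown here.
# A replacement instance gets a new network interface, so the private routes
# must target an interface that survives the relaunch.
resource "aws_autoscaling_group" "fck_nat" {
  for_each = toset(var.availability_zones)

  name                = "fck-nat-asg-${each.key}"
  vpc_zone_identifier = [var.public_subnet_ids[each.key]]
  min_size            = 1
  max_size            = 1
  desired_capacity    = 1

  launch_template {
    id      = aws_launch_template.fck_nat[each.key].id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "fck-nat-${each.key}"
    propagate_at_launch = true
  }
}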
⚠️ When NOT to Use fck-nat
There are a few scenarios where NAT Gateway might be worth the cost:
- Extreme traffic: >10 Gbps sustained throughput (use NAT Gateway or multiple larger instances)
- Compliance requirements: Some regulations explicitly require AWS-managed services
- Zero-tolerance for downtime: Sub-minute failover SLA (though ASG setup gets close)
- No time for management: You value convenience over $1,500/year in savings
For most use cases, fck-nat is the smarter choice.
Migration Checklist
Switching from NAT Gateway to fck-nat:
Step 1: Deploy fck-nat alongside NAT Gateway
terraform apply -target=module.fck_nat
Step 2: Test with one private subnet
# Update one subnet's route table to point to fck-nat
# Test connectivity from instances in that subnet
curl -I https://api.github.com
Step 3: Migrate remaining subnets
# Update route tables one AZ at a time
terraform apply
Step 4: Remove NAT Gateways
# Delete the NAT Gateway resources from your configuration, then apply
terraform plan  # Confirm that only the NAT Gateway resources will be destroyed
terraform apply
Step 5: Celebrate savings
# Watch your AWS bill drop next month
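Step 2 boils down to changing the target of one route table's default route. A hedged sketch of that change is below, assuming the test subnet's route table manages its default route as a standalone `aws_route` resource (mixing `aws_route` with inline `route` blocks on the same table causes conflicts). The `aws_route_table.canary` resource and the `fck_nat_eni_id` variable are hypothetical names.

```hcl
# Hypothetical input: the primary network interface ID of the fck-nat
# instance serving this AZ.
variable "fck_nat_eni_id" {
  description = "ENI ID of the fck-nat instance to test against"
  type        = string
}

# Default route for the one route table used as a canary.
resource "aws_route" "canary_default" {
  route_table_id         = aws_route_table.canary.id # hypothetical route table
  destination_cidr_block = "0.0.0.0/0"

  # Before the migration this route pointed at the NAT Gateway:
  #   nat_gateway_id = aws_nat_gateway.main.id
  network_interface_id = var.fck_nat_eni_id
}
```

Rolling back is the same edit in reverse: restore `nat_gateway_id`, remove `network_interface_id`, and apply again.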
💡 Pro Tips
1. Use ARM instances (t4g family)
t4g.nano is 20% cheaper than t3.nano and performs better for NAT workloads.
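2. Consider detailed monitoring ($2/month per instance)
It gives you 1-minute instead of 5-minute instance metrics (CPU, network), which helps when sizing the instance and alerting on traffic. Status checks already run every minute without it:
resource "aws_instance" "fck_nat" {
  # ... existing arguments ...
  monitoring = true # Detailed CloudWatch metrics
}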
3. Tag your NAT instances
Makes cost tracking easier:
tags = {
  Name        = "fck-nat-${each.key}"
  Purpose     = "NAT"
  CostCenter  = "networking"
  Environment = "production"
}
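4. Set up traffic alerts
Get notified if traffic spikes unexpectedly:
resource "aws_cloudwatch_metric_alarm" "nat_traffic" {
  for_each = toset(var.availability_zones)

  alarm_name          = "high-nat-traffic-${each.key}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "NetworkOut"
  namespace           = "AWS/EC2"
  period              = 3600
  statistic           = "Sum"
  threshold           = 100000000000 # 100 GB/hour, in bytes
  alarm_description   = "NAT instance processing >100GB/hour"

  # Hypothetical SNS topic variable; without an action nobody gets notified
  alarm_actions = [var.alert_sns_topic_arn]

  # Tie the alarm to a specific NAT instance
  dimensions = {
    InstanceId = aws_instance.fck_nat[each.key].id
  }
}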
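A traffic alarm catches spikes, but it does not watch the bill itself. For an actual billing alert, AWS Budgets can track spend on resources carrying the `Purpose = "NAT"` tag from tip 3. This is a rough sketch: the budget name, the $25 limit, and the email address are placeholders, and the tag filter only works after `Purpose` has been activated as a cost allocation tag in the Billing console. Double-check the `TagKeyValue` filter format against the provider documentation before relying on it.

```hcl
# Emails a warning when tagged NAT spend passes 80% of a monthly limit.
resource "aws_budgets_budget" "nat_monthly" {
  name         = "nat-monthly-budget" # placeholder name
  budget_type  = "COST"
  limit_amount = "25" # placeholder limit in USD
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Only count resources tagged Purpose = NAT (user-defined tag).
  cost_filter {
    name   = "TagKeyValue"
    values = ["user:Purpose$NAT"]
  }

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["ops@example.com"] # placeholder address
  }
}
```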
Quick Start
Want to try it right now?
# Clone the fck-nat repository for reference
git clone https://github.com/AndrewGuenther/fck-nat.git
# Or use the examples from this article
mkdir fck-nat-setup
cd fck-nat-setup
# Copy the HA setup code from above into main.tf
# Update variables with your VPC/subnet IDs
terraform init
terraform plan # Review what will be created
terraform apply # Deploy it!
# Monitor your NAT instance
aws ec2 describe-instances \
  --filters "Name=tag:Name,Values=fck-nat-*" \
  --query 'Reservations[].Instances[].[InstanceId,State.Name,PublicIpAddress]' \
  --output table
Real-World Success Story
Before (NAT Gateway setup):
- 3 NAT Gateways across us-east-1a/b/c
- 2TB/month average traffic
- Monthly cost: $97 (hourly) + $90 (data) = $187/month
After (fck-nat setup):
- 3 t4g.nano instances
- Same 2TB/month traffic
- Monthly cost: $9/month
Annual savings: $2,136 💰
Time to implement: 2 hours
ROI: Literally infinite (one-time 2-hour investment)
🎯 Summary
| Factor | NAT Gateway | fck-nat | Winner |
|---|---|---|---|
| Cost (3 AZs) | $142/month | $15/month | fck-nat |
| Data fees | $0.045/GB | $0/GB | fck-nat |
| Setup complexity | Low | Medium | NAT Gateway |
| Performance | Scales to 100 Gbps | Up to 5 Gbps (burst) | NAT Gateway* |
| Management | Zero | Minimal | NAT Gateway |
| Annual savings | - | $1,526 | fck-nat |

*For most workloads, 5 Gbps is more than enough
Bottom line: Unless you have extreme requirements, fck-nat saves you $1,500+/year with minimal effort.
Stop overpaying for NAT. Your AWS bill will thank you.
Migrated from NAT Gateway to fck-nat? How much are you saving? Share in the comments! 💬
Follow for more AWS cost optimization with Terraform! ⚡