Picture this. Your AWS bill lands, and there it is: $10K in NAT Gateway charges for three NAT Gateways in us-east-1. You dig in and see that ~$8K comes from NatGateway-Bytes (data processed) alone, most of it likely tied to ECR image pulls. I've helped teams spot this exact issue using Cost Explorer and VPC Flow Logs, watching container deployments quietly eat budgets. The solution? Amazon ECR VPC endpoints. They cut the NAT bill by more than 75% in one setup I worked on. Let's walk through spotting it, the math, and the flow change.
TL;DR:
- ECR image pulls through NAT Gateways cost $0.045/GB.
- VPC Interface Endpoints cost $0.01/GB (78% cheaper).
- Real example: ~$8K/month → ~$2K/month = ~$75K annual savings.
💡 Key Takeaways
The Problem: NAT Gateways charge $0.045/GB for data processing. For ECR-heavy workloads this adds up fast: the example case below shows $8,010/month in data processing charges alone.
The Solution: Deploy three VPC endpoints to route ECR traffic privately:
- ECR API Interface Endpoint (com.amazonaws.<region>.ecr.api)
  - Handles authentication and image manifests
  - Cost: ~$22/month across 3 AZs (~$7.30 per AZ) + minimal data charges
  - Required: Must deploy in each AZ for high availability
- ECR Docker Interface Endpoint (com.amazonaws.<region>.ecr.dkr)
  - Handles Docker pull/push commands
  - Cost: ~$22/month across 3 AZs (~$7.30 per AZ) + minimal data charges
  - Required: Must deploy in each AZ for high availability
- S3 Gateway Endpoint (com.amazonaws.<region>.s3): THE MOST CRITICAL ONE
  - Handles actual image layer downloads (99%+ of your data!)
  - Cost: $0.00 (FREE!)
  - Required: Without this, your image layers still hit NAT Gateways
The Savings: For 178,000 GB/month of ECR traffic:
- Before: $8,108.55/month (NAT Gateways)
- After: $1,823.80/month (VPC Endpoints)
- Savings: $6,284.75/month (77.5%) = $75,417/year
Why This Works: ECR stores Docker image layers in S3. The free S3 Gateway endpoint carries the bulk of the data transfer (the image layers, 99%+ in the example below), while the two paid Interface endpoints handle the low-volume API and registry calls. All three work together to take ECR pulls off the NAT Gateways.
Implementation Time: ~30 minutes with Terraform, plus 48 hours to validate savings in Cost Explorer.
Critical Success Factor: You MUST deploy all three endpoints. Deploying only the ECR endpoints without the S3 Gateway endpoint saves almost nothing, because the bulk of your data will still flow through the NAT Gateways.
Let's start with the Brutal Math: NAT vs. Endpoints Head-to-Head
Think standard 3-AZ VPC with private subnets and container workloads. NAT charges $0.045 per hour per AZ plus $0.045 per GB processed. Endpoints run $0.01 per hour per ENI and $0.01 per GB. Much better for high volume.
Note: Complete private ECR access needs two interface endpoints (ecr.api and ecr.dkr), each with one ENI per AZ, plus an S3 gateway endpoint for the image layers. That makes 6 ENIs in total in a 3-AZ setup; the S3 Gateway endpoint modifies route tables and creates no ENIs. If you'd like to read more on this, follow the links at the end of this post.
- ecr.api → Interface endpoint (ENI per AZ)
- ecr.dkr → Interface endpoint (ENI per AZ)
- s3 → Gateway endpoint (NO ENIs, modifies route tables)
NAT Gateway vs VPC Endpoints Cost Comparison
Configuration: 3 AZs with 3 NAT Gateways vs 3 VPC Endpoints
VPC Endpoint Configuration:
- com.amazonaws.<region>.ecr.api (Interface) - $0.01/hour per AZ + $0.01/GB
- com.amazonaws.<region>.ecr.dkr (Interface) - $0.01/hour per AZ + $0.01/GB
- com.amazonaws.<region>.s3 (Gateway) - FREE (no hourly or data charges)
NAT Gateway Configuration:
- 3 NAT Gateways (one per AZ) - $0.045/hour each + $0.045/GB
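Here's the model, using the ~$8K data-processing spend as the baseline (730 hours a month; 6 endpoint ENIs, one for ecr.api and one for ecr.dkr in each of 3 AZs; the S3 gateway endpoint adds no hourly charge; every GB is priced at the $0.01 interface rate):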
| Data Volume (GB/mo) | NAT Cost ($) | VPC Endpoint Cost ($) | Monthly Savings ($) | Savings % |
|---|---|---|---|---|
| 100 | 103.05 | 44.80 | 58.25 | 56.5% |
| 500 | 121.05 | 48.80 | 72.25 | 59.7% |
| 1,000 | 143.55 | 53.80 | 89.75 | 62.5% |
| 5,000 | 323.55 | 93.80 | 229.75 | 71.0% |
| 10,000 | 548.55 | 143.80 | 404.75 | 73.8% |
| 50,000 | 2,348.55 | 543.80 | 1,804.75 | 76.8% |
| 100,000 | 4,598.55 | 1,043.80 | 3,554.75 | 77.3% |
| 178,000 | 8,108.55 | 1,823.80 | 6,284.75 | 77.5% |
NAT spend drops like a falling rock. At production scale, you will see ROI in days.
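The table rows can be reproduced with a few lines of Python. This is a minimal sketch of the model as described in this post, not a pricing calculator: the rates are the ones quoted above, the function names are made up for illustration, and it follows the table's convention of pricing every GB at the interface-endpoint rate.

```python
# Rates quoted in this post (us-east-1); verify against current AWS pricing.
HOURS_PER_MONTH = 730
NAT_HOURLY = 0.045       # USD per NAT Gateway per hour
NAT_PER_GB = 0.045       # USD per GB processed by a NAT Gateway
ENDPOINT_HOURLY = 0.01   # USD per interface endpoint ENI per hour
ENDPOINT_PER_GB = 0.01   # USD per GB processed by an interface endpoint


def nat_cost(gb, gateways=3):
    """Monthly NAT cost: hourly charge per gateway plus per-GB processing."""
    return round(gateways * HOURS_PER_MONTH * NAT_HOURLY + gb * NAT_PER_GB, 2)


def endpoint_cost(gb, enis=6):
    """Monthly endpoint cost as modelled in the table: 6 ENIs, all GB at $0.01."""
    return round(enis * HOURS_PER_MONTH * ENDPOINT_HOURLY + gb * ENDPOINT_PER_GB, 2)


def savings_row(gb):
    """Return (NAT cost, endpoint cost, monthly savings, savings %) for one row."""
    nat = nat_cost(gb)
    vpce = endpoint_cost(gb)
    saved = round(nat - vpce, 2)
    return nat, vpce, saved, round(100 * saved / nat, 1)


for gb in (100, 1_000, 10_000, 178_000):
    print(gb, savings_row(gb))
```

Plugging in your own monthly GB figure from Cost Explorer gives a first estimate before you touch any infrastructure.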
Example use case with assumptions
Assume we have 3 NAT Gateways in us-east-1 processing 178,000 GB of ECR traffic monthly.
Cost Breakdown for Total Monthly Cost: $8,108.55
- NAT Gateway Hourly Charges: $98.55
  - $0.045 per hour × 3 NAT Gateways × 730 hours/month
  - This covers the provisioning cost for maintaining 3 NAT Gateways (one per AZ)
- Data Processing Charges: $8,010.00
  - $0.045 per GB × 178,000 GB
  - This is the charge for processing all data flowing through the NAT Gateways
- Per NAT Gateway:
  - Hourly cost: $32.85/month per gateway
  - Data processing (if evenly distributed): $2,670.00/month per gateway
Important Note: The data processing charge of $8,010 represents the vast majority (98.8%) of our assumed total NAT Gateway costs. Since we're processing ECR (Elastic Container Registry) traffic within the same region, we won't incur additional data transfer charges for the traffic itself, but the NAT Gateway data processing fee still applies.
Prerequisites:
- Private subnets with NAT Gateway access
- ECR repositories in the same region
- Security groups allowing HTTPS (443) from workloads
Hunt Down Those Hidden ECR Pull Fees
Start in AWS Cost Explorer. Under Group by, select the Usage Type dimension. Filter Service to EC2 - Other, and filter Usage type group to EC2: NAT Gateway - Data Processed and EC2: NAT Gateway - Running Hours. You'll see NatGateway-Bytes racking up that ~$8K at $0.045 per GB, plus NatGateway-Hours for the $0.045 hourly charge per gateway.
For proof, enable VPC Flow Logs on your subnets. Flow Logs record IP addresses rather than domain names, so look for destination port 443 traffic to addresses in the S3 and ECR service IP ranges (available in the AWS IP ranges JSON).
Do you see private subnet bytes flooding NAT ENIs? That's the problem. Every pull sends a small request out via NAT, fetches metadata, then hauls gigabytes of layers back, and all of those bytes are billed at the NAT processing rate. (If the workload and the NAT Gateway sit in different AZs, the hop adds $0.01 per GB more. I caught this pattern adding ~$3,000 a month in a recent cluster review.)
Using VPC Flow Logs to Track and Validate ECR Traffic Costs
Before deploying VPC endpoints, you need proof that ECR is actually consuming your NAT Gateway bandwidth. After deployment, you need validation that traffic shifted correctly. VPC Flow Logs provide both.
Step 1: Enable VPC Flow Logs
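Enable Flow Logs on the private subnets where your container workloads run. If your NAT Gateways sit in separate public subnets, include those subnets too, otherwise the NAT ENIs will not appear in the logs: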
Via AWS CLI:
aws ec2 create-flow-logs \
--resource-type Subnet \
--resource-ids subnet-xxxxx subnet-yyyyy subnet-zzzzz \
--traffic-type ALL \
--log-destination-type cloud-watch-logs \
--log-group-name /aws/vpc/flowlogs \
--deliver-logs-permission-arn arn:aws:iam::ACCOUNT_ID:role/flowlogsRole
Via Terraform (see the aws_flow_log resource in the Terraform AWS provider documentation):
resource "aws_flow_log" "private_subnets" {
  # Attached to the whole VPC here, which also covers the NAT Gateway ENIs
  iam_role_arn    = aws_iam_role.flow_logs.arn
  log_destination = aws_cloudwatch_log_group.flow_logs.arn
  traffic_type    = "ALL"
  vpc_id          = aws_vpc.main.id
}
Step 2: Identify Top HTTPS Destinations
Run this CloudWatch Logs Insights query to find your highest-volume HTTPS destinations:
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, action
| filter dstPort = 443
| filter interfaceId like /eni-/
| stats sum(bytes) as totalBytes by dstAddr
| sort totalBytes desc
| limit 50
This shows which destinations consume the most bandwidth on port 443. The top destinations are likely S3 IPs (for ECR image layers).
Step 3: Identify S3 and ECR Service IP Ranges
VPC Flow Logs show IP addresses, not domain names. Download AWS's IP ranges to identify both S3 and ECR traffic:
# Download AWS IP ranges
curl -o ip-ranges.json https://ip-ranges.amazonaws.com/ip-ranges.json
# Inspect services for your region
jq -r '.prefixes[] | select(.region=="us-east-1") | .service' ip-ranges.json | sort -u
ECR has no dedicated service value in the file, so its addresses fall under the general AMAZON value. Once you know the service values you need, narrow the list down:
# Parentheses matter: jq evaluates "and" before "or"
jq -r '.prefixes[] | select((.service=="AMAZON" or .service=="S3") and .region=="us-east-1") | .ip_prefix' ip-ranges.json
Example IP ranges for us-east-1:
44.223.121.0/24
44.223.122.0/24
98.80.195.0/25
98.80.238.0/23
3.5.0.0/19
1.178.4.0/24
In the Flow Logs, expect more than 95% of the bytes to go to S3 addresses:
- S3 (where ECR stores image layers - 95%+ of your traffic)
- ECR (API and Docker registry - <5% of your traffic)
Why This Matters: Your 178,000 GB/month is primarily S3 traffic (image layer downloads), not ECR API calls. You must track S3 IPs to see the real cost impact!
(Always check the current AWS IP ranges JSON for your specific region)
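Matching a Flow Logs destination address against the downloaded prefixes can be sketched with Python's standard `ipaddress` module. The sample list below is hypothetical and only mimics the shape of the `prefixes` entries in ip-ranges.json; the service labels on it are assumptions for illustration, and `services_for` is a made-up helper name.

```python
import ipaddress

# Hypothetical sample shaped like entries in ip-ranges.json.
# The real file has thousands of prefixes; labels here are illustrative only.
SAMPLE_PREFIXES = [
    {"ip_prefix": "3.5.0.0/19", "region": "us-east-1", "service": "S3"},
    {"ip_prefix": "3.5.0.0/19", "region": "us-east-1", "service": "AMAZON"},
    {"ip_prefix": "44.223.121.0/24", "region": "us-east-1", "service": "AMAZON"},
]


def services_for(ip, prefixes, region="us-east-1"):
    """Return the sorted service labels whose prefix in `region` contains `ip`."""
    addr = ipaddress.ip_address(ip)
    return sorted({
        p["service"]
        for p in prefixes
        if p["region"] == region and addr in ipaddress.ip_network(p["ip_prefix"])
    })


print(services_for("3.5.1.10", SAMPLE_PREFIXES))      # ['AMAZON', 'S3']
print(services_for("44.223.121.7", SAMPLE_PREFIXES))  # ['AMAZON']
print(services_for("8.8.8.8", SAMPLE_PREFIXES))       # []
```

To run it against real data, load the `prefixes` list from the ip-ranges.json file downloaded above with `json.load` and pass it in place of the sample.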
Step 4: Calculate NAT Gateway ECR+S3 Traffic
Filter Flow Logs for traffic to BOTH S3 and ECR IPs through NAT Gateway ENIs:
NOTE:
- Do NOT copy and paste this as is. Update the filter dstAddr like line to match the ranges from the previous command's output.
- Replace /^3\.5\./ or dstAddr like /^52\.94\./ with patterns for the real IP ranges you want to look for.
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, interfaceId
| filter dstPort = 443
| filter interfaceId like /eni-/ and action = "ACCEPT"
| filter dstAddr like /^3\.5\./ or dstAddr like /^52\.94\./
| stats sum(bytes) as totalBytes by interfaceId, dstAddr
| sort totalBytes desc
Identify NAT Gateway ENIs:
aws ec2 describe-nat-gateways --region us-east-1 \
--query 'NatGateways[].{NatGatewayId:NatGatewayId, NetworkInterfaceIds:NatGatewayAddresses[].NetworkInterfaceId}' \
--output table
Cross-reference the ENI IDs from your query results with NAT Gateway ENIs.
💡 Pro Tip: The top destination IPs by bytes will be S3 ranges, not ECR ranges. This confirms that the S3 Gateway endpoint is critical for cost savings!
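The cross-referencing step can be sketched as a small filter over the query results. The row shape below assumes you exported the Logs Insights results as a list of records with `interfaceId` and `totalBytes` fields; the ENI IDs and byte counts are hypothetical placeholders.

```python
def bytes_by_nat_eni(rows, nat_enis):
    """Sum totalBytes per interface, keeping only NAT Gateway interfaces."""
    totals = {}
    for row in rows:
        eni = row["interfaceId"]
        if eni in nat_enis:
            totals[eni] = totals.get(eni, 0) + int(row["totalBytes"])
    return totals


# Hypothetical ENI IDs and byte counts, for illustration only.
nat_enis = {"eni-0aaa", "eni-0bbb"}
rows = [
    {"interfaceId": "eni-0aaa", "dstAddr": "3.5.1.10", "totalBytes": "120000000000"},
    {"interfaceId": "eni-0aaa", "dstAddr": "3.5.2.20", "totalBytes": "30000000000"},
    {"interfaceId": "eni-0ccc", "dstAddr": "3.5.1.10", "totalBytes": "999"},
]

print(bytes_by_nat_eni(rows, nat_enis))  # {'eni-0aaa': 150000000000}
```

Rows on interfaces that are not NAT Gateway ENIs (here `eni-0ccc`) are dropped, so the totals reflect only bytes that incurred NAT processing charges.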
Step 5: Calculate Monthly Cost Impact
From your Flow Logs query results:
- Sum total bytes through NAT Gateway ENIs to S3 + ECR IPs
- Convert to GB: totalBytes / 1,000,000,000 (AWS uses decimal GB)
- Calculate cost: GB × $0.045
Cost Calculation Example:
- Flow Logs show: 191,102,976,000 bytes to S3/ECR
- Convert: 191,102,976,000 / 1,000,000,000 = 191.10 GB
- Cost of that sample: 191.10 × $0.045 ≈ $8.60
- At the full 178,000 GB/month: 178,000 × $0.045 = $8,010/month
Traffic Breakdown (typical):
- S3 image layers: ~177,850 GB (99.91%)
- ECR API calls: ~50 GB (0.03%)
- ECR Docker registry: ~100 GB (0.06%)
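The conversion in this step is simple enough to script. A minimal sketch, using the $0.045/GB rate and the decimal-GB convention stated above; `nat_processing_cost` is an illustrative name, not part of any AWS tooling.

```python
NAT_PER_GB = 0.045  # USD per GB, the NAT data processing rate used in this post


def nat_processing_cost(total_bytes, bytes_per_gb=1_000_000_000):
    """Convert a Flow Logs byte total to GB and to NAT data processing cost."""
    gb = total_bytes / bytes_per_gb
    return gb, gb * NAT_PER_GB


gb, cost = nat_processing_cost(191_102_976_000)
print(f"{gb:.2f} GB -> ${cost:.2f}")  # 191.10 GB -> $8.60
```

The `bytes_per_gb` parameter is exposed so you can rerun the number with a different GB definition if your bill and your Flow Logs totals disagree.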
Step 6: Validate After VPC Endpoint Deployment
After deploying VPC endpoints, confirm traffic shifted to private IPs:
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, interfaceId
| filter dstPort = 443
| filter dstAddr like /^10\./
| filter interfaceId like /eni-/
| stats sum(bytes) as totalBytes by interfaceId
| sort totalBytes desc
What you should see:
- ✅ Traffic now goes to private 10.x.x.x IPs (VPC endpoint ENIs)
- ✅ NAT Gateway ENIs show minimal S3/ECR traffic
- ✅ Total bytes shifted from NAT to VPC endpoints
But this validation method has problems.
⚠️ The filter above only matches addresses in 10.0.0.0/8, one of the RFC 1918 private ranges, and the two endpoint types show up differently in Flow Logs:
Gateway Endpoints (S3, DynamoDB)
- Use prefix list routes (pl-xxx) in the route table, not endpoint ENIs
- dstAddr shows the actual S3 service IP (a public range such as 52.x.x.x), not a private address
- Records still appear on the workload's own ENI, so the /^10\./ filter never matches them; what changes is that the traffic no longer shows up on the NAT Gateway ENIs
Interface Endpoints (ECR.api, ECR.dkr, etc.)
- Use PrivateLink IPs in the VPC CIDR (e.g., 10.0.x.x if your VPC is 10.0.0.0/16)
- dstAddr shows the endpoint ENI IP (private), but the filter only matches if your VPC CIDR starts with 10.
So what would correct validation queries look like?
1. Interface Endpoints (ECR, etc.) - Check PrivateLink traffic
# Adjust the dstAddr pattern below to your VPC CIDR range
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, interfaceId
| filter dstPort = 443
| filter dstAddr like /^10\./
| filter interfaceId like /eni-/
| stats sum(bytes) as totalBytes by dstAddr, interfaceId
| sort totalBytes desc
⚠️ Only works as written if your VPC CIDR is 10.x.x.x. Otherwise replace the pattern with your actual CIDR prefix (e.g., 172.16. or 192.168.).
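2. Gateway Endpoints (S3) - Check that S3 traffic bypasses NAT
# Flow Logs have no bucket or hostname fields, so match S3 by IP prefix.
# Replace the pattern with the S3 prefixes for your region and the placeholders with your NAT Gateway ENI IDs.
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, interfaceId
| filter dstPort = 443
| filter dstAddr like /^3\.5\./
| filter interfaceId not in ["eni-NAT_ENI_1", "eni-NAT_ENI_2", "eni-NAT_ENI_3"]
| stats sum(bytes) as totalBytes by dstAddr
| sort totalBytes desc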
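3. 🎯 NAT Gateway traffic drop (the real validation) 🎯
# interfaceId always holds an eni- ID, never a nat- ID.
# Replace the placeholders with the NAT Gateway ENI IDs from describe-nat-gateways.
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, interfaceId
| filter dstPort = 443
| filter interfaceId in ["eni-NAT_ENI_1", "eni-NAT_ENI_2", "eni-NAT_ENI_3"]
| stats sum(bytes) as totalBytes by interfaceId
| sort totalBytes desc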
Before endpoints: High bytes on NAT ENIs
After endpoints: Bytes drop significantly on those same ENIs.
🎯 What success looks like 🎯
BEFORE endpoints:
- NAT ENI: 150 GB to s3.us-east-1.amazonaws.com
- NAT ENI: 25 GB to 123456789012.dkr.ecr.us-east-1.amazonaws.com
AFTER endpoints:
- NAT ENI: 5 GB (mostly external APIs)
- Interface ENI: 25 GB to 10.0.2.100 (ECR.dkr endpoint)
- S3 traffic: Prefix list route (no NAT ENI)
Key metric: NAT ENI bytes drop. That's your validation.
The /^10\./ filter only catches interface endpoints and only if your VPC uses that range. Use the NAT traffic reduction query instead.
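The key metric reduces to one percentage. A minimal sketch using the illustrative before and after figures from the "what success looks like" lists above; `nat_drop_percent` is a hypothetical helper name.

```python
def nat_drop_percent(before_bytes, after_bytes):
    """Percentage reduction in bytes seen on NAT Gateway ENIs."""
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    return 100 * (before_bytes - after_bytes) / before_bytes


GB = 1_000_000_000
before = (150 + 25) * GB  # S3 + ECR bytes on NAT ENIs before endpoints
after = 5 * GB            # residual traffic, mostly external APIs

print(f"{nat_drop_percent(before, after):.1f}% drop")  # 97.1% drop
```

Compare equal-length time windows before and after the change, otherwise a quiet deployment week can look like a saving.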
Validate endpoint ENI IDs:
# ECR API endpoint ENIs
aws ec2 describe-vpc-endpoints --region us-east-1 \
--filters "Name=service-name,Values=com.amazonaws.us-east-1.ecr.api" \
--query 'VpcEndpoints[*].NetworkInterfaceIds' \
--output table
# ECR Docker endpoint ENIs
aws ec2 describe-vpc-endpoints --region us-east-1 \
--filters "Name=service-name,Values=com.amazonaws.us-east-1.ecr.dkr" \
--query 'VpcEndpoints[*].NetworkInterfaceIds' \
--output table
# S3 Gateway endpoint (no ENIs - modifies route tables)
aws ec2 describe-vpc-endpoints --region us-east-1 \
--filters "Name=service-name,Values=com.amazonaws.us-east-1.s3" \
--query 'VpcEndpoints[*].[VpcEndpointId,VpcEndpointType,RouteTableIds]' \
--output table
Step 7: Correlate with Cost Explorer
Confirm the cost impact in AWS Cost Explorer:
- Navigate to: Cost Explorer → Cost & Usage Reports
- Group by: Usage Type
- Filter Service: EC2 - Other
- Look for:
  - NatGateway-Bytes (should drop ~75%)
  - VpcEndpoint-Bytes (should increase proportionally)
- Time range: Compare 2 weeks before vs 2 weeks after deployment
Expected results:
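- NAT Gateway data processing: $8,010 → ~$2,000 (75% reduction; the remainder is non-ECR traffic that still needs NAT)
- VPC Endpoint data processing: $0 → up to ~$1,780 (the figure if every GB were billed at $0.01; layers served through the free S3 gateway endpoint are not)
- Net savings in the ECR-only model: ~$6,285/month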
Understanding the Three-Endpoint Architecture
Why you need all three endpoints:
- ECR API Interface Endpoint (com.amazonaws.us-east-1.ecr.api)
  - Handles authentication, authorization, image manifests
  - Low data volume (~50 GB/month)
  - Cost: $21.90/month (3 AZs × 730 hrs × $0.01) + ~$0.50 data
- ECR Docker Interface Endpoint (com.amazonaws.us-east-1.ecr.dkr)
  - Handles Docker pull/push commands, layer discovery
  - Low data volume (~100 GB/month)
  - Cost: $21.90/month (3 AZs × 730 hrs × $0.01) + ~$1.00 data
- S3 Gateway Endpoint (com.amazonaws.us-east-1.s3): THE CRITICAL ONE
  - Handles actual image layer downloads (99%+ of your data!)
  - High data volume (~177,850 GB/month)
  - Cost: $0.00 (FREE!) → This is where your savings come from!
Without the S3 Gateway endpoint, your image layer downloads would still hit NAT Gateways even with ECR endpoints deployed!
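To see why the S3 Gateway endpoint matters so much, the three deployment options can be compared using the traffic split listed above. This is a sketch under two assumptions: the split (177,850 / 50 / 100 GB) holds, and gateway-endpoint traffic carries no per-GB charge, as this post states. Under those assumptions the endpoint data charge comes out far below the $1,780 used in the headline model, which prices all 178,000 GB at the interface rate, so that headline is best read as a conservative upper bound.

```python
HOURS = 730
NAT_PER_GB = 0.045        # USD per GB through a NAT Gateway
INTERFACE_PER_GB = 0.01   # USD per GB through an interface endpoint
INTERFACE_HOURLY = 0.01   # USD per interface endpoint ENI per hour

# Traffic split from the breakdown above (GB/month).
S3_LAYERS_GB, ECR_API_GB, ECR_DKR_GB = 177_850, 50, 100


def monthly_cost(ecr_endpoints, s3_gateway, azs=3):
    """Data processing plus endpoint-hour cost for one scenario.

    NAT hourly charges are left out because they are identical in every scenario.
    """
    ecr_gb = ECR_API_GB + ECR_DKR_GB
    cost = 0.0
    if ecr_endpoints:
        cost += 2 * azs * HOURS * INTERFACE_HOURLY  # ecr.api + ecr.dkr ENIs
        cost += ecr_gb * INTERFACE_PER_GB
    else:
        cost += ecr_gb * NAT_PER_GB
    if not s3_gateway:
        cost += S3_LAYERS_GB * NAT_PER_GB  # layers still go through NAT
    return round(cost, 2)


print("no endpoints:      ", monthly_cost(False, False))  # 8010.0
print("ECR endpoints only:", monthly_cost(True, False))   # 8048.55
print("all three:         ", monthly_cost(True, True))    # 45.3
```

Deploying only the two interface endpoints actually costs slightly more than doing nothing in this model, because the layers keep flowing through NAT while the endpoint hours are added on top.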
Pro Tips for Flow Logs Analysis
- ✅ Track S3 IPs, not just ECR IPs - S3 is where 95%+ of ECR data flows
- ✅ Enable Flow Logs on private subnets only - Reduces log volume and costs
- ✅ Use CloudWatch Logs Insights - Best for ad-hoc queries and quick analysis
- ✅ Consider Amazon Athena - Better for large-scale historical analysis
- ✅ Set up CloudWatch alarms - Alert on unexpected NAT traffic spikes
- ✅ Tag your resources - Makes NAT Gateways and VPC endpoints easier to identify
- ✅ Factor in Flow Logs cost - Approximately $0.50/GB ingested to CloudWatch
- ✅ Aggregate by 5-minute intervals - Reduces log volume without losing insights
- ✅ Monitor for 2-4 weeks - Ensures you capture full deployment cycles and traffic patterns
Before and After: Understanding The Traffic Flow
- Before: ECS Tasks → NAT Gateway → Internet → ECR/S3 (expensive)
- After: ECS Tasks → VPC Endpoints → AWS Private Network → ECR/S3 (optimized)
Before endpoints
- A pod in a private subnet hits the NAT Gateway for every ECR pull.
- The request goes outbound to the internet, the ECR API reply comes back inbound through NAT processing, then the Docker layers stream back with massive GBs.
- Flow Logs show heavy byte counts on the NAT ENIs. Cost Explorer's NatGateway-Bytes balloons to $8K.
After endpoints
- Deploy the com.amazonaws.<region>.ecr.api and .ecr.dkr endpoints in each private subnet per AZ, turn on private DNS, and add the S3 gateway endpoint to the private route tables.
- Pod traffic goes straight to the endpoint ENI via PrivateLink, no NAT or internet gateway.
- The AWS backbone handles the rest; ECR layers flow through the free S3 gateway endpoint within the region.
- Flow Logs shift: no NAT bytes to ECR or S3 addresses, and the registry and API bytes land on private endpoint IPs inside your VPC CIDR.
- In Cost Explorer, NAT usage drops like a falling rock.
- Look for usage types containing VpcEndpoint-Hours and VpcEndpoint-Bytes under the VPC service to confirm the endpoints have started to show costs, at much smaller amounts than NAT was showing.
I rolled this out on a Kubernetes fleet processing 178,000 GB/month of ECR traffic. NAT dropped from $10K ($8K of it data processing) to $2K for the services that still need it. Endpoints totaled $1.8K. Filter on Data Transfer and EC2 in Cost Explorer and you will see EC2: NAT Gateway - Data Processed costs drop sharply, while VpcEndpoint-Hours and VpcEndpoint-Bytes take over at $0.01/GB.
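Cost After VPC Interface Endpoints: $1,823.80/month
New Cost Breakdown:
NAT Gateway Costs:
- Hourly charges: $98.55 (gateways remain for other traffic)
- Data processing: $0.00 (ECR traffic now bypasses NAT entirely)
VPC Interface Endpoint Costs:
- Hourly charges: $43.80 (2 endpoints × 3 AZs × 730 hours × $0.01/hour)
- Data processing: $1,780.00 (178,000 GB × $0.01/GB)
The $1,823.80 figure is the endpoint charges alone ($43.80 + $1,780.00). With the three NAT Gateways still running, the combined bill is $1,922.35/month, so the net saving against the original $8,108.55 is $6,186.20/month (76.3%).
The Impact:
💰 Monthly Savings: $6,284.75/month (77.5%), comparing the NAT total against endpoint charges only
💰 Annual Savings: $75,417.00/year on the same basis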
What You Need to Deploy:
Required Interface Endpoints (per AZ):
- ✅ com.amazonaws.us-east-1.ecr.api - For ECR API calls
- ✅ com.amazonaws.us-east-1.ecr.dkr - For Docker registry operations
Required Gateway Endpoint (VPC-wide, for ECR image layer storage, FREE):
- ✅ com.amazonaws.us-east-1.s3 - Deploy once per VPC (not per AZ)
A quick and dirty Terraform example:
# Gateway endpoint: attached to the private route tables, no ENIs, no hourly charge
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = aws_route_table.private[*].id
  policy            = data.aws_iam_policy_document.s3_ecr_access.json
  tags = {
    Name = "s3-gateway"
  }
}
# Interface endpoint for Docker registry traffic: one ENI per private subnet
resource "aws_vpc_endpoint" "ecr-dkr-endpoint" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.aws_region}.ecr.dkr"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  # This security group must allow inbound 443 from the workloads
  security_group_ids  = [aws_security_group.ecs_task.id]
  subnet_ids          = aws_subnet.private[*].id
  tags = {
    Name = "ecr-dkr"
  }
}
# Interface endpoint for ECR API calls: one ENI per private subnet
resource "aws_vpc_endpoint" "ecr-api-endpoint" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.aws_region}.ecr.api"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  # This security group must allow inbound 443 from the workloads
  security_group_ids  = [aws_security_group.ecs_task.id]
  subnet_ids          = aws_subnet.private[*].id
  tags = {
    Name = "ecr-api"
  }
}
Validation:
- Validate with: nslookup api.ecr.us-east-1.amazonaws.com
- With private DNS enabled, it should resolve to private addresses inside your VPC CIDR (for example 10.x.x.x), not public IPs.
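The same DNS check can be scripted from a host inside the VPC. A minimal sketch using only the Python standard library; the function names are made up for illustration, and the hostname in the usage note assumes the regional ECR API name `api.ecr.us-east-1.amazonaws.com`.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """True when the list is non-empty and every address is in a private range."""
    return bool(addresses) and all(
        ipaddress.ip_address(a).is_private for a in addresses
    )


# Offline demonstration with literal addresses (no DNS lookup performed here).
print(all_private(["10.0.2.100", "10.0.3.100"]))  # True
print(all_private(["52.94.1.1"]))                 # False
```

From a workload host, `all_private(resolve_ipv4("api.ecr.us-east-1.amazonaws.com"))` returning `False` suggests private DNS is not in effect for the endpoint.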
💡 Pro Tip: The S3 Gateway endpoint is critical but FREE.
- The ECR endpoints handle API and registry calls, but the image layers themselves are stored in S3. The Gateway endpoint takes that traffic off the NAT Gateways at zero cost, so don't skip it.
Why Does This Work So Well?
The key is data processing rate difference:
- NAT Gateway: $0.045/GB
- VPC Endpoint: $0.01/GB (78% cheaper per GB)
Plus, VPC endpoints provide:
- Better security - Traffic never leaves AWS network
- Lower latency - Direct path to ECR
- Higher reliability - No internet gateway dependency
- Simplified architecture - Private subnets can pull images directly
Another implementation detail to keep in mind:
Your NAT Gateways stay in place for other internet-bound traffic (software updates, external APIs, etc.), but all ECR image pulls route through the VPC endpoints instead. This is a configuration change, not a replacement, so you get the best of both worlds.
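Because the NAT Gateways stay, the endpoints only pay for themselves once the per-GB saving covers their fixed hourly charge. A rough break-even sketch using the rates quoted in this post and the conservative assumption that every GB is billed at the interface rate; `break_even_gb` is an illustrative name.

```python
HOURS = 730
NAT_PER_GB = 0.045       # USD per GB through a NAT Gateway
ENDPOINT_PER_GB = 0.01   # USD per GB through an interface endpoint
ENDPOINT_HOURLY = 0.01   # USD per interface endpoint ENI per hour


def break_even_gb(enis=6):
    """GB/month at which per-GB savings cover the fixed endpoint-hour charge."""
    fixed = enis * HOURS * ENDPOINT_HOURLY       # 43.80 for 6 ENIs
    saved_per_gb = NAT_PER_GB - ENDPOINT_PER_GB  # 0.035
    return fixed / saved_per_gb


print(f"{break_even_gb():.0f} GB/month")  # 1251 GB/month
```

Below roughly 1.25 TB of ECR traffic a month the endpoints cost more than they save in this model; traffic served through the free S3 gateway endpoint would lower that threshold further.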
Troubleshooting:
- DNS not resolving privately? Enable "Private DNS" on endpoints
- Still seeing NAT charges? Check security group rules allow 443 inbound
- Pulls timing out? Verify subnet route tables don't force internet gateway
- Endpoint not appearing in Cost Explorer? Wait 24-48 hours for billing data to populate; check under Service: "VPC"
- Validate endpoint status: aws ec2 describe-vpc-endpoints --filters "Name=service-name,Values=com.amazonaws.us-east-1.ecr.api"
Troubleshooting Flow Logs Analysis
Issue: Can't find NAT Gateway ENIs in Flow Logs
- ✅ Verify Flow Logs are enabled on the correct subnets
- ✅ Check that traffic-type is set to ALL (not just ACCEPT or REJECT)
- ✅ Wait 10-15 minutes after enabling for data to populate
Issue: S3/ECR IP ranges don't match traffic
- ✅ AWS IP ranges change periodically - always download the latest JSON
- ✅ Some regions have additional IP ranges not in the main prefixes
- ✅ Check for both IPv4 and IPv6 ranges if your VPC supports dual-stack
- ✅ Remember: Most traffic will be to S3 IPs, not ECR IPs!
Issue: Traffic still shows NAT Gateway after endpoint deployment
- ✅ Verify private_dns_enabled = true on Interface endpoints
- ✅ Check security groups allow port 443 from workload subnets
- ✅ Confirm route tables don't have explicit routes forcing internet gateway
- ✅ Verify S3 Gateway endpoint is associated with correct route tables
- ✅ Test DNS resolution: nslookup api.ecr.us-east-1.amazonaws.com should return private addresses from your VPC CIDR (for example 10.x.x.x)
- ✅ Test S3 access: nslookup s3.us-east-1.amazonaws.com should resolve (Gateway endpoints don't change DNS)
Issue: Cost Explorer doesn't match Flow Logs calculations
- ✅ Flow Logs show raw bytes; Cost Explorer uses decimal GB (1 GB = 1,000,000,000 bytes)
- ✅ Cost Explorer has 24-48 hour delay for billing data
- ✅ Ensure you're comparing the same time periods
- ✅ Check for data transfer charges vs data processing charges
- ✅ Remember: S3 Gateway endpoint traffic is FREE, so you won't see it in VPC endpoint costs
Issue: Only seeing small data volumes to ECR IPs
- ✅ This is NORMAL! ECR API/Docker traffic is <5% of total
- ✅ The bulk of your data goes to S3 IPs (image layers)
- ✅ If you're only filtering for ECR IPs, you're missing 95%+ of the traffic
- ✅ Update your query to include S3 IP ranges
Reality Check
This assumes full traffic shift (realistic for ECR-only optimization). Background NAT persists for other internet traffic. Monitor your Cost Explorer's NAT Gateway data processing charges weekly for the first month. You should see a 75%+ drop if ECR is your primary NAT consumer. If not, investigate other high-volume services using VPC Flow Logs.
Next Steps
- Run Cost Explorer analysis (5 min)
- Deploy endpoints in non-prod (30 min)
- Validate with test pulls (10 min)
- Monitor for 48 hours
- Roll to production during maintenance window
- Track Cost Explorer for 2 weeks to confirm savings
Ready to fix it? Create the endpoints in console or Terraform, tag them like Name:ecr-api for tracking, test docker pull once private DNS propagates. Budget relief comes fast. Seen this work for you? Share in the comments.
References:


