If you are building data pipelines, you’ve probably seen a NAT Gateway charge that made you double-check your architecture.
While prepping for the AWS Solutions Architect Associate (SAA) exam, I’ve been diving into the "invisible" side of networking. We often assume that for a Lambda or an EC2 instance to talk to S3, it must go through the public internet.
This is a costly mistake.
The Problem: The "Public" Default
By default, services like S3, DynamoDB, or Kinesis live outside your VPC. To reach them from a private subnet, traffic usually flows through a NAT Gateway. This introduces:
Security Risks: Your data technically leaves your network perimeter.
Cost Inefficiency: You pay for every GB that passes through that NAT Gateway.
The Solution: VPC Endpoints (AWS PrivateLink)
VPC Endpoints allow you to create a private connection between your VPC and supported AWS services. The traffic never leaves the Amazon network.
- Gateway Endpoints (the "OGs"): These are specific to S3 and DynamoDB.
How they work: They don't consume IP addresses in your VPC. Instead, AWS adds a managed prefix list for the service as a route in your route table.
Cost: They are free. There is no reason not to use them.
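With boto3, the call behind this is `ec2.create_vpc_endpoint`. Here is a minimal sketch of the parameters for an S3 Gateway Endpoint — the VPC and route table IDs are placeholders I made up, not real resources:

```python
# Sketch: parameters for creating an S3 Gateway Endpoint via boto3.
# All resource IDs below are placeholders.

def gateway_endpoint_params(vpc_id, route_table_ids, region="us-east-1"):
    """Build kwargs for ec2.create_vpc_endpoint (Gateway type, S3)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # AWS injects the S3 prefix-list route into these route tables:
        "RouteTableIds": route_table_ids,
    }

params = gateway_endpoint_params("vpc-0abc1234", ["rtb-0def5678"])
# import boto3
# boto3.client("ec2").create_vpc_endpoint(**params)
print(params["ServiceName"])
```

Note there are no subnets or security groups here: a Gateway Endpoint is purely a routing construct.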
- Interface Endpoints (powered by PrivateLink): These cover almost everything else (SNS, SQS, Kinesis, Athena, and more).
How they work: They provision an Elastic Network Interface (ENI) with a private IP in your subnet.
Cost: There is an hourly charge and a data processing charge. However, for high-volume data pipelines (like Kinesis streams), they are often significantly cheaper than NAT Gateway transfer fees.
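The same boto3 call covers Interface Endpoints, but the shape is different: because real ENIs are created, you attach subnets and security groups. A sketch with placeholder IDs, using Kinesis Data Streams as the example service:

```python
# Sketch: parameters for a Kinesis Interface Endpoint via boto3.
# All resource IDs below are placeholders.

def interface_endpoint_params(vpc_id, subnet_ids, sg_ids,
                              region="us-east-1",
                              service="kinesis-streams"):
    """Build kwargs for ec2.create_vpc_endpoint (Interface type)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
        "SubnetIds": subnet_ids,        # one ENI with a private IP per subnet
        "SecurityGroupIds": sg_ids,     # the ENIs are guarded by security groups
        "PrivateDnsEnabled": True,      # service DNS resolves to the private IPs
    }

params = interface_endpoint_params(
    "vpc-0abc1234", ["subnet-0aaa1111", "subnet-0bbb2222"], ["sg-0ccc3333"]
)
```

With `PrivateDnsEnabled`, your producers keep using the standard service hostname; resolution simply lands on the endpoint's private IPs.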
Real-World Use Case: The Data Lake Ingestion
Imagine a fleet of producers in a private subnet sending TBs of data to S3 and Kinesis.
Without Endpoints: You pay for NAT Gateway processing for every single byte.
With Endpoints:
Your S3 traffic is free via Gateway Endpoints.
Your Kinesis traffic is private and potentially cheaper via Interface Endpoints.
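Plugging rough numbers into this scenario makes the difference concrete. The prices below are illustrative us-east-1 list prices (NAT Gateway ≈ $0.045/GB processed; Interface Endpoint ≈ $0.01/GB plus ≈ $0.01/hour per AZ) and the traffic volumes are assumptions — always check the current AWS pricing pages for your region:

```python
# Back-of-the-envelope monthly cost for the ingestion fleet above.
# Prices are illustrative us-east-1 list prices; verify current pricing.
NAT_PER_GB = 0.045     # NAT Gateway data processing, $/GB
EP_PER_GB = 0.01       # Interface Endpoint data processing, $/GB
EP_PER_HOUR = 0.01     # Interface Endpoint hourly charge, per AZ
HOURS = 730            # hours in an average month

s3_gb, kinesis_gb, azs = 10_000, 5_000, 2   # assumed: 10 TB to S3, 5 TB to Kinesis

nat_only = (s3_gb + kinesis_gb) * NAT_PER_GB
with_endpoints = (
    0                              # S3 via Gateway Endpoint: free
    + kinesis_gb * EP_PER_GB       # Kinesis data through the Interface Endpoint
    + azs * EP_PER_HOUR * HOURS    # hourly charge for the endpoint ENIs
)
print(f"NAT only:       ${nat_only:,.2f}/month")
print(f"With endpoints: ${with_endpoints:,.2f}/month")
```

Even under these rough assumptions, the endpoint route is an order of magnitude cheaper, and the gap widens as volume grows (the hourly charge is fixed, while NAT processing scales with every byte).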
Moving to "Secure by Design"
A junior architect builds a connection. A senior architect builds a secure pipe.
VPC Endpoints allow you to attach Endpoint Policies. This is where you move from "it works" to "it's bulletproof." You can define exactly which IAM principal can access which specific resource through that endpoint.
🛡️ The Policy: Granular Control
As promised, here is an example of a VPC Endpoint Policy for S3.
This policy ensures that the endpoint can only be used to access a specific production bucket, preventing data exfiltration to unauthorized accounts even if an attacker gains access to your compute resources.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RestrictToSpecificBucket",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-production-data-bucket",
        "arn:aws:s3:::my-production-data-bucket/*"
      ]
    }
  ]
}
```
Why this matters:
Even if a developer accidentally leaks an IAM key with s3:* permissions, those credentials cannot be used through this endpoint to upload data to a personal bucket. The "pipe" itself is now intelligent.
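You can also enforce this from the other side: a bucket policy that denies any request not arriving through your endpoint, via the `aws:SourceVpce` condition key. A sketch (the endpoint ID is a placeholder), built as a Python dict for readability:

```python
import json

VPCE_ID = "vpce-0abc1234"  # placeholder: your endpoint's actual ID

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::my-production-data-bucket",
            "arn:aws:s3:::my-production-data-bucket/*",
        ],
        # Deny every request that did NOT come through our endpoint
        "Condition": {"StringNotEquals": {"aws:SourceVpce": VPCE_ID}},
    }],
}
print(json.dumps(bucket_policy, indent=2))
```

One caution: this explicit Deny blocks every other path to the bucket, including your own console and admin access from outside the VPC, so test it carefully before applying it to production data.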
What about you?
Have you ever been surprised by a NAT Gateway bill? Or are you currently wrestling with Endpoint Policies for multi-region setups? Let's discuss below! ☕👇