Introduction: Why Build a VPC from Scratch?
Amazon Web Services revolutionized cloud computing with Virtual Private Clouds (VPCs), allowing users to create isolated network environments in the cloud. But have you ever wondered how VPCs actually work under the hood?
In this project, I recreated AWS VPC functionality on a single Linux machine using native networking primitives. No Docker, no Kubernetes—just pure Linux networking: network namespaces, bridges, veth pairs, and iptables.
What You'll Learn
- How network isolation works at the kernel level
- Linux network namespaces as lightweight containers
- Bridging and routing fundamentals
- NAT implementation with iptables
- Building infrastructure automation tools
Real-World Applications
- Understanding cloud provider networking internals - See how AWS/Azure/GCP implement VPCs
- Building custom network isolation for multi-tenant systems
- DevOps and infrastructure automation skills - Create your own networking tools
- Debugging complex network issues - Deep knowledge of Linux networking stack
Architecture Overview
A VPC in AWS provides:
- Isolated network space with your own IP range (CIDR)
- Subnets that partition your VPC into smaller networks
- Internet Gateway for public subnet internet access
- NAT Gateway for private subnet outbound access
- Security Groups for firewall rules
- VPC Peering for cross-VPC communication
My Implementation Stack
┌──────────────────────────────────────────────────────┐
│                  Linux Host System                   │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │               VPC1 (10.0.0.0/16)               │  │
│  │                                                │  │
│  │      br-vpc1 (Linux Bridge = VPC Router)       │  │
│  │         │                      │               │  │
│  │     veth-pair              veth-pair           │  │
│  │         │                      │               │  │
│  │   ┌─────▼────┐           ┌─────▼────┐          │  │
│  │   │ Public   │           │ Private  │          │  │
│  │   │ Subnet   │           │ Subnet   │          │  │
│  │   │ Namespace│           │ Namespace│          │  │
│  │   │ 10.0.1.2 │           │ 10.0.2.2 │          │  │
│  │   └─────┬────┘           └─────┬────┘          │  │
│  │         │                      X               │  │
│  │   NAT (iptables)          No Internet          │  │
│  └─────────┼──────────────────────────────────────┘  │
│            │                                         │
│         [eth0] ──► Internet                          │
└──────────────────────────────────────────────────────┘
Component Mapping
| AWS VPC Concept | Linux Implementation |
|---|---|
| VPC | Linux Bridge (br-vpc1) |
| Subnet | Network Namespace |
| Subnet Connection | veth pair |
| Internet Gateway | iptables NAT MASQUERADE |
| Route Table | ip route commands |
| Security Group | iptables INPUT rules |
| VPC Peering | veth pair between bridges |
Part 1: Understanding Network Namespaces
Network namespaces are Linux's way of creating isolated network stacks. Each namespace has its own:
- Network interfaces
- IP addresses
- Routing tables
- iptables rules
Why This Matters: This is the same technology Docker uses for container networking. Understanding this gives you deep insights into containerization.
Creating Isolation
# In vpcctl.py - SubnetManager.add()
run_command(f"ip netns add {ns_name}") # Create isolated namespace
When you create a namespace, the Linux kernel creates a completely separate network stack. Processes inside can't see or access the host's network—perfect isolation!
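A quick note on run_command: throughout these snippets it is just a thin wrapper around subprocess. Here is a minimal sketch; the real vpcctl.py version likely adds logging and error handling, so treat this as an approximation:

import subprocess

def run_command(cmd: str) -> str:
    # Run a shell command, raise if it fails, and return its stdout
    result = subprocess.run(cmd, shell=True, check=True,
                            capture_output=True, text=True)
    return result.stdout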
Test It Yourself
# Create namespace
sudo ip netns add test-ns
# List interfaces in host
ip link
# List interfaces in namespace (only loopback!)
sudo ip netns exec test-ns ip link
You'll notice the namespace starts with only a loopback interface. It's completely isolated!
Part 2: Connecting Namespaces - The veth Pair Magic
Network namespaces are isolated, so we need a way to connect them. Enter veth pairs—virtual ethernet cables.
Conceptual Model
Think of a veth pair as a virtual ethernet cable with two ends:
- One end plugs into the namespace
- Other end plugs into the host or bridge
# Creating the connection
run_command(f"ip link add {veth_host} type veth peer name {veth_ns}")
run_command(f"ip link set {veth_host} master {bridge}") # Connect to bridge
run_command(f"ip link set {veth_ns} netns {ns_name}") # Move to namespace
Why This Works: Packets entering one end of the veth pair automatically come out the other end—like a wormhole for network traffic.
IP Address Assignment
net_info = NetworkUtils.get_network_info(cidr)
# Assign IP inside namespace
run_command(
    f"ip netns exec {ns_name} ip addr add "
    f"{net_info['first_host']}/{net_info['prefix']} dev {veth_ns}"
)
Key Insight: The namespace gets the .2 address, while the gateway address (.1) belongs to the bridge. This is similar to how AWS reserves the first addresses in each subnet for its own router and DNS.
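For context, here is a minimal sketch of what a helper like NetworkUtils.get_network_info might look like, built on Python's ipaddress module. The field names are inferred from how they are used above; the real vpcctl.py implementation may differ.

import ipaddress

def get_network_info(cidr: str) -> dict:
    # Hypothetical sketch: reserve the first host (.1) for the gateway/bridge,
    # hand the next address (.2) to the namespace
    net = ipaddress.ip_network(cidr)
    hosts = list(net.hosts())
    return {
        "gateway": str(hosts[0]),     # e.g. 10.0.1.1
        "first_host": str(hosts[1]),  # e.g. 10.0.1.2
        "prefix": net.prefixlen,      # e.g. 24
    }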
Part 3: The Bridge - Your VPC Router
A Linux bridge is like a virtual network switch. It forwards packets between connected interfaces at Layer 2.
Bridge Creation
bridge_name = f"br-{name}"
run_command(f"ip link add {bridge_name} type bridge")
run_command(f"ip addr add {net_info['gateway']}/{net_info['prefix']} dev {bridge_name}")
run_command(f"ip link set {bridge_name} up")
Critical Configuration
# Disable bridge netfilter - allows direct L2 forwarding
run_command("sysctl -w net.bridge.bridge-nf-call-iptables=0")
Why This Matters: By default, Linux bridges pass traffic through iptables. For intra-VPC communication, we want direct Layer 2 switching for performance—just like a real network switch.
Routing Setup
# Inside namespace: route everything through bridge gateway
run_command(f"ip netns exec {ns_name} ip route add default via {gateway_ip}")
This makes the bridge act as the default gateway—all traffic from namespaces flows through it.
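You can confirm the route took effect from the host (the namespace name follows the vpc-subnet convention used throughout this post):

sudo ip netns exec vpc1-public ip route show default
# Expect a single "default via <gateway IP>" entry pointing at the bridge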
Part 4: NAT Gateway - Internet Access
Private networks use RFC 1918 addresses (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) that aren't routable on the public internet. NAT (Network Address Translation) solves this.
NAT Implementation
# Enable IP forwarding (routing between interfaces)
run_command("sysctl -w net.ipv4.ip_forward=1")
# MASQUERADE: Replace source IP with host's public IP
run_command(
    f"iptables -t nat -A POSTROUTING -s {cidr} -o {interface} -j MASQUERADE"
)
# Allow forwarding from VPC to internet
run_command(
    f"iptables -A FORWARD -s {cidr} -i {bridge} -o {interface} -j ACCEPT"
)
# Allow return traffic
run_command(
    f"iptables -A FORWARD -d {cidr} -i {interface} -o {bridge} "
    f"-m state --state RELATED,ESTABLISHED -j ACCEPT"
)
How MASQUERADE Works
- Packet leaves the namespace with source IP 10.0.1.2
- Reaches the host via the bridge
- iptables MASQUERADE rewrites the source to the host's public IP
- The internet sees the request coming from the host, not the internal IP
- The response comes back and iptables rewrites the destination back to 10.0.1.2
- The packet is forwarded into the namespace
Key Insight: We only NAT public subnets. Private subnets remain isolated—they can reach other subnets within the VPC but not the internet.
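In code, the public-only decision can be as simple as skipping non-public subnets when installing the MASQUERADE rule. This is a hedged sketch; the field names mirror the state structure used elsewhere in this post rather than the exact vpcctl.py internals.

for subnet in vpc["subnets"]:
    if subnet["type"] != "public":
        continue  # private subnets get no MASQUERADE rule, hence no internet access
    run_command(
        f"iptables -t nat -A POSTROUTING -s {subnet['cidr']} "
        f"-o {interface} -j MASQUERADE"
    )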
Part 5: VPC Isolation & Peering
Default Isolation
Without configuration, VPCs can't communicate. Why?
- Each VPC has its own bridge
- No routes exist between bridges
- The iptables FORWARD chain has no rules permitting cross-VPC traffic (and on many hosts its default policy is DROP)
Testing Isolation
# From vpc1 namespace, try to reach vpc2
sudo ip netns exec vpc1-public ping 172.16.1.2
# Result: Network unreachable (no route to 172.16.0.0/16)
VPC Peering Implementation
The tricky part: packets from a namespace need to reach another VPC's bridge.
Solution:
# Create veth pair between bridges
run_command(f"ip link add {veth1} type veth peer name {veth2}")
run_command(f"ip link set {veth1} master {vpc1['bridge']}")
run_command(f"ip link set {veth2} master {vpc2['bridge']}")
# CRITICAL: Add routes in EACH namespace
for subnet in vpc1["subnets"]:
    ns_name = subnet["namespace"]
    run_command(
        f"ip netns exec {ns_name} ip route add {vpc2['cidr']} "
        f"via {vpc1['gateway']}"
    )
Why This Works
- Namespace sends packet to VPC2 CIDR
- Route points to its own gateway (bridge)
- Bridge forwards to peering veth pair
- Packet arrives at VPC2's bridge
- VPC2 bridge forwards to destination namespace
Common Mistake I Made: Initially, I added routes on the host routing table. This doesn't work because packets originate from inside namespaces, which have their own routing tables!
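You can verify from the host that the peering link really is attached on both sides:

ip link show master br-vpc1   # one end of the peering veth pair should appear here
ip link show master br-vpc2   # ...and the other end here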
Part 6: Firewall Rules (Security Groups)
AWS Security Groups are stateful firewalls. We simulate this with iptables inside namespaces.
JSON Policy Design
{
  "subnet": "10.0.1.0/24",
  "ingress": [
    {"port": 80, "protocol": "tcp", "action": "allow"},
    {"port": 22, "protocol": "tcp", "action": "deny"}
  ]
}
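Loading the policy is plain JSON parsing. A minimal sketch, where policy_path is a hypothetical variable holding the path passed via the --policy flag shown later:

import json

with open(policy_path) as f:
    policy = json.load(f)  # e.g. the contents of test-policy.json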
Implementation
# Essential: Allow established connections (stateful behavior)
run_command(
    f"ip netns exec {ns_name} iptables -A INPUT "
    f"-m state --state ESTABLISHED,RELATED -j ACCEPT"
)

# Apply custom rules
for rule in policy.get("ingress", []):
    port = rule["port"]
    protocol = rule["protocol"]
    action = "ACCEPT" if rule["action"] == "allow" else "DROP"
    run_command(
        f"ip netns exec {ns_name} iptables -A INPUT "
        f"-p {protocol} --dport {port} -j {action}"
    )
Why ESTABLISHED,RELATED Matters: Without this, responses to outbound connections would be blocked. The stateful rule tracks connections and allows return traffic automatically.
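To inspect what actually got applied inside a namespace:

sudo ip netns exec vpc1-public iptables -L INPUT -n --line-numbers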
The CLI Tool: vpcctl
I built a Python CLI tool to automate all VPC operations.
Installation
git clone https://github.com/cypher682/vpcctl-project.git
cd vpcctl-project
chmod +x vpcctl.py
Usage Examples
Create VPC:
sudo ./vpcctl.py vpc create --name vpc1 --cidr 10.0.0.0/16
Add Subnets:
# Public subnet
sudo ./vpcctl.py subnet add \
--vpc vpc1 \
--name public \
--cidr 10.0.1.0/24 \
--type public
# Private subnet
sudo ./vpcctl.py subnet add \
--vpc vpc1 \
--name private \
--cidr 10.0.2.0/24 \
--type private
Enable NAT Gateway:
sudo ./vpcctl.py nat enable --vpc vpc1 --interface eth0
Deploy Web Server:
sudo ./vpcctl.py deploy webserver \
--vpc vpc1 \
--subnet public \
--port 8080
Create VPC Peering:
sudo ./vpcctl.py peer create --vpc1 vpc1 --vpc2 vpc2
Apply Firewall Rules:
sudo ./vpcctl.py firewall apply \
--vpc vpc1 \
--subnet public \
--policy policy.json
List All VPCs:
sudo ./vpcctl.py vpc list
Delete VPC:
sudo ./vpcctl.py vpc delete --name vpc1
Testing & Validation
Test 1: Intra-VPC Communication
# Create VPC with two subnets
sudo ./vpcctl.py vpc create --name vpc1 --cidr 10.0.0.0/16
sudo ./vpcctl.py subnet add --vpc vpc1 --name public \
--cidr 10.0.1.0/24 --type public
sudo ./vpcctl.py subnet add --vpc vpc1 --name private \
--cidr 10.0.2.0/24 --type private
# Test connectivity
sudo ip netns exec vpc1-public ping -c 3 10.0.2.2
✅ Expected: Success (same VPC, bridge routes traffic)
What's Happening:
- Packet leaves public namespace (10.0.1.2)
- Goes through veth to bridge
- Bridge forwards to private subnet's veth
- Arrives at private namespace (10.0.2.2)
Test 2: NAT Gateway
# Enable NAT
sudo ./vpcctl.py nat enable --vpc vpc1 --interface eth0
# Test public subnet internet access
sudo ip netns exec vpc1-public ping -c 3 8.8.8.8
✅ Expected: Success
# Test private subnet (should fail)
sudo ip netns exec vpc1-private ping -c 2 8.8.8.8
✅ Expected: Timeout (no NAT rule for private subnet)
Verification:
# Check NAT rules
sudo iptables -t nat -L POSTROUTING -n -v
# Should show MASQUERADE rule for 10.0.1.0/24 only
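If you want to watch the translation happen, run tcpdump on the host's outbound interface while the ping is running (requires tcpdump; eth0 as above):

sudo tcpdump -ni eth0 icmp
# Echo requests should leave with the host's address as the source, not 10.0.1.2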
Test 3: VPC Isolation
# Create second VPC
sudo ./vpcctl.py vpc create --name vpc2 --cidr 172.16.0.0/16
sudo ./vpcctl.py subnet add --vpc vpc2 --name public \
--cidr 172.16.1.0/24 --type public
# Try to communicate (should fail)
sudo timeout 3 ip netns exec vpc1-public ping 172.16.1.2
✅ Expected: Network unreachable
Test 4: VPC Peering
# Create peering
sudo ./vpcctl.py peer create --vpc1 vpc1 --vpc2 vpc2
# Now communication should work
sudo ip netns exec vpc1-public ping -c 3 172.16.1.2
✅ Expected: Success
# Verify routes were added
sudo ip netns exec vpc1-public ip route
# Should show: 172.16.0.0/16 via 10.0.0.1
Test 5: Firewall Rules
# Deploy web server
sudo ./vpcctl.py deploy webserver --vpc vpc1 --subnet public --port 8080
# Test before firewall
curl http://10.0.1.2:8080
✅ Works
# Apply restrictive policy
sudo ./vpcctl.py firewall apply \
--vpc vpc1 \
--subnet public \
--policy test-policy.json
# Test allowed port
curl http://10.0.1.2:8080
✅ Still works (port 8080 is allowed in policy)
Complete Demo Script
Here's the full automated demo:
#!/bin/bash
# 1. Create VPC
sudo ./vpcctl.py vpc create --name vpc1 --cidr 10.0.0.0/16
# 2. Add subnets
sudo ./vpcctl.py subnet add --vpc vpc1 --name public \
--cidr 10.0.1.0/24 --type public
sudo ./vpcctl.py subnet add --vpc vpc1 --name private \
--cidr 10.0.2.0/24 --type private
# 3. Enable NAT
IFACE=$(ip route | grep default | awk '{print $5}' | head -1)
sudo ./vpcctl.py nat enable --vpc vpc1 --interface $IFACE
# 4. Deploy web servers
sudo ./vpcctl.py deploy webserver --vpc vpc1 --subnet public --port 8080
sudo ./vpcctl.py deploy webserver --vpc vpc1 --subnet private --port 8081
# 5. Test intra-VPC communication
sudo ip netns exec vpc1-public ping -c 3 10.0.2.2
# 6. Test NAT gateway
sudo ip netns exec vpc1-public ping -c 3 8.8.8.8
sudo timeout 3 ip netns exec vpc1-private ping -c 2 8.8.8.8
# 7. Create second VPC
sudo ./vpcctl.py vpc create --name vpc2 --cidr 172.16.0.0/16
sudo ./vpcctl.py subnet add --vpc vpc2 --name public \
--cidr 172.16.1.0/24 --type public
sudo ./vpcctl.py nat enable --vpc vpc2 --interface $IFACE
# 8. Test VPC isolation
sudo timeout 3 ip netns exec vpc1-public ping -c 2 172.16.1.2
# 9. Create VPC peering
sudo ./vpcctl.py peer create --vpc1 vpc1 --vpc2 vpc2
# 10. Test after peering
sudo ip netns exec vpc1-public ping -c 3 172.16.1.2
# 11. Apply firewall rules
sudo ./vpcctl.py firewall apply --vpc vpc1 --subnet public \
--policy test-policy.json
# 12. View logs
tail -30 /var/lib/vpcctl/vpcctl.log
# 13. List resources
sudo ./vpcctl.py vpc list
# 14. Cleanup
sudo ./vpcctl.py vpc delete --name vpc1
sudo ./vpcctl.py vpc delete --name vpc2
Cleanup Process
Proper cleanup is critical. Orphaned namespaces and iptables rules can cause issues.
# Delete VPC (automated cleanup)
sudo ./vpcctl.py vpc delete --name vpc1
What happens internally:
- Kill all processes in namespaces
- Delete namespaces
- Remove veth pairs
- Delete iptables rules
- Remove bridge
- Clean up state file
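Here is a hedged sketch of that teardown order. Field names such as nat_interface and the state dictionary are assumptions for illustration; the real vpcctl.py may structure this differently.

def delete_vpc(vpc, state):
    for subnet in vpc["subnets"]:
        ns = subnet["namespace"]
        run_command(f"ip netns pids {ns} | xargs -r kill")  # stop anything still running inside
        run_command(f"ip netns delete {ns}")  # deleting the namespace also destroys its veth pair
        if subnet["type"] == "public":
            # assumes NAT was enabled for this subnet; otherwise skip the rule deletion
            run_command(
                f"iptables -t nat -D POSTROUTING -s {subnet['cidr']} "
                f"-o {vpc['nat_interface']} -j MASQUERADE"
            )
    run_command(f"ip link delete {vpc['bridge']}")  # removes the bridge and detaches any ports
    state.pop(vpc["name"], None)  # finally, drop the VPC from the persisted state file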
Emergency Cleanup
sudo ./cleanup.sh
# Removes ALL network resources created by vpcctl
Verification
# Should be empty
ip netns list
ip link show type bridge | grep br-
# Check iptables
sudo iptables -L FORWARD -n
sudo iptables -t nat -L POSTROUTING -n
Challenges & Solutions
Challenge 1: VPC Peering Not Working
Problem: Routes added to host routing table, but packets originate from namespaces.
Solution: Add routes inside each namespace pointing to their respective gateways.
# Wrong approach
run_command(f"ip route add {vpc2['cidr']} via {vpc2_peer_ip}")
# Correct approach
for subnet in vpc1["subnets"]:
    ns_name = subnet["namespace"]
    run_command(
        f"ip netns exec {ns_name} ip route add {vpc2['cidr']} "
        f"via {vpc1['gateway']}"
    )
Challenge 2: Firewall Blocking Everything
Problem: Applied rules but forgot to allow established connections.
Solution: Always add stateful rules first:
run_command(
    f"ip netns exec {ns_name} iptables -A INPUT "
    f"-m state --state ESTABLISHED,RELATED -j ACCEPT"
)
Challenge 3: Bridge Netfilter Interference
Problem: iptables was processing bridge traffic, causing performance issues.
Solution: Disable bridge netfilter:
sysctl -w net.bridge.bridge-nf-call-iptables=0
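If you want the setting to survive reboots, a sysctl drop-in file does the job (note the key only exists while the br_netfilter module is loaded):

echo 'net.bridge.bridge-nf-call-iptables = 0' | sudo tee /etc/sysctl.d/99-vpcctl.conf
sudo sysctl --system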
Key Takeaways
- Network Namespaces provide true isolation—the foundation of containers
- veth Pairs are the glue connecting isolated environments
- Bridges act as virtual switches for Layer 2 forwarding
- iptables is incredibly powerful for NAT, routing, and firewalling
- Proper cleanup is essential for infrastructure automation
Real-World Applications
- Container Networking: Docker/Kubernetes use the same primitives
- Multi-tenant Systems: Isolate customer workloads
- Cloud Provider Internals: Understanding how AWS VPC really works
- Security: Network segmentation and isolation strategies
Architecture Deep Dive
Let me explain how packets flow through the system:
Scenario 1: Public Subnet → Internet
[Namespace] 10.0.1.2
↓ (veth pair)
[Bridge] br-vpc1 (10.0.0.1)
↓ (routing decision)
[iptables NAT] MASQUERADE (rewrites source IP)
↓
[eth0] → Internet
Scenario 2: Namespace → Namespace (Same VPC)
[Namespace A] 10.0.1.2
↓ (veth pair)
[Bridge] br-vpc1 (L2 switching)
↓ (veth pair)
[Namespace B] 10.0.2.2
Scenario 3: VPC Peering
[VPC1 Namespace] 10.0.1.2
↓ (veth pair)
[VPC1 Bridge] br-vpc1
↓ (peering veth pair)
[VPC2 Bridge] br-vpc2
↓ (veth pair)
[VPC2 Namespace] 172.16.1.2
Performance Considerations
Bridge vs Router
Using bridges instead of routing gives us:
- Lower latency - Layer 2 switching is faster than Layer 3 routing
- Higher throughput - No routing table lookups for intra-VPC traffic
- Simpler configuration - Bridges handle MAC learning automatically
Namespace Overhead
Network namespaces are lightweight:
- Only a small, fixed amount of kernel memory per namespace
- Negligible CPU overhead for namespace switching
- Near-native performance for network operations
Scalability Limits
On a typical Linux system:
- ~100,000+ namespaces possible
- Limited by file descriptors and memory
- iptables rules become the bottleneck at scale (~10,000+ rules)
Security Considerations
Isolation Guarantees
Network namespaces provide:
- Complete network stack isolation
- Separate iptables rules
- Independent routing tables
- Process isolation (can't see other namespace processes)
Attack Surface
Potential security concerns:
- Host compromise affects all VPCs
- Bridge vulnerabilities (MAC flooding, ARP spoofing)
- iptables misconfigurations can leak traffic
Best Practices
- Least privilege - Only enable NAT for public subnets
- Default deny - Block all traffic, then allow specific flows
- Audit logging - Log all iptables rules and changes
- Regular cleanup - Remove unused resources
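As a concrete example of the default-deny posture, inside a subnet namespace you can set the policy first and then open only what you need:

sudo ip netns exec vpc1-public iptables -P INPUT DROP
sudo ip netns exec vpc1-public iptables -A INPUT -i lo -j ACCEPT
sudo ip netns exec vpc1-public iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
sudo ip netns exec vpc1-public iptables -A INPUT -p tcp --dport 80 -j ACCEPT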
Future Enhancements
What's Missing?
Compared to AWS VPC, this implementation lacks:
- IPv6 Support
  - Current: IPv4 only
  - Enhancement: Dual-stack networking
- DNS Server
  - Current: No internal DNS
  - Enhancement: dnsmasq in each VPC
- DHCP
  - Current: Static IP assignment
  - Enhancement: Dynamic IP allocation
- Network ACLs
  - Current: Security groups only
  - Enhancement: Subnet-level firewalls
- VPC Flow Logs
  - Current: Basic logging
  - Enhancement: Detailed traffic logging
- Elastic IPs
  - Current: No persistent public IPs
  - Enhancement: Static IP mapping
Implementation Ideas
DNS Server:
def add_dns_server(vpc_name):
    # Start dnsmasq in namespace
    run_command(
        f"ip netns exec {vpc_name}-dns "
        f"dnsmasq --interface=lo --bind-interfaces "
        f"--listen-address=10.0.0.2"
    )
DHCP Server:
import ipaddress

def add_dhcp_server(vpc_name, cidr):
    # Configure dnsmasq for DHCP: derive a lease range from the subnet CIDR
    hosts = list(ipaddress.ip_network(cidr).hosts())
    start_ip, end_ip = hosts[9], hosts[-1]  # e.g. .10 through .254 for a /24
    run_command(
        f"ip netns exec {vpc_name}-dhcp "
        f"dnsmasq --dhcp-range={start_ip},{end_ip},12h"
    )
Comparison with Real Cloud VPCs
| Feature | My Implementation | AWS VPC |
|---|---|---|
| Isolation | ✅ Network namespaces | ✅ Hypervisor-level |
| Subnets | ✅ Multiple per VPC | ✅ Multiple per VPC |
| NAT Gateway | ✅ iptables MASQUERADE | ✅ Managed NAT service |
| VPC Peering | ✅ veth pairs | ✅ Software-defined networking |
| Security Groups | ✅ iptables rules | ✅ Stateful firewall |
| Network ACLs | ❌ Not implemented | ✅ Subnet-level firewall |
| DNS | ❌ Not implemented | ✅ Route 53 integration |
| DHCP | ❌ Static IPs only | ✅ DHCP options |
| Flow Logs | ⚠️ Basic logging | ✅ Detailed flow logs |
| IPv6 | ❌ IPv4 only | ✅ Dual-stack |
| Multi-region | ❌ Single host | ✅ Global infrastructure |
| HA/Redundancy | ❌ Single point of failure | ✅ Multi-AZ redundancy |
Learning Resources
Books
- Linux Networking Cookbook by Carla Schroder
- TCP/IP Illustrated by W. Richard Stevens
- Linux Kernel Networking by Rami Rosen
Online Courses
- Linux Foundation: Linux Networking and Administration
- Pluralsight: Linux Networking Fundamentals
- Udemy: Linux Networking Masterclass
Project Repository
GitHub: https://github.com/cypher682/vpcctl-project
Video Demo: [Your video link here]
Repository Structure
vpcctl-project/
├── README.md # Complete documentation
├── vpcctl # Main CLI tool
├── demo.sh # Automated demo
├── cleanup.sh # Emergency cleanup
├── docs/
│ ├── architecture-diagram.png
Conclusion
Building a VPC from scratch taught me more about networking than reading documentation ever could. Understanding these primitives—namespaces, bridges, veth pairs, and iptables—gives you superpowers when debugging container networking, understanding cloud provider internals, or designing multi-tenant systems.
Key Lessons:
- Linux networking is powerful - You don't need specialized tools for complex networking
- Cloud abstractions are implementations - AWS VPC is just well-packaged Linux networking
- Isolation is achievable - Network namespaces provide true isolation
- Automation is essential - Infrastructure as code makes everything reproducible
Got questions? Drop them in the comments below! 👇
Found this useful? Star the repo and share with fellow DevOps engineers! ⭐
- GitHub: https://github.com/cypher682