Executive Summary
TL;DR: Traditional hub-and-spoke cloud networking often creates bottlenecks and increases the blast radius of misconfigurations as organizations scale. The article explores alternatives like direct VPC peering for specific high-bandwidth needs, modern hub-and-spoke with AWS Transit Gateway for scalable solutions, and multi-hub or Cloud WAN for global enterprises to overcome these limitations.
Key Takeaways
- Classic hub-and-spoke architectures can become central bottlenecks, increase the blast radius of misconfigurations, and introduce bureaucratic delays for network changes.
- Direct VPC peering is a quick, low-latency solution for a small number of VPCs (2-4) with high-traffic requirements, but its overuse leads to unmanageable "spaghetti networking".
- AWS Transit Gateway (TGW) offers a modern, scalable hub-and-spoke solution, eliminating bottlenecks, providing granular routing, and efficiently handling spoke-to-spoke communication.
- For large, multi-region enterprises, advanced topologies like multi-hub TGW deployments or AWS Cloud WAN are necessary to create a global backbone and segment networks effectively.
Hub-and-spoke isn't the only answer for cloud networking. We'll explore why teams stray from this classic model and look at practical alternatives like direct peering and multi-hub designs for when things get complicated.
So, You're Thinking of Ditching Hub-and-Spoke? A Reality Check.
I still remember the 2 AM page. It was one of those cryptic "database unreachable" alerts from the prod-analytics cluster. The on-call engineer, a sharp but still junior guy, was completely stumped. The app servers were up, the database prod-db-01 was healthy, but nothing could connect. After 45 minutes of frantic digging, we found it: someone on the central platform team had pushed a "minor" firewall rule update to the hub VPC. It was supposed to lock down a test environment, but a typo in the CIDR range blackholed all traffic from the analytics spoke to the shared services spoke where the DB lived. An entire production system was down because of a single, seemingly unrelated change. That night, I understood why every engineer eventually asks, "Do we *really* need this hub-and-spoke thing?"
The "Why": What's Wrong With The Textbook Answer?
Let's be clear: hub-and-spoke is the default for a reason. It centralizes security, simplifies DNS, and gives you a single place to manage egress and shared services. When you're small, it's perfect. But as you grow, this beautiful, clean diagram starts to develop some nasty habits:
- The Central Bottleneck: Every single packet that needs to go from one spoke to another has to travel into the hub and back out. This can introduce latency and, more importantly, create a massive traffic chokepoint.
- The Blast Radius of Doom: Like my war story, a single misconfiguration in the hub VPC (a bad route, a fat-fingered firewall rule) can take down every single application connected to it.
- The Bureaucracy Chokehold: The "hub" is often owned by a central "Platform" or "Network" team. Need to open a port between your app and a new data service? Get in line and fill out a ticket. Innovation slows to a crawl.
So when a developer comes to you saying "I just need my two VPCs to talk," they're not being difficult. They're trying to escape the chokehold. The good news is, you have options.
Solution 1: The Quick Fix - Direct VPC Peering
This is the "I need it working yesterday" solution. A team has prod-webapp-vpc and prod-ml-training-vpc, and they need to share a high-bandwidth connection without going through the central hub. Instead of waiting for the network team, you just connect them directly.
It's simple, fast, and effective for a point-to-point problem. You create a peering connection, and both sides accept. Then you just update the route tables in each VPC to point to the other's CIDR block via the peering connection.
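Those steps can be sketched in Terraform roughly as follows. This is a minimal sketch, not a definitive setup: it assumes both VPCs sit in the same account and region (so the request can be auto-accepted), and every variable name is a hypothetical placeholder.

```hcl
# Sketch only: all variable names below are hypothetical placeholders.
variable "webapp_vpc_id" { type = string }
variable "ml_vpc_id" { type = string }
variable "webapp_cidr" { type = string } # e.g. "10.20.0.0/16"
variable "ml_cidr" { type = string }     # e.g. "10.30.0.0/16"
variable "webapp_route_table_id" { type = string }
variable "ml_route_table_id" { type = string }

# Request the peering connection. auto_accept only works when both
# VPCs are in the same account and region.
resource "aws_vpc_peering_connection" "webapp_to_ml" {
  vpc_id      = var.webapp_vpc_id
  peer_vpc_id = var.ml_vpc_id
  auto_accept = true

  tags = {
    Name = "peer-prod-webapp-to-prod-ml-training"
  }
}

# Each side needs a route to the other's CIDR via the peering connection.
resource "aws_route" "webapp_to_ml" {
  route_table_id            = var.webapp_route_table_id
  destination_cidr_block    = var.ml_cidr
  vpc_peering_connection_id = aws_vpc_peering_connection.webapp_to_ml.id
}

resource "aws_route" "ml_to_webapp" {
  route_table_id            = var.ml_route_table_id
  destination_cidr_block    = var.webapp_cidr
  vpc_peering_connection_id = aws_vpc_peering_connection.webapp_to_ml.id
}
```

The two CIDR blocks must not overlap, and security groups on both sides still have to allow the traffic. For a cross-account or cross-region peer, the accepting side would use an `aws_vpc_peering_connection_accepter` resource instead of `auto_accept`.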
Warning: This is the path to "spaghetti networking." If you do this for two or three VPCs, it's manageable. If you start peering every VPC with every other VPC, you create an unmanageable mesh of connections with no central control. Use this surgically, not as a default strategy.
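To put a rough number on that mesh: VPC peering is not transitive, so every pair of VPCs that needs to communicate requires its own peering connection. For a full mesh of $n$ VPCs:

$$
\text{connections} = \binom{n}{2} = \frac{n(n-1)}{2}
$$

So 4 VPCs need 6 connections, 10 VPCs need 45, and 50 VPCs need 1,225. On top of that, each VPC needs routes to its $n-1$ peers in every relevant route table.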
When to use it:
For a small number of VPCs (think 2-4) that have a very high-traffic, low-latency requirement between them and don't need access to many other shared services.
Solution 2: The Permanent Fix - The Modern Hub & Spoke with Transit Gateway
Most of the problems people have with "hub-and-spoke" are actually problems with the *old* way of doing it: using a regular VPC with a software firewall/router as the hub. The modern solution is a dedicated managed service like AWS Transit Gateway (TGW).
A TGW isn't a VPC; it's a regional network router. You attach all your VPCs (spokes) to it. It solves the classic problems:
- It's not a bottleneck: It scales to handle terabits per second of traffic.
- Granular Routing: You can create separate TGW route tables. This means the dev spokes can all talk to each other in their own little sandbox, but they can't touch the prod spokes. This contains the blast radius.
- East-West Traffic: It handles spoke-to-spoke communication natively without traffic having to "hairpin" through a single EC2 instance in the hub.
Here's a taste of how simple a Terraform config for this can be:
```hcl
# main.tf - Assuming the TGW is already created
resource "aws_ec2_transit_gateway_vpc_attachment" "webapp_attachment" {
  provider           = aws.us-east-1
  transit_gateway_id = var.transit_gateway_id
  vpc_id             = var.prod_webapp_vpc_id
  subnet_ids         = var.prod_webapp_private_subnets

  tags = {
    Name = "tgw-attach-prod-webapp"
  }
}

# You also need to add a route in your VPC's route table
resource "aws_route" "to_shared_services" {
  route_table_id         = var.prod_webapp_private_route_table_id
  destination_cidr_block = "10.10.0.0/16" # CIDR of the shared services VPC
  transit_gateway_id     = var.transit_gateway_id
}
```
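The dev/prod isolation described under Granular Routing can be sketched in the same style. This is a minimal sketch under stated assumptions: the variable names are hypothetical, and the dev attachments are assumed to have been created with the TGW's default route table association and propagation disabled.

```hcl
# Sketch only: variable names are hypothetical placeholders.
variable "transit_gateway_id" { type = string }

variable "dev_attachment_ids" {
  type        = map(string)
  description = "TGW VPC attachment IDs for the dev spokes, keyed by a static name"
}

# A dedicated TGW route table for the dev environment.
resource "aws_ec2_transit_gateway_route_table" "dev" {
  transit_gateway_id = var.transit_gateway_id

  tags = {
    Name = "tgw-rt-dev"
  }
}

# Associate: traffic arriving from each dev spoke is routed using this table.
resource "aws_ec2_transit_gateway_route_table_association" "dev" {
  for_each                       = var.dev_attachment_ids
  transit_gateway_attachment_id  = each.value
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.dev.id
}

# Propagate: each dev spoke advertises its CIDR into this table only,
# so dev spokes learn routes to each other but not to prod.
resource "aws_ec2_transit_gateway_route_table_propagation" "dev" {
  for_each                       = var.dev_attachment_ids
  transit_gateway_attachment_id  = each.value
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.dev.id
}
```

A matching prod route table would follow the same pattern. Because neither environment propagates into the other's table, the TGW has no route between them. On the attachments themselves, `transit_gateway_default_route_table_association` and `transit_gateway_default_route_table_propagation` would be set to `false` so they don't also land in the default table.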
Solution 3: The "Nuclear" Option - A Multi-Hub or Full Mesh Reality
Sometimes, even a single TGW isn't enough. If you're a global company, you don't want your traffic from Sydney to hairpin through a TGW in Virginia just to talk to a VPC in the same Sydney region. This is where you graduate to more advanced topologies.
Multi-Hub (Region-to-Region): You deploy a Transit Gateway in each major region (e.g., us-east-1, eu-west-1, ap-southeast-2). Then, you peer these TGWs together. This creates a global backbone for your company. Traffic stays within a region for local communication but can traverse the peering connection for cross-region needs efficiently.
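A rough Terraform sketch of peering two regional TGWs might look like the following. It assumes both TGWs are in the same account, and the variable names and example CIDR are hypothetical placeholders.

```hcl
# Sketch only: variable names are hypothetical placeholders.
provider "aws" {
  alias  = "us-east-1"
  region = "us-east-1"
}

provider "aws" {
  alias  = "ap-southeast-2"
  region = "ap-southeast-2"
}

variable "tgw_use1_id" { type = string }
variable "tgw_apse2_id" { type = string }
variable "tgw_use1_route_table_id" { type = string }
variable "sydney_cidr" { type = string } # e.g. "10.64.0.0/12"

# Request peering from the Virginia TGW to the Sydney TGW.
resource "aws_ec2_transit_gateway_peering_attachment" "use1_to_apse2" {
  provider                = aws.us-east-1
  transit_gateway_id      = var.tgw_use1_id
  peer_transit_gateway_id = var.tgw_apse2_id
  peer_region             = "ap-southeast-2"

  tags = {
    Name = "tgw-peer-use1-apse2"
  }
}

# Accept the request on the Sydney side.
resource "aws_ec2_transit_gateway_peering_attachment_accepter" "apse2" {
  provider                      = aws.ap-southeast-2
  transit_gateway_attachment_id = aws_ec2_transit_gateway_peering_attachment.use1_to_apse2.id
}

# Peering attachments do not propagate routes, so add a static route
# in Virginia that sends Sydney-bound traffic over the peering.
resource "aws_ec2_transit_gateway_route" "use1_to_sydney" {
  provider                       = aws.us-east-1
  destination_cidr_block         = var.sydney_cidr
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_peering_attachment.use1_to_apse2.id
  transit_gateway_route_table_id = var.tgw_use1_route_table_id

  depends_on = [aws_ec2_transit_gateway_peering_attachment_accepter.apse2]
}
```

A mirror-image static route is needed in the Sydney TGW's route table for return traffic. Planning non-overlapping, summarizable CIDR ranges per region keeps these static routes to a handful.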
Cloud WAN / Full Mesh: For the top 1% of complexity, services like AWS Cloud WAN build a core network for you and allow you to define segments. You can create a "prod" segment and a "dev" segment globally. All VPCs attached to the "prod" segment can talk to each other (a logical mesh), but they are isolated from "dev". This is powerful but requires a dedicated networking team to manage policy, routing, and costs.
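As an illustration of what defining segments looks like, here is a minimal sketch of a Cloud WAN policy using the Terraform AWS provider's policy document data source. The segment names come from the text above; the ASN range, regions, and the `segment` tag key are illustrative assumptions, not recommendations.

```hcl
# Sketch only: ASN range, edge locations, and tag key are illustrative.
data "aws_networkmanager_core_network_policy_document" "segments" {
  core_network_configuration {
    asn_ranges = ["64512-65534"]

    edge_locations {
      location = "us-east-1"
    }

    edge_locations {
      location = "ap-southeast-2"
    }
  }

  # Segments are isolated from each other unless explicitly shared.
  segments {
    name = "prod"
  }

  segments {
    name = "dev"
  }

  # Attachments tagged segment=prod are placed in the prod segment.
  attachment_policies {
    rule_number     = 100
    condition_logic = "or"

    conditions {
      type     = "tag-value"
      operator = "equals"
      key      = "segment"
      value    = "prod"
    }

    action {
      association_method = "constant"
      segment            = "prod"
    }
  }

  # Attachments tagged segment=dev are placed in the dev segment.
  attachment_policies {
    rule_number     = 200
    condition_logic = "or"

    conditions {
      type     = "tag-value"
      operator = "equals"
      key      = "segment"
      value    = "dev"
    }

    action {
      association_method = "constant"
      segment            = "dev"
    }
  }
}
```

The rendered JSON (the data source's `json` attribute) would then be applied to a core network. With tag-based rules like these, a VPC's segment follows from how its attachment is tagged, so no per-VPC routing entries are needed.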
Which path is right? A quick comparison:
| Approach | Complexity | Cost | Best For |
|---|---|---|---|
| Direct Peering | Low | Low (Data transfer costs only) | Quickly linking 2-3 specific VPCs. |
| Modern Hub & Spoke (TGW) | Medium | Medium (TGW attachment + processing fees) | Most startups and mid-size companies. The 90% solution. |
| Multi-Hub / Cloud WAN | High | High | Large, multi-region, global enterprises. |
At the end of the day, there's no "one true way." The textbook hub-and-spoke model is a starting point, not a destination. Don't be afraid to deviate when reality calls for it. Just make sure you're choosing a path to solve a real problem, not just creating a more complicated one for your future self to untangle at 2 AM.
Read the original article on TechResolve.blog