João Godinho

Posted on Mar 9

Cloud Network Components for Availability and Security

#cloud #distributedsystems #infrastructure #aws

Introduction

In this article, I will discuss the main network components for achieving availability and security in the AWS cloud environment. These concepts are generic and can be applied to other cloud infrastructures or even to system design in general, since availability and security depend on factors such as redundancy and multi-region or multi-datacenter deployment.

Insights

Talking about network components for availability and security in cloud environments is important not only because most companies use cloud infrastructure today, but also because cloud providers follow strong best practices. They have highly experienced professionals and operate at massive scale. Understanding the level of separation and the components they provide is important to learn how to achieve availability, reliability, performance, and security for our services.

Multi Region

Each region is a separate geographic area.
Each region contains its own set of Availability Zones, each with its own datacenters. Deploying across regions reduces latency by serving users from the servers closest to them, and also improves reliability and availability through redundancy and fault isolation.

Availability Zone (AZ)

Isolated locations within each Region.
Inside a region, an AZ is one or more independent datacenters. If one AZ fails and your service is deployed across multiple AZs, it keeps working from the remaining ones.

AWS Region and Availability Zones - Image Reference: AWS Documentation

Virtual Private Cloud (VPC)

A logically isolated virtual network inside AWS, similar to a traditional data center network.
- When you first create your AWS account, a default VPC is created as a logically isolated network where your AWS resources (e.g. EC2 instances, subnets, and more) reside.
You define the IP range with CIDR notation (e.g. IPv4 10.0.0.0/16; IPv6 is also supported).
- The subnets can use ranges within the VPC CIDR block range.
Contains subnets that divide the network into smaller ranges where resources are placed.
Uses route tables associated with subnets to define where traffic is sent based on the destination IP.
Can connect to the internet (Internet Gateway), other VPCs (VPC peering), or on-premises networks (VPN or Direct Connect).
- A subnet uses the Internet Gateway only if its associated route table has a 0.0.0.0/0 → IGW route.
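
The relationship between a VPC CIDR block and its subnet ranges can be sketched with Python's standard `ipaddress` module. This is only an illustration: the 10.0.0.0/16 range comes from the example above, while the choice of /24 subnets and the 192.168.1.0/24 counterexample are assumptions made for the demo.

```python
import ipaddress

# VPC range taken from the example above.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC range into /24 blocks and keep the first three,
# for instance one subnet per Availability Zone (an assumption for this demo).
subnets = list(vpc.subnets(new_prefix=24))[:3]

for subnet in subnets:
    # Each subnet range must sit inside the VPC CIDR block.
    print(subnet, "inside VPC:", subnet.subnet_of(vpc))

# A range outside the VPC CIDR block cannot be used as a subnet of this VPC.
outside = ipaddress.ip_network("192.168.1.0/24")
print(outside, "inside VPC:", outside.subnet_of(vpc))
```

The same check is a cheap way to validate a subnet plan before creating anything in the cloud.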

Subnets

A range of IP addresses in your VPC. Each subnet resides in exactly one Availability Zone (one AZ can contain multiple subnets).

Subnet types

Public: Has a route to an Internet Gateway, enabling resources to access the internet.
- Where your frontend website would reside.
Private: No route to an Internet Gateway; requires a NAT device for internet access.
- Where your backend server would be.
- The NAT device lives in a public subnet.
Isolated: No routes to destinations outside the VPC; resources can only communicate within the VPC.
- Where your database would be.
VPN-only: Route to a Site-to-Site VPN via a Virtual Private Gateway; no route to an Internet Gateway.
- Normally used to connect on-premises services with cloud infrastructure.

VPC and Subnet IP Ranges

When you create a subnet, you specify its IP address range. Depending on the configuration of the VPC, it can be IPv4 only, IPv6 only, or dual-stack (both together).
Dual-stack (IPv4 and IPv6) is highly recommended: some users can still only connect via IPv4, so an IPv6-only setup would exclude them.
Important when choosing VPC and subnet CIDR block ranges: AWS reserves 5 IP addresses in every subnet (the first 4 and the last 1) for internal use.

VPC CIDR block
- AWS minimum: /28 (16 IPs)
- Practical minimum: /22 (1,024 IPs) to allow multiple subnets across AZs.
- Recommended: /16 (65,536 IPs) to leave room for growth.
- The main VPC CIDR cannot be changed later (only new blocks can be added), so choosing a larger range is safer and avoids running out of addresses in production later.
Subnet CIDR block
- AWS minimum: /28 (16 IPs, 11 usable because AWS reserves 5).
- Small workloads: /27 (32 IPs, 27 usable).
- If using load balancers: /26 (64 IPs, 59 usable).
- Recommended: /24 (256 IPs, 251 usable) for most cases.
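
The usable-address numbers above follow directly from the block size minus the 5 reserved addresses. A minimal sketch of that arithmetic, using Python's standard `ipaddress` module (the function names and the 10.0.1.0 example ranges are hypothetical, chosen only for illustration):

```python
import ipaddress

# AWS reserves the first 4 addresses and the last 1 in every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_ips(cidr: str) -> int:
    """Number of addresses left for your resources in a subnet."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - AWS_RESERVED_PER_SUBNET


def reserved_addresses(cidr: str) -> list:
    """The 5 addresses you cannot assign: first 4 and last 1."""
    network = ipaddress.ip_network(cidr)
    return [network[i] for i in range(4)] + [network[-1]]


for cidr in ("10.0.1.0/28", "10.0.1.0/27", "10.0.1.0/26", "10.0.1.0/24"):
    print(cidr, "usable:", usable_ips(cidr))

print([str(ip) for ip in reserved_addresses("10.0.1.0/24")])
```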

Route Tables (IP Routing - Layer 3 - Network)

Route tables exist within a VPC and are associated with subnets to control where network traffic is routed.
A route table contains a set of rules called "routes" that specify where network traffic should go based on its destination IP address. Each route consists of:
- Destination: The IP address range (CIDR block) the traffic is going to
- Target: The gateway, network interface, or connection through which to send the traffic (like an internet gateway, NAT gateway, or VPC peering connection).
We can also create custom route tables to:
- Define specific routing rules for different subnets
- Create public subnets (with routes to internet gateways)
- Create private subnets (with routes to NAT gateways)
- Set up VPN-only or isolated subnets
Relationship between route tables and subnets:
- One-to-One: Each subnet can only be associated with one route table at a time
- One-to-Many: A single route table can be associated with multiple subnets
- Default Association: Any subnet not explicitly associated with a custom route table automatically uses the main route table from the VPC.

VPC with route tables - Image Reference: AWS Documentation
Main route table = the default route table of a VPC, automatically used by any subnet not explicitly associated with another route table.
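
To make the destination/target idea concrete, here is a small simulation of a route lookup. It is a sketch, not AWS code: the route table contents and the `igw-example` target name are hypothetical, and it relies on the fact that when several routes match, the most specific one (longest prefix) is used.

```python
import ipaddress

# Hypothetical route table of a public subnet: (destination CIDR, target).
PUBLIC_ROUTES = [
    ("10.0.0.0/16", "local"),       # traffic that stays inside the VPC
    ("0.0.0.0/0", "igw-example"),   # everything else goes to the Internet Gateway
]


def resolve_target(routes, destination_ip):
    """Return the target of the most specific route matching the destination."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the traffic cannot leave the subnet
    # The longest prefix is the most specific match.
    return max(matches)[1]


print(resolve_target(PUBLIC_ROUTES, "10.0.3.7"))   # stays in the VPC
print(resolve_target(PUBLIC_ROUTES, "8.8.8.8"))    # goes out through the gateway
```

An isolated subnet would simply have only the `local` route, so any destination outside the VPC resolves to no target at all.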

Security for Subnets and VPC

Both security groups and network ACLs are security layers that reside within a specific VPC, but they work at different levels and behave differently.
Example flow: Entry point → VPC → Subnet → NACL → Security Group → Instance
- Entry point can be: Internet Gateway, Virtual Private Gateway, etc…
- Important: NAT Gateway enables outbound internet access for private subnets and allows return traffic for those outbound connections, but doesn't serve as an entry point for new inbound connections from the internet.

Compare security groups and network ACLs - Image Reference: AWS Documentation

Security Groups

Primary layer of security in the VPC.
Applied to specific compute service instances (EC2, load balancers, RDS databases, etc.)
Stateful - automatically allows return traffic without needing explicit rules
- if inbound traffic is allowed, the response traffic is automatically allowed
- if you send an outbound request from your instance, the response traffic is automatically allowed
Evaluates all rules before deciding whether to allow traffic
Supports allow rules only (implicit deny for everything else)
Can reference other security groups instead of IP addresses (allow traffic from instances using the selected security group)
What can be configured?
- Inbound and outbound rules
- Protocol (TCP, UDP, ICMP)
- Port ranges
- Source/destination (IP addresses, CIDR blocks, or other security groups)
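
The "allow rules only, implicit deny" behavior can be sketched as a tiny rule evaluator. This is a simplified model, not how AWS implements it: the `AllowRule` class, the `sg-alb` identifier, and the sample addresses are all hypothetical, and statefulness (automatic return traffic) is not modeled.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    protocol: str   # "tcp", "udp" or "icmp"
    from_port: int
    to_port: int
    source: str     # a CIDR block, or a security group id such as "sg-alb"


def is_allowed(rules, protocol, port, source_ip, source_groups=()):
    """Traffic passes if ANY allow rule matches; otherwise it is denied."""
    for rule in rules:
        if rule.protocol != protocol:
            continue
        if not rule.from_port <= port <= rule.to_port:
            continue
        if rule.source.startswith("sg-"):
            # Rule references another security group instead of an IP range.
            if rule.source in source_groups:
                return True
        elif ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule.source):
            return True
    return False  # implicit deny: nothing matched


# Hypothetical inbound rules for a backend instance.
backend_rules = [
    AllowRule("tcp", 8080, 8080, "sg-alb"),        # only the load balancer's group
    AllowRule("tcp", 22, 22, "203.0.113.0/24"),    # SSH from an office range
]

print(is_allowed(backend_rules, "tcp", 8080, "10.0.1.5", source_groups=("sg-alb",)))
print(is_allowed(backend_rules, "tcp", 8080, "198.51.100.9"))
```

Because there are no deny rules, rule order does not matter here: one matching allow rule is enough.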

Network ACL

Additional optional layer of security
Applied to entire subnets, not to individual instances.
Stateless - requires explicit rules for both inbound and outbound traffic (return traffic must be explicitly allowed)
Processes rules in numerical order and applies the first matched rule
Supports both allow and deny rules
What can be configured?
- Inbound and outbound rules with rule numbers
- Protocol (TCP, UDP, ICMP, or all)
- Port ranges
- Source/destination CIDR blocks
- Allow or deny actions
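
In contrast with security groups, rule order matters for a network ACL. The sketch below models the "numerical order, first match wins" evaluation for one direction of traffic. The `AclRule` class, the rule numbers, and the addresses are hypothetical examples, not values from AWS.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    number: int
    protocol: str   # "tcp", "udp", "icmp" or "all"
    from_port: int
    to_port: int
    cidr: str
    action: str     # "allow" or "deny"


def evaluate(rules, protocol, port, ip_address):
    """Check rules in ascending rule number and apply the first match."""
    ip = ipaddress.ip_address(ip_address)
    for rule in sorted(rules, key=lambda r: r.number):
        protocol_matches = rule.protocol in ("all", protocol)
        port_matches = rule.from_port <= port <= rule.to_port
        if protocol_matches and port_matches and ip in ipaddress.ip_network(rule.cidr):
            return rule.action
    return "deny"  # nothing matched: traffic is denied


# Hypothetical inbound rules: block one range, then allow HTTPS from anywhere.
inbound = [
    AclRule(200, "tcp", 443, 443, "0.0.0.0/0", "allow"),
    AclRule(100, "tcp", 443, 443, "198.51.100.0/24", "deny"),
]

print(evaluate(inbound, "tcp", 443, "198.51.100.7"))  # rule 100 matches first
print(evaluate(inbound, "tcp", 443, "203.0.113.9"))   # falls through to rule 200
```

Because NACLs are stateless, a real setup would need a second, outbound rule list evaluated the same way to let the responses leave the subnet.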

AWS Configured Network - Image Reference: AWS Documentation

Route 53 (DNS - Layer 7 - Application)

A global cloud DNS web service that translates domain names to IP addresses.
Domain Name System (DNS) - Determines which IP address a domain name resolves to.
- User requests www.amazon.com → Route 53 resolves it to an IP address (e.g., 203.0.113.5)
  - Once the IP is known and the request reaches your VPC, route tables determine how to reach that IP address (through an internet gateway, NAT gateway, etc.).
  - That part is not Route 53's job; I'm just connecting the dots here.
With Route 53, we can also configure load balancing at the DNS level, which is extremely important if you have a service running instances in multiple regions.

Possible configurations (DNS load balancing):

Simple: Single record for a domain.
Least Latency: Routes to the region with the lowest user response time.
Failover: Routes to a backup resource when the primary is unhealthy.
- Uses health checks to do that.
- Health checks for secondary are optional but recommended.
Geolocation: Routes based on the user's physical location.
Geoproximity: Routes based on user-to-resource geographic distance.
Weighted Round Robin: Distributes traffic based on assigned percentages.
Multivalue Answer: Returns multiple IP addresses for high availability.
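
Two of these policies are easy to picture in code. The sketch below is a simplified model of the selection logic for Weighted and Failover routing, not Route 53's actual implementation; the function names, IP addresses, and weights are hypothetical.

```python
import random


def pick_weighted(records, point):
    """Pick a record by weight.

    records: list of (value, weight) pairs.
    point: a number in the range [0, total weight).
    """
    cumulative = 0
    for value, weight in records:
        cumulative += weight
        if point < cumulative:
            return value
    raise ValueError("point must be smaller than the total weight")


def pick_failover(primary, secondary, primary_healthy):
    """Answer with the primary while it is healthy, else the backup."""
    return primary if primary_healthy else secondary


# Hypothetical records: about 70% of answers go to the first IP, 30% to the second.
records = [("203.0.113.10", 70), ("203.0.113.20", 30)]
total_weight = sum(weight for _, weight in records)

# Each DNS query lands on a random point of the weight range.
print(pick_weighted(records, random.random() * total_weight))

# The health check result decides which record is returned.
print(pick_failover("203.0.113.10", "203.0.113.20", primary_healthy=False))
```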

What to consider before choosing one:

Do I need legal compliance within the same region? Geolocation
Do I always want the lowest latency on DNS? Least Latency
Do I need performance? Not necessarily Least Latency.
- It depends: the lowest-latency answer can change often, and frequent IP changes can cancel out the gains from ISP DNS caching, CDN or edge caches tied to the same region, and more.
- Consider it, but Simple or Geoproximity is generally fine for most cases.
Do I need availability and reliability over performance? Failover
- Failover can decrease performance but increases availability and reliability.

Load Balancing in the Cloud

Distributing incoming traffic across multiple targets to ensure high availability and reliability.
- For more about load balancing specifically Click here.
While Route 53 performs DNS-level load balancing, routing users to the nearest region for example, it can also be combined with an Application Load Balancer to distribute traffic across EC2 instances within that region.

In AWS there are:

Application Load Balancer (ALB) - recommended for HTTP/HTTPS traffic, web applications, APIs, and microservices with advanced routing needs
Network Load Balancer (NLB) - recommended for TCP/UDP traffic, extreme performance requirements, and applications needing static IPs

Application Load Balancer (AWS):

Requires at least 2 AZs
AWS creates one node in each AZ for reliability and automatic failover
- If one AZ fails, nodes in other AZs continue handling traffic

To configure an ALB:

1. Create ALB - Select region and VPC where ALB will reside

2. Choose 2+ subnets (each in a different AZ) - Select public subnets for internet-facing or private for internal load balancers

3. Attach Security Group - Define firewall rules (e.g., allow inbound port 80/443 from internet)

4. Create Target Group:

Choose target type (EC2 instances, IPs, or Lambda)
Set protocol (HTTP/HTTPS) and port
Configure health checks (path like /health, interval, timeout, thresholds)
Register your targets (add EC2 instances or IP addresses)
Define routing algorithm (round robin, least outstanding requests...)

5. Configure Listener - Listeners define how ALB processes the traffic → “when traffic arrives at port 443, forward it to target group X”
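
The steps above can be sketched with the AWS SDK for Python. Treat this as a rough outline under assumptions rather than a production setup: the function name, resource names, and health check numbers are hypothetical, the `elbv2` argument is expected to be a client created with `boto3.client("elbv2", region_name=...)` (which is where the region from step 1 is chosen), and a plain HTTP listener on port 80 is used because an HTTPS listener on 443 would also need a certificate.

```python
def create_alb_stack(elbv2, name, vpc_id, subnet_ids, security_group_id, instance_ids):
    """Create an ALB, a target group with registered instances, and a listener."""
    # Steps 1-3: the load balancer, its subnets (2+ AZs) and its security group.
    load_balancer = elbv2.create_load_balancer(
        Name=name,
        Subnets=subnet_ids,
        SecurityGroups=[security_group_id],
        Scheme="internet-facing",
        Type="application",
    )
    lb_arn = load_balancer["LoadBalancers"][0]["LoadBalancerArn"]

    # Step 4: target group with health checks (illustrative values).
    target_group = elbv2.create_target_group(
        Name=f"{name}-targets",
        Protocol="HTTP",
        Port=80,
        VpcId=vpc_id,
        TargetType="instance",
        HealthCheckPath="/health",
        HealthCheckIntervalSeconds=30,
        HealthCheckTimeoutSeconds=5,
        HealthyThresholdCount=3,
        UnhealthyThresholdCount=3,
    )
    tg_arn = target_group["TargetGroups"][0]["TargetGroupArn"]

    # Step 4 (continued): register the EC2 instances as targets.
    elbv2.register_targets(
        TargetGroupArn=tg_arn,
        Targets=[{"Id": instance_id} for instance_id in instance_ids],
    )

    # Step 5: "when traffic arrives at port 80, forward it to the target group".
    elbv2.create_listener(
        LoadBalancerArn=lb_arn,
        Protocol="HTTP",
        Port=80,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
    )
    return lb_arn, tg_arn
```

Passing the client in as an argument keeps the function easy to try against a stub before pointing it at a real account.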

Best practice:

On your EC2 instances, allow inbound access only from the ALB's security group.
That way, your EC2 instances will reject direct internet traffic.

General Logic for Multiple Instances Services

When running high-traffic applications with horizontal scaling across multiple instances in multi-AZ or multi-region deployments, avoid duplicating logic across instances. Instead, centralize common functionality.
You want to centralize the following logic:
- Authentication and Authorization: Validate users before requests reach instances
- Rate Limiting: Control request rates per client/API key
- Request/Response Transformation: Modify payloads without changing application code
- Caching: Reduce backend load (when applicable - not for all use cases)
- Monitoring and Analytics: Track API usage and performance

Some architectural options:

API Gateway → ALB → Instances
- For APIs requiring advanced features: rate limiting, request/response transformation, API versioning, caching, and granular throttling
ALB with Built-in Auth → Instances
- For simpler applications that only require authentication. (Lower cost and complexity)
ALB → Instances
- For even simpler applications, authentication and other logic can be inside each instance's code.

Cloud Nomenclatures Comparison

Cloud Comparison: Cheatsheet for AWS, Azure, and GCP - Image Reference: Sogeti.

Free vs Charged Resources (AWS)

Free
- VPC creation
- Subnets
- Route tables
- Security groups
- Network ACLs
- Internet gateway
- Elastic Network Interfaces (when within the instance limit)
Charged
- Public IPv4 addresses
- Compute resources (EC2, RDS, etc.)
- Load balancers
- NAT Gateway (hourly + data processed)
- DNS (Route 53 hosted zones, queries, health checks)
- VPN connections
- Direct Connect
- Amazon API Gateway (charged per request + data transfer)
- CloudFront: Data transfer out, HTTP/HTTPS requests (after free tier)
  - The only service in this list not covered in this article. It is a Content Delivery Network (CDN), used to cache responses at the edge, closer to users.
- Data transfer (internet outbound, cross-AZ, cross-region)
  - Cross-AZ: RDS us-east-1a → EC2 us-east-1b
  - Cross Region: RDS us-east-1 → EC2 ap-southeast-1
  - Important: (free: internet→aws) and (charged: aws → internet)
Important:
- Some of the listed paid services include a free tier; check the AWS website for details.

DEV Community