AWS network infrastructure can become complex.
There is also some overlap between different technologies, which makes it harder to understand what does what.
Some of the questions I hear the most are about the intersection between AWS VPC, Elastic Network interfaces, Elastic IP, and the closely related Security Group and ACL. This is my attempt to clarify in practical terms the difference between these technologies.
If you are eager to look at the cheat-sheet, scroll down to the bottom.
In the early days of AWS, when EC2 launched, all the instances from all the customers could communicate with each other. If Alice launched an instance in her account and Bob launched another instance in his account, the two instances were able to communicate with each other out of the box.
To prevent that, Alice (or Bob) was required to set a security group, allowing only traffic from selected IPs or from instances launched with the same security group.
The problem with security groups was that it was very easy to get them wrong, thus allowing other AWS accounts to reach the instances. This mode of operation is called EC2-Classic to distinguish it from the new way in which EC2 instances are provisioned.
In this new mode, EC2 instances are always launched in a VPC (called default VPC) which has a default security group. Instances in different VPCs do not communicate with each other, unless they are peered.
Of course you can also create your own VPC and launch instances in there.
Today EC2-Classic only exists for legacy reasons, which tie into the challenge of taking back a service once it has been launched.
Having covered the history, what's a VPC? It's an isolated portion of the AWS cloud. Instances launched in a VPC cannot be accessed from other VPCs in the same account or other AWS accounts.
Its main property is the CIDR block, which is the range of IPv4 addresses assigned to instances launched in the VPC.
Leaving aside Security group and ACL (which are discussed later), all the instances created in the same VPC can communicate with each other, like in a private LAN.
After creating a VPC, you can add additional subnets by dividing the range of IP addresses assigned to the VPC.
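Dividing a VPC range into subnets can be sketched with Python's standard `ipaddress` module. This is only an illustration of the arithmetic, not an AWS API call; the `10.0.0.0/24` block is an assumed example value.

```python
import ipaddress

# Hypothetical VPC CIDR block, chosen only for illustration.
vpc_cidr = ipaddress.ip_network("10.0.0.0/24")

# Split the VPC range into two equally sized subnets (one /24 becomes two /25).
subnets = list(vpc_cidr.subnets(prefixlen_diff=1))

for subnet in subnets:
    # num_addresses counts every address in the range, including the
    # handful that AWS reserves in each subnet.
    print(subnet, subnet.num_addresses)
```

The two resulting ranges are the same ones used for SubnetA and SubnetB in the isolation example later in the article.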
A security group acts as a virtual firewall that controls the traffic from/to a network interface.
When you launch an instance, you get a default network interface (eth0).
An EC2 instance can have multiple network interfaces and each network interface can have a different security group.
Later I'll discuss how to attach additional network interfaces to an instance.
Security Groups are attached to a network interface, not an instance. When you start an instance, it receives a default network interface (eth0). The security group will be attached to that default network interface.
Don't be fooled: every time you specify a security group for an AWS service, there is a network interface behind it.
Take AWS Elastic File System, for example. When you create a file system, you specify security groups. Does that mean that the security groups control access to the file system? Well, in some sense yes. But technically the security groups are attached to an Elastic Network Interface (ENI) that EFS creates on your behalf.
I am going to discuss ENIs later. The takeaway here is: security groups are associated with network interfaces (Elastic or not) and they act as a virtual firewall that controls the traffic from/to a network interface.
A VPC has a default security group. Does this mean that security groups can be attached to network interfaces as well as to VPCs or subnets? Mmm, not really. The default security group only tells EC2 which security group to attach to the network interface of an instance when you launch it in that VPC without specifying one. That is, unless you override the security group in the launch instance API call.
Subnets feature a property called "auto-assign public IPv4 address", which means that new network interfaces in that subnet automatically receive a public IPv4 address (in addition to the private IP address). As for security groups, this setting defines the default value that EC2 is going to use when launching an instance in that subnet. For example, let's assume we start an instance with the following command:
aws ec2 run-instances \
  --subnet-id subnet-10000000 \
  --image-id resolve:ssm:/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2
The launched instance will have an auto-assigned public IP address if subnet-10000000 has the "auto-assign public IPv4 address" setting set to true. But, as I mentioned, this is just a default and we can override this behavior by using the --associate-public-ip-address or the --no-associate-public-ip-address flags. For example, if we run the following command:
aws ec2 run-instances \
  --subnet-id subnet-10000000 \
  --image-id resolve:ssm:/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2 \
  --associate-public-ip-address
the instance will get a public IP regardless of whether we have set "auto-assign public IPv4 address" on the subnet.
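The relationship between the subnet default and the launch-time flags can be modeled in a few lines of Python. This is a sketch of the decision logic only, not an AWS API call; the function name `gets_public_ip` and its parameters are hypothetical.

```python
from typing import Optional


def gets_public_ip(subnet_auto_assign: bool,
                   launch_override: Optional[bool] = None) -> bool:
    """Decide whether a new instance receives a public IPv4 address.

    subnet_auto_assign models the subnet's "auto-assign public IPv4 address"
    setting. launch_override models the launch flags: True stands for
    --associate-public-ip-address, False for --no-associate-public-ip-address,
    and None means neither flag was passed.
    """
    if launch_override is not None:
        # An explicit flag at launch time always wins over the subnet default.
        return launch_override
    return subnet_auto_assign


print(gets_public_ip(subnet_auto_assign=False))
print(gets_public_ip(subnet_auto_assign=False, launch_override=True))
```

The second call mirrors the command with `--associate-public-ip-address`: the subnet setting is ignored because the caller was explicit.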
After creating your VPC, you divide it into subnets. In an AWS VPC, subnets are not isolation boundaries. Rather, they are containers for routing policies. Isolation between subnets is achieved by attaching a Security Group to the EC2 instances.
For example, let's assume we have two subnets in the VPC: SubnetA is 10.0.0.0/25 and SubnetB is 10.0.0.128/25. To isolate them, create a security group SGSubnetA that allows all traffic from the CIDR 10.0.0.0/25 and another, SGSubnetB, that allows traffic from the CIDR 10.0.0.128/25.
Attach SGSubnetA to the instances launched in SubnetA and SGSubnetB to the ones in SubnetB.
Result: instances in the same subnet can talk to each other, but instances from SubnetA cannot communicate with instances from SubnetB and vice versa.
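The effect of these two security groups can be checked with a small Python model. It only imitates the "allow traffic from this CIDR" rule using the standard `ipaddress` module; the `SECURITY_GROUPS` table and the `is_allowed` helper are hypothetical names, and real security groups also filter on protocol and port.

```python
import ipaddress

# Each hypothetical security group allows inbound traffic from a single CIDR.
SECURITY_GROUPS = {
    "SGSubnetA": [ipaddress.ip_network("10.0.0.0/25")],
    "SGSubnetB": [ipaddress.ip_network("10.0.0.128/25")],
}


def is_allowed(source_ip: str, security_group: str) -> bool:
    """Return True if source_ip falls inside a CIDR allowed by the group."""
    source = ipaddress.ip_address(source_ip)
    return any(source in cidr for cidr in SECURITY_GROUPS[security_group])


# A peer in SubnetA reaching an instance protected by SGSubnetA.
print(is_allowed("10.0.0.10", "SGSubnetA"))
# An instance in SubnetB trying to reach the same instance.
print(is_allowed("10.0.0.200", "SGSubnetA"))
```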
A network access control list (ACL) is a network firewall. With security groups you can control what goes in and out of your instances, and with a network ACL you can control what goes in and out of the subnets in your VPC.
An Elastic IP address (EIP) is a static IPv4 address provided by AWS.
You associate an EIP with a network interface. The documentation is a bit confusing on this point because it says you can attach an EIP to a running instance or to an ENI. What it does not say is that by attaching the EIP to the instance, you are actually attaching it to the default network interface of the instance (eth0).
You can give an instance additional network interfaces by attaching additional ENIs to it.
Why do you even need an EIP? Are they different from public IPs?
Public IP addresses are dynamic: the address is lost once the EC2 instance using it is stopped or terminated.
Elastic IPs are static: they survive instance stop/start cycles.
An Elastic Network Interface (ENI) is a virtual network card that can be attached to an instance.
Each ENI lives within a particular subnet of the VPC and has the following attributes (not exhaustive list):
- Private IP Address
- Public IP Address
- Elastic IP Address
- MAC address
- Security Group(s)
A very important consequence of this new model is that the idea of launching an EC2 instance in a particular VPC subnet is effectively obsolete. A single EC2 instance can be attached to two ENIs, each one on a distinct subnet. The ENI (not the instance) is now associated with a subnet.
This enables two powerful use cases:
Assume we have created two EC2 instances (Instance-A and Instance-B), an ENI and a public EIP. The EIP is attached to the ENI and the ENI is attached to Instance-A. Both EC2 instances run a web server.
Let's visualize that:
In the image, the client sends an HTTP request to the EIP. EC2-A responds with an HTML page. Now if we move the ENI from EC2-A to EC2-B (detach from EC2-A and attach to EC2-B), the next request coming from the client will be served by EC2-B.
This is a fundamental building block for a Primary-Failover strategy.
Note: I do not want to muddy the water, so I won’t go into other Primary-Failover solutions. Just so you know, ENI switching is only one of the possible ways to build a Primary-Failover mechanism in a system; load balancers and consensus groups are other options.
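The ENI switch can be pictured with a toy Python model. Nothing here calls AWS: the `NetworkInterface` class, the `handled_by` helper and the `203.0.113.20` address (taken from a documentation-only range) are all hypothetical, and the sketch only shows that the EIP follows the ENI from one instance to the other.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkInterface:
    """Toy model of an ENI that carries an Elastic IP with it."""
    elastic_ip: str
    attached_to: Optional[str] = None  # name of the instance, if any

    def attach(self, instance_name: str) -> None:
        if self.attached_to is not None:
            raise RuntimeError("detach the ENI before attaching it elsewhere")
        self.attached_to = instance_name

    def detach(self) -> None:
        self.attached_to = None


def handled_by(eni: NetworkInterface, destination_ip: str) -> Optional[str]:
    """Return the instance that receives a request sent to destination_ip."""
    if destination_ip == eni.elastic_ip:
        return eni.attached_to
    return None


eni = NetworkInterface(elastic_ip="203.0.113.20")
eni.attach("Instance-A")
before = handled_by(eni, "203.0.113.20")

# Failover: move the ENI (and therefore the EIP) to the standby instance.
eni.detach()
eni.attach("Instance-B")
after = handled_by(eni, "203.0.113.20")

print(before, after)
```

The client keeps using the same address throughout; only the attachment changes.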
You can also use multiple ENIs to build a dual-homed environment for your web, application, and database servers. The instance’s first ENI would be attached to a public subnet, routing 0.0.0.0/0 (all traffic) to the VPC’s Internet Gateway. The instance’s second ENI would be attached to a private subnet, with 0.0.0.0/0 routed to the VPN Gateway connected to your corporate network. You would use the private network for SSH access, management, logging, and so forth. You can apply different security groups to each ENI so that traffic on port 80 is allowed through the first ENI, and traffic from the private subnet on port 22 is allowed through the second ENI.
Note: When creating a subnet, you can specify some default options that are going to be applied to all the ENIs that are associated with that subnet. Most notably, you can choose to give a public IP to all the ENIs associated with that subnet.
An Internet Gateway is a VPC component that allows communication between instances in your VPC and the Internet.
For your instances to connect to the Internet or be reached from the Internet, the VPC needs to have an Internet Gateway. Bear in mind that it is a necessary component, but it's not sufficient, which means you also need something else to enable Internet connectivity. This is discussed in the sections "Connect from EC2 to the Internet" and "Connect from Internet to EC2".
A route table defines how traffic flows from one subnet to another.
A route table is created automatically when you create a VPC and contains only one rule, which allows all the instances in the VPC to communicate with each other (the rule sends traffic destined for the VPC CIDR to the "local" target).
The rule cannot be modified or deleted, which means that instances in the same VPC can always communicate with each other and you can't restrict that using a Route Table. You need to use security groups for that: see the "Subnets Isolation" section.
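A route table lookup can be sketched in Python as "pick the most specific matching route". The table below is an assumed example: a local route for a hypothetical `10.0.0.0/16` VPC plus a default route to a made-up target `igw-example`; the `resolve_target` helper is likewise hypothetical.

```python
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical route table: the built-in local route plus a default route.
ROUTES: List[Tuple[str, str]] = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
]


def resolve_target(destination_ip: str,
                   routes: List[Tuple[str, str]] = ROUTES) -> Optional[str]:
    """Return the target of the most specific route matching destination_ip."""
    destination = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if destination in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix (the most specific route) wins.
    return max(matches)[1]


print(resolve_target("10.0.1.5"))  # another instance inside the VPC
print(resolve_target("8.8.8.8"))   # an address on the Internet
```

Traffic between instances in the VPC always matches the local route first, which is why it cannot be restricted through the route table.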
Private and public are emergent properties of a subnet, which means a subnet becomes public or private depending on how it is set up. They identify a concept, not actual entities in the AWS console.
If a subnet's traffic is routed to an Internet gateway, the subnet is known as a public subnet.
If a subnet doesn't have a route to the Internet gateway, the subnet is known as a private subnet.
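Because "public" and "private" are derived from the routes, the classification can be written as a small function. This is a sketch over plain tuples, not an AWS API call; `subnet_kind` is a hypothetical helper, the IDs are made up, and it relies on Internet Gateway IDs starting with `igw-`.

```python
from typing import List, Tuple


def subnet_kind(routes: List[Tuple[str, str]]) -> str:
    """Classify a subnet from its routes, given as (destination, target) pairs."""
    for _destination, target in routes:
        if target.startswith("igw-"):
            # At least one route points at an Internet Gateway.
            return "public"
    return "private"


# Hypothetical route tables for two subnets of the same VPC.
public_routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")]
private_routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "nat-example")]

print(subnet_kind(public_routes))
print(subnet_kind(private_routes))
```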
To connect to the Internet, you have two options:
- Public Subnet. Create an Internet Gateway in the VPC AND assign a public address (or public EIP) to the network interface of the EC2 machine. This same configuration also allows traffic from the Internet to EC2. The Route Table assigned to the subnet needs to forward Internet traffic (0.0.0.0/0) to the Internet Gateway.
- Private Subnet. Create two subnets (SubPublic and SubPrivate) and an Internet Gateway. Create a NAT Gateway in SubPublic. Create two Route Tables, one for SubPublic and one for SubPrivate. In the Route Table for SubPublic, define a rule that forwards all the Internet traffic (0.0.0.0/0) to the Internet Gateway. In the Route Table for SubPrivate, define a rule that forwards Internet traffic (0.0.0.0/0) to the NAT Gateway.
Before I tell you which one I would recommend, let me digress a bit.
If you have followed along you might be asking yourself some common questions.
Regarding the first option, a common question is: why isn't the Internet Gateway enough to connect to the Internet? Why does the EC2 instance need a public address?
Well, the answer is that the Internet Gateway does not perform a full NAT (like the router we have at home); it only performs 1-to-1 NAT to public IPs that are mapped to instances. This means calls cannot exit the Internet Gateway with a private address, because the gateway would not know how to map the answer to the right instance on the way back.
Practically speaking, it does not do NAT.
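The 1-to-1 mapping can be illustrated with a dictionary lookup. This is a deliberately simplified model, not how the gateway is implemented: `ONE_TO_ONE_NAT` and `source_seen_on_internet` are hypothetical names, and the public address comes from a documentation-only range.

```python
from typing import Dict, Optional

# Hypothetical mappings known to the Internet Gateway: private IP -> public IP.
# Only instances that were given a public IP or EIP appear here.
ONE_TO_ONE_NAT: Dict[str, str] = {"10.0.0.10": "203.0.113.10"}


def source_seen_on_internet(private_ip: str) -> Optional[str]:
    """Return the public source address, or None if the packet cannot leave."""
    return ONE_TO_ONE_NAT.get(private_ip)


print(source_seen_on_internet("10.0.0.10"))  # instance with a public IP
print(source_seen_on_internet("10.0.0.11"))  # instance without one
```

An instance missing from the mapping has no public identity, so there is nothing the gateway could translate the reply back to.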
Regarding the second option, one question is why we need two Subnets, two Route Tables, etc.
One could imagine having a single subnet, creating a NAT Gateway in it, creating an Internet Gateway for the VPC, and that would be all. The answer is that you can't do that, and I am not really sure why. You need to have two subnets.
Looks like a design decision AWS made.
Which approach should we choose?
With the first approach (public subnet) the downside is that not only have we created a channel from EC2 instances to the Internet, but also the other way around. This means anyone on the Internet could try to access your EC2 instances.
Technically there is nothing stopping you from using this approach; after all, you can use a security group that does not allow inbound connections. However, by using a private subnet, even if you accidentally opened up the security group, there is no way to access that subnet directly from outside AWS, so it's an extra layer of protection.
This last point makes Private Subnet my default approach.
To reach a machine from the Internet, you need an Internet Gateway and a Public IP (or a public EIP attached to the instance). This is the Public Subnet approach we have seen in the previous section.
A VPC endpoint allows instances in a VPC to communicate with supported AWS services (S3, DynamoDB, etc.) without an Internet Gateway or NAT Gateway. Use case: the EC2 instances need access to other AWS services, but you do not want to allow any other outbound traffic to the Internet.
You could obtain the same result by attaching a security group that only allows outbound connections to the AWS services, but that comes with the risk of accidentally changing the security group in a way that allows access to the Internet. VPC endpoints solve this problem.
Forget SSH, use AWS SSM to connect to the instance. Or if you want to use SSH, use SSH over SSM.
Managing loads of SSH keys or bastion hosts is something that will soon be forgotten.
If the EC2 instances do not need to be customer facing (private subnet), consider using a VPC endpoint for SSM, which allows you to connect to instances while they remain completely inaccessible to anyone on the Internet.
The SSM Agent comes preinstalled on Amazon Linux 2 instances, so you can start using it out of the box to connect to your EC2 instances. On other images, it's easy to install the agent through the user data as part of the instance bootstrap.
I have covered a big chunk of AWS Networking, but not all of it. Some things you might want to read about next are: VPN, Transit Gateways, VPC peering, PrivateLink, Route53 Private Hosted Zones, etc.
|VPC||An isolated portion of the AWS cloud network|
|Security Group||Security groups are associated with network interfaces (Elastic or not) and they act as a virtual firewall that controls the traffic from/to a network interface|
|Subnets Isolation||Subnets are not isolation boundaries. Rather, they are containers for routing policies. To isolate one subnet from another, attach a security group to the instances launched in each subnet|
|ACL||An ACL is a network firewall. With security groups you can control what goes in and out of your instances, and with a network ACL you can control what goes in and out of the subnets in your VPC|
|EIP||A static public IPv4 address|
|ENI||A network interface that can be attached to an EC2 instance|
|Internet Gateway||A VPC component which allows instances in the VPC to connect to the Internet. It's necessary but not sufficient|
|Public Subnet||A subnet whose traffic is routed to an Internet Gateway|
|Private Subnet||A subnet that doesn't have a route to the Internet Gateway|
|Connect from EC2 to the Internet||To connect to the Internet, the EC2 instance needs to be either (1) in a VPC with an Internet Gateway and have a public address (or public EIP) attached to its network interface, or (2) in a VPC with an Internet Gateway and a NAT Gateway, where the Route Table has a rule that forwards Internet-bound traffic from the EC2 instances to the NAT Gateway|
|Connect from Internet to EC2||To connect from the Internet to an EC2 instance, you need to have a VPC with an Internet Gateway and have a Public address (or public EIP) attached to its network interface|
|VPC Endpoints||A VPC endpoint allows instances in a VPC to communicate with supported AWS services (S3, DynamoDB, etc.) without an Internet Gateway or NAT Gateway|
|Get a Shell in EC2||SSM or SSM + VPC Endpoint|