AWS Networking Fundamentals

#aws #codenewbie

PS: This post was originally published in my weekly AWS newsletter, AWSMAG. If you would like to receive more posts like this every week, join the newsletter.

When I started using AWS as a developer, I was bombarded with jargon: VPC, subnet, CIDR range and many other terms that I could not keep straight in the beginning. If the same thing has happened to you, you are in the right place, my friend. Let's understand the fundamentals of AWS networking in this blog post.

First of all, we will be using terms like Region and Availability Zone in this post. If you are not familiar with them, you can read my other post about AWS Global Architecture. Once you have an idea of what these are, we can move ahead with the rest of the jargon.

So, what are we going to talk about here? Here is the list:

VPC (Virtual Private Cloud)
Subnet
Security Groups
Internet Gateway
NAT Gateway
CIDR Range

The following image gives you an idea of how all this fits together.

What is a VPC (Virtual Private Cloud)?

A VPC, or Virtual Private Cloud, is a logically isolated section of the AWS cloud that you define in order to build your own infrastructure in it. Consider it your own space.
You can create your own network and configure it the way you like. All your resources are deployed in it, and they are isolated from the resources deployed in any other VPC. From a hierarchical point of view, you have a Region, and in it you create a VPC which holds all of your resources. Every Region comes with a default VPC, which is a good starting point if you don't want to get your hands dirty. My advice is to get your hands dirty and create a network which suits your requirements. The default VPC is already configured with the following:

A CIDR range (172.31.0.0/16)
A public subnet in each Availability Zone
An Internet Gateway
A default Security Group

It does not come with private subnets or a NAT Gateway; those you add yourself.

If we want to create our own VPC, we need to understand and create all the things mentioned above, so let's try to understand them first.

What is a CIDR Range?

A CIDR (Classless Inter-Domain Routing) range is a block of IP addresses which you can use in your network. As it is a private network, we use IP addresses from the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16). These addresses are not routed over the public internet, so it is safe to use them internally even when our network is connected to the internet. One more thing to keep in mind: if we connect two VPCs to each other, they must not use overlapping private address ranges, otherwise we will have issues while connecting them. Let's look at a CIDR range.

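Take a range such as 172.31.0.0/16, the one the subnet examples in this post are carved from. The number after the / is the prefix length: it tells you how many leading bits identify the network (here the first 16 bits, 172.31), and the remaining bits identify the host. A /16 leaves 16 bits for hosts, so this range has 2^16 = 65,536 addresses.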
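If you want to check this arithmetic yourself, here is a small sketch using Python's standard `ipaddress` module. The 172.31.0.0/16 range is taken from the examples in this post; the two other VPC ranges are made up purely to show the overlap check.

```python
import ipaddress

# Example VPC range used throughout this post.
vpc = ipaddress.ip_network("172.31.0.0/16")

print(vpc.prefixlen)      # 16 -> leading bits that identify the network
print(vpc.num_addresses)  # 65536 -> 2 ** (32 - 16)
print(vpc.is_private)     # True -> inside the RFC 1918 block 172.16.0.0/12

# Hypothetical ranges of two other VPCs we might want to connect to.
overlapping_vpc = ipaddress.ip_network("172.31.128.0/17")
separate_vpc = ipaddress.ip_network("10.0.0.0/16")

print(vpc.overlaps(overlapping_vpc))  # True  -> connecting these would cause trouble
print(vpc.overlaps(separate_vpc))     # False -> safe to connect
```

Running a check like this before picking a range for a new VPC is cheap, and it saves you from re-addressing a network later.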

Usually applications are deployed across different Availability Zones to maintain high availability and redundancy. This allows us to handle failover situations. To reduce the blast radius and make use of this AWS offering, we give each Availability Zone its own equally sized part of the CIDR range.
For example:

AZ-a will have 172.31.0.0/24 - this means it has 256 addresses.
AZ-b will have 172.31.1.0/24 - this means it has 256 addresses.
AZ-c will have 172.31.2.0/24 - this means it has 256 addresses.
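The split above can be sketched with the standard `ipaddress` module. The AZ names are placeholders, and the constant of 5 reflects the addresses AWS reserves in every subnet (the first four and the last one), so a /24 gives you 251 usable addresses.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one in each subnet

vpc = ipaddress.ip_network("172.31.0.0/16")
az_names = ["AZ-a", "AZ-b", "AZ-c"]  # placeholder Availability Zone names

# Hand the first three /24 blocks of the VPC range to the three AZs.
plan = dict(zip(az_names, vpc.subnets(new_prefix=24)))

for az, subnet in plan.items():
    usable = subnet.num_addresses - AWS_RESERVED_PER_SUBNET
    print(az, subnet, subnet.num_addresses, usable)
# AZ-a 172.31.0.0/24 256 251
# AZ-b 172.31.1.0/24 256 251
# AZ-c 172.31.2.0/24 256 251

# A /16 contains 2 ** (24 - 16) = 256 blocks of size /24, so plenty are left over.
remaining = 2 ** (24 - vpc.prefixlen) - len(plan)
print(remaining)  # 253
```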

After the division you will still have addresses left for the rest of your structure. Now that we have created and divided the addresses, how will they talk to each other?
That is where subnets come in.

What is a Subnet?

The division we mentioned above, i.e. splitting the IP addresses across multiple AZs, is actually done with subnets. A subnet is a logical group of IP addresses that is a subsection of the wider network we talked about earlier. Subnets are of two types:

Private Subnet: A private subnet is a group of addresses that you don't want anyone to reach from outside the VPC. We usually use private subnets for things like databases. We want to keep them out of reach of anything coming in over the internet, but any application deployed inside our network that needs to process data stored in the database should still be able to access it. Now, when I learned about this in my early days, I asked a question: what if I want to apply a patch to my database system, and that patch is released on the internet? If you have the same question in mind, the answer is a NAT Gateway, and we will talk about it in a little while.
Public Subnet: Yes, you guessed it right. A public subnet is a logical group of addresses that we want to be reachable from the internet. Web servers are a good example. If you are hosting a web-based application and want people to access it, you place its servers in a public subnet, where they can receive requests and also call out over the internet with the help of an Internet Gateway. We will talk about it in a moment.

Each subnet is associated with a route table, which determines where its traffic can go. The route table holds routes that allow the subnet to reach other VPC resources or the internet via an Internet Gateway. Each route pairs a destination CIDR range with a target. A destination of 0.0.0.0/0 matches every address, so it acts as the catch-all route for traffic that no more specific route covers. We often use it to send internet-bound traffic to the Internet Gateway.
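To make the route table idea concrete, here is a minimal sketch of how a route is chosen: the most specific matching destination wins, and 0.0.0.0/0 catches whatever is left. This is a toy model, not an AWS API; the `igw-example` target and the `pick_target` helper are names I made up for illustration.

```python
import ipaddress

# Hypothetical route table of a public subnet: (destination CIDR, target).
ROUTES = [
    ("172.31.0.0/16", "local"),    # traffic that stays inside the VPC
    ("0.0.0.0/0", "igw-example"),  # everything else goes to the Internet Gateway
]

def pick_target(destination_ip, routes):
    """Return the target of the most specific route matching the address."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route at all: the traffic has nowhere to go
    # Longest prefix = most specific route.
    return max(matches, key=lambda match: match[0])[1]

print(pick_target("172.31.1.25", ROUTES))    # local
print(pick_target("93.184.216.34", ROUTES))  # igw-example
```

Notice that the VPC address matches both routes, but the /16 is more specific than the /0, so the traffic stays local.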

Subnets are a very important piece of your architecture. They determine what type of network access an instance has and which Availability Zone it lives in. They also have a security feature called Network Access Control Lists (network ACLs), which define which IP addresses and ports are allowed to send traffic in and out of the subnet.
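Here is a simplified sketch of how a network ACL decides on inbound traffic. It assumes the usual behaviour that rules are checked in ascending rule-number order, the first match wins, and anything unmatched is denied. The rule numbers, IP ranges and the `acl_allows` helper are made up for illustration, and real ACLs also have protocols and separate outbound rules, which I leave out here.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class AclRule:
    number: int     # lower numbers are evaluated first
    cidr: str       # source range the rule applies to
    port_from: int
    port_to: int
    allow: bool     # True = ALLOW, False = DENY

# Hypothetical inbound rules: block one range, allow HTTPS for everyone else.
INBOUND = [
    AclRule(100, "0.0.0.0/0", 443, 443, True),
    AclRule(90, "203.0.113.0/24", 443, 443, False),
]

def acl_allows(rules, source_ip, port):
    """Apply the first matching rule in rule-number order; deny if none match."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_range = ip in ipaddress.ip_network(rule.cidr)
        if in_range and rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False  # implicit deny at the end of every ACL

print(acl_allows(INBOUND, "198.51.100.7", 443))  # True  (rule 100 allows)
print(acl_allows(INBOUND, "203.0.113.9", 443))   # False (rule 90 denies first)
print(acl_allows(INBOUND, "198.51.100.7", 22))   # False (no rule matches)
```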

OK, you made it this far. Do you need a cup of coffee before we move ahead? If yes, take a break. If no, let's tackle the three big terms we came across while discussing subnets. Let's start with the Internet Gateway.

What is an Internet Gateway?

An Internet Gateway is a component which provides internet connectivity to your VPC. Why do we need it? Because we want people on the internet to be able to access our resources. As I mentioned above, this is the usual case for a web server: if you host a website, you want people to reach it over the internet.
They can only do so when you allow it on your network. Creating an Internet Gateway does not solve the problem on its own. You also need to connect your public subnet to the gateway by adding a route to its route table with destination 0.0.0.0/0 and the Internet Gateway as the target. This sends all internet-bound traffic from the subnet through the Internet Gateway and lets requests from the internet reach it.

What is a NAT Gateway?

A NAT Gateway is a managed AWS service which lets your instances connect to the internet. Now you will be like, Dude!! Then what the hell is the Internet Gateway used for? Well, they both provide access to the internet, but a NAT Gateway is used by your private subnets, and its traffic technically still goes out via your Internet Gateway. Any instance in a private subnet should be able to reach the internet to fetch patches or download dependencies.
Someone on the internet, however, should not be able to start a conversation with our private instances. That's where the NAT Gateway helps us: it only allows connections that start from the inside. Before NAT Gateways existed, we used to deploy NAT instances to achieve the same functionality.
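One way to see the difference between the two gateways is to look at where a subnet's 0.0.0.0/0 route points. The sketch below is a toy classifier, not an AWS API: the route tables and the `classify_subnet` helper are made up, and it only relies on the fact that Internet Gateway IDs start with `igw-` and NAT Gateway IDs start with `nat-`.

```python
def default_route_target(routes):
    """Return the target of the 0.0.0.0/0 route, or None if there is none."""
    for destination, target in routes:
        if destination == "0.0.0.0/0":
            return target
    return None

def classify_subnet(routes):
    """Label a subnet by where its internet-bound traffic is sent."""
    target = default_route_target(routes)
    if target is None:
        return "isolated"  # no path to the internet at all
    if target.startswith("igw-"):
        return "public"    # reachable from and able to reach the internet
    if target.startswith("nat-"):
        return "private with outbound internet"
    return "other"

# Hypothetical route tables; the gateway IDs are placeholders.
web_routes = [("172.31.0.0/16", "local"), ("0.0.0.0/0", "igw-example")]
db_routes = [("172.31.0.0/16", "local"), ("0.0.0.0/0", "nat-example")]
vault_routes = [("172.31.0.0/16", "local")]

print(classify_subnet(web_routes))    # public
print(classify_subnet(db_routes))     # private with outbound internet
print(classify_subnet(vault_routes))  # isolated
```

The NAT Gateway itself sits in a public subnet, which is why its traffic still leaves through the Internet Gateway in the end.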

We have discussed the whole network and getting our infra right, including deploying our instances in different AZs for better availability and redundancy. The most important piece in all this is security. How will we maintain security across all of it? The answer is Security Groups.

What is a Security Group?

Security Groups are virtual firewalls around our instances. Do not confuse them with the network ACLs we talked about under subnets. ACLs work at the subnet level and control the traffic for all the instances in that subnet. Security groups work at the instance level and control the flow of data to and from individual instances. For example, you can create one security group for your database service and another for your application service.
The database service should be able to access the database in the private subnet on a particular port, but the application service security group will not be given that access.
Any application service in its security group will have access to the database service but not to the database directly, so it has to go via the service to talk to the database. Another example is SSH: instead of allowing SSH access to your instance from anywhere, you allow it from one IP range only, which will be your enterprise range.
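The two examples above can be sketched as a toy model. A security group only has allow rules, and the source of a rule can be either an IP range or another security group. The group names, ports (5432, 8080) and IP ranges below are all made up for illustration, and the `allows` helper is not an AWS API.

```python
import ipaddress
from dataclasses import dataclass, field

@dataclass(frozen=True)
class InboundRule:
    port: int
    source: str  # a CIDR such as "203.0.113.0/24" or the name of another group

@dataclass
class SecurityGroup:
    name: str
    inbound: list = field(default_factory=list)

def allows(group, port, sender_ip, sender_groups):
    """Return True if any inbound rule of the group permits this traffic."""
    for rule in group.inbound:
        if rule.port != port:
            continue
        if "/" in rule.source:  # source is an IP range
            if ipaddress.ip_address(sender_ip) in ipaddress.ip_network(rule.source):
                return True
        elif rule.source in sender_groups:  # source is another security group
            return True
    return False  # security groups have no deny rules; no match means blocked

# Hypothetical setup: only the database service may talk to the database.
database_sg = SecurityGroup("database", [InboundRule(5432, "db-service")])
db_service_sg = SecurityGroup("db-service", [
    InboundRule(8080, "app-service"),    # application service may call the service
    InboundRule(22, "203.0.113.0/24"),   # SSH only from the enterprise range
])

print(allows(database_sg, 5432, "172.31.1.10", {"db-service"}))     # True
print(allows(database_sg, 5432, "172.31.0.20", {"app-service"}))    # False
print(allows(db_service_sg, 8080, "172.31.0.20", {"app-service"}))  # True
print(allows(db_service_sg, 22, "203.0.113.50", set()))             # True
print(allows(db_service_sg, 22, "198.51.100.7", set()))             # False
```

Referencing a group instead of an IP range means the rule keeps working when instances are replaced and their addresses change.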

That's all the fundamental jargon you should know if you are working with AWS and developing applications deployed on it. Understanding it matters because the rules and constraints defined on the network also affect the decisions we make while developing applications. Otherwise we end up in the position where people say, "It works on my local, but blows up when I deploy."
Don't be like that.

Have a nice cup of coffee and digest all this. In the future, I will write about some advanced topics related to AWS networking.