Key Terms:
Protocol, Transmission Control Protocol (TCP), Internet Protocol (IP), Switching, HyperText Transfer Protocol (HTTP), HTTP request-response cycle, Header, Payload.
As an aspiring web developer, I want to understand the structural elements of the web. The web does not work exactly like the rest of the world: it is decentralized and distributed, works at close to the speed of light, and in its infancy felt like witchcraft. I've taken it for granted for many years, but now that I'm learning how it works, it feels like magic all over again. But magical is often synonymous with incomprehensible, so rather than remain mystified, let's break down how this stuff actually works.
I think that once we break things down into their fundamental elements (don't worry, I'm not going to talk about binary), we'll find the architecture of the web to be fairly straightforward. It won't remove the magic from the music, but, I hope, like understanding music theory, it will add a background layer of intellectual excitement to the experience of thinking about the structure of our collective human consciousness that is the web.
First, some definitions:
What is Packet Switching?
Packet switching is one way of transmitting data. In contrast, traditional phone networks use a different approach called circuit switching, which reserves a dedicated line for the whole call. In packet switching, information is transferred in small packets of data, and each packet can take its own route. Packets make their way across the network, from client to server and vice versa. A packet usually tops out at about 1.5 kilobytes (1,500 bytes, the typical limit on Ethernet networks). By dividing information into these nice little kilobyte-sized portions, the web can eat up our data without choking and gasping for air. Packet switching also makes information transfer more resilient: if one packet is lost or one route goes down, only that packet has to be re-sent or rerouted, not the whole message.
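To make the "nice little portions" idea concrete, here is a minimal Python sketch of cutting a message into packet-sized chunks. The function name `split_into_packets` and the 1,500-byte default are my own choices for illustration; real networking stacks do this for you and also attach headers to every chunk.

```python
def split_into_packets(data: bytes, max_payload: int = 1500) -> list[bytes]:
    """Cut data into consecutive chunks of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


# A 4,000-byte message becomes three chunks.
message = b"x" * 4000
packets = split_into_packets(message)
print(len(packets))               # 3
print([len(p) for p in packets])  # [1500, 1500, 1000]
```

Gluing the chunks back together in order gives you the original message, which is all the receiving side has to do in this simplified picture.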
Here is a helpful gif to illustrate how packet switching works:
What is a protocol?
In telecommunication, a communication protocol is a set of rules, syntax, and other fundamental decisions that standardize and dictate information exchange between two or more parties.
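As a small illustration of "rules and syntax", consider the first line of an HTTP request, which has to be a method, a target, and a version separated by single spaces (for example `GET /index.html HTTP/1.1`). The sketch below checks a line against that one rule. The function name `parse_request_line` is hypothetical, and a real HTTP parser enforces many more rules than this.

```python
def parse_request_line(line: str) -> dict[str, str]:
    """Split an HTTP request line into its three agreed-upon parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError("expected: METHOD TARGET VERSION")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError("version must start with 'HTTP/'")
    return {"method": method, "target": target, "version": version}


parsed = parse_request_line("GET /index.html HTTP/1.1")
print(parsed)  # {'method': 'GET', 'target': '/index.html', 'version': 'HTTP/1.1'}
```

Both sides can understand each other only because they agree on this format in advance. A line that breaks the rules, such as `"hello there"`, is rejected with a `ValueError`.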
What are the main protocols we should know to understand information transmission on the web?
TCP stands for Transmission Control Protocol. IP stands for Internet Protocol. These protocols work together and are often referred to collectively as TCP/IP.
TCP/IP was designed to transmit packets and is thus known as a packet-switching technology. Each packet needs a header that specifies its origin and destination, in addition to the actual data it conveys. If a packet is like a letter, the header is the address that anyone can see on the outside of the envelope, and the data is the letter enclosed inside. TCP/IP is then like the postal service that accepts the letter from the sender and does its best to get it to the receiver.
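The envelope analogy can be sketched as a tiny data structure. This is a deliberate simplification: real IP and TCP headers are compact binary formats with many more fields. The `Packet` class, its field names, and the `read_envelope` helper are made up for illustration, and the addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Header fields: the "outside of the envelope".
    source: str
    destination: str
    sequence: int  # assumed field: position of this piece in the whole message
    # Payload: the "letter inside the envelope".
    payload: bytes


def read_envelope(packet: Packet) -> str:
    """Use only the header, the way a postal worker never opens the letter."""
    return f"from {packet.source} to {packet.destination} (part {packet.sequence})"


letter = Packet("192.0.2.1", "198.51.100.7", 0, b"Hello, server!")
print(read_envelope(letter))  # from 192.0.2.1 to 198.51.100.7 (part 0)
```

Notice that `read_envelope` never touches `payload`. Delivery depends on the header alone.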
TCP/IP has 4 layers (ATIN): Application, Transport, Internet, Network
How do these layers work, you ask?
SHORT ANSWER: UHHH NOT SURE. I've outlined how I understand these layers below, but some of this might contain errors, especially the transport layer. The layers' names also vary from source to source (the bottom layer, for example, is often called the link or network access layer), so you will see different labels in different descriptions.
(Image Source: What is TCP/IP? by TechQuickie )
- Application Layer: HTTP functions at the application layer. For email, SMTP is used instead of HTTP. HTTP and SMTP are both application protocols, hence the name of this layer. If it helps, you can also remember this layer’s name as the one at which your web browser and other APPLICATIONS interact.
- Transport Layer: TCP (and UDP) live at the transport layer. TCP is like the factory that is producing the cars (packets) according to exact specifications so that they are road-ready. It can also recombine packets into a giant super-car (think of a combiner transformer). From the transport layer, the packets then go to the internet layer.
- Internet Layer: here packets are subject to the Internet Protocol (IP), which labels the packets with A) the origin and B) destination IP addresses so that the packets know where they are going. Next is the network layer.
- Network Layer: this layer deals with MAC addressing, which gets you to the correct physical machine, as well as conversion of the data into the electrical signal that will pass through the INTERNET plumbing.
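One way to picture the four layers is as wrapping paper: on the way down, each layer puts its own header in front of whatever the layer above handed it, and on the receiving side the headers come off in the opposite order. Here is a toy Python sketch of that idea. The bracketed text "headers", the port number, the addresses, and the function names are all invented for illustration (real headers are binary).

```python
# Toy headers, listed in the order they are added on the way down.
LAYER_HEADERS = [
    ("transport", b"[TCP dst_port=80]"),
    ("internet", b"[IP dst=198.51.100.7]"),
    ("network", b"[MAC dst=02:00:00:00:00:01]"),
]


def send_down(application_data: bytes) -> bytes:
    """Wrap application data with one header per lower layer."""
    unit = application_data
    for _layer, header in LAYER_HEADERS:
        unit = header + unit
    return unit


def receive_up(unit: bytes) -> bytes:
    """Strip the headers in reverse order to recover the application data."""
    for layer, header in reversed(LAYER_HEADERS):
        if not unit.startswith(header):
            raise ValueError(f"missing {layer} header")
        unit = unit[len(header):]
    return unit


request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
on_the_wire = send_down(request)
print(on_the_wire)
print(receive_up(on_the_wire) == request)  # True
```

The outermost header belongs to the lowest layer, which is why the network layer's header is the first thing the receiving machine looks at.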
Are you ever going to explain the car analogy?
YES. To borrow a chestnut from Al Gore and the 1990s, information exchange on the internet can be imagined as a SUPER HIGHWAY. However, now in 2020 this analogy can actually be helpful if we flesh it out with details. Let's make it more sci-fi and weird, just like the web.
A gigantic mega-car (the whole piece of data being transmitted) is broken down into dozens of little cars (packets) based on conventions laid out in TCP/IP. Each little car is given a header and a payload.† Basically, imagine a giant transformer like Optimus Prime breaking down into its constituent little transformers (to avoid traffic), pointing across LA's urban sprawl, and all saying in unison "We will meet at that In-N-Out Burger."
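TCP/IP is also like the white dashed lanes of the highway and the laws governing the road, which dictate how the cars come and go. TCP is congestion averse, which means that it slows down its sending rate when the road gets crowded instead of flooding it with more cars; it prioritizes reliable, in-order delivery over raw speed.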
The cars (type=string) pass through the TCP/IP layers until they are zapped with a death ray/teleporter beam (i.e., converted into an electrical signal).
After some distance, the cars are resurrected from the dead, reconstituted into little car packets, and then once again into the giant mega car on the other side (client/server).
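The break-apart-and-reassemble trip can be sketched in a few lines of Python. I am assuming each little car carries a simple counter so the receiver knows where it belongs. Real TCP sequence numbers count bytes rather than packets, and the names `to_packets` and `reassemble` are mine.

```python
import random


def to_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Break the mega-car into (sequence number, payload) little cars."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the little cars back in order and rebuild the mega-car."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _seq, payload in ordered)


mega_car = b"We will meet at that In-N-Out Burger."
little_cars = to_packets(mega_car, size=8)

# Simulate the cars taking different routes and arriving out of order.
random.seed(2020)
random.shuffle(little_cars)

print(reassemble(little_cars) == mega_car)  # True
```

Because every car knows its place in line, the order of arrival does not matter.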
The End
Footnotes
†I imagine the header in several ways: as writing that covers the entire front of the car, as the engine that allows the car to go to its destination, or as the license plate that gives the car legal recognition and permission to be on the road in the first place.
One can imagine the payload in several ways: the people in the car, the lumber in the back of the truck, or the actual entire back part of the car. In any case, remember that the payload is the information you are actually trying to transmit.
This explanation from Wikipedia's "Payload (computing)" article is also a helpful way to think about it:
"In computing and telecommunications, the payload is the part of transmitted data that is the actual intended message. Headers and metadata are sent only to enable payload delivery.
In the context of a computer virus or worm, the payload is the portion of the malware which performs malicious action.
The term is borrowed from transportation, where payload refers to the part of the load that pays for transportation."
Sources:
HTTP
- HTTP Zine by Julia Evans
- Podcast Episode: "HTTP with Julia Evans" on "JavaScript - Software Engineering Daily"
TCP/IP
- This YouTube video: What is TCP/IP? by TechQuickie
- Wikipedia – TCP