It seems that the deeper we go down the OSI model, the less attention developers pay to this topic. If you are developing applications that communicate over the network, it would be nice to understand how things actually work, right? Let us take a look at how today's computers work and the role that sockets, ports, and processes play.
As an example, let us take a normal computer with an operating system (OS). One part of the OS is the "network stack", which takes care of network communication. It manages the port numbers from 0 to 65535, where 0-1023 are the well-known ports and 1024-65535 are treated here as random ports (IANA splits this upper range further into registered ports, 1024-49151, and dynamic ports, 49152-65535). Another part of the OS (I call it the stage) is where processes, i.e. software or applications, are handled. When a process communicates over the network stack, it is called a service. In this way, when we communicate, we define exactly what kind of process it is.
Let’s take a look at how communication between processes works. The diagram below shows two processes, one of which is a server and the other a client. To make it as understandable as possible, both processes (services) run on the same computer.
┌──────────────────────────────────────────────────┬────────────┐
│    ┌────────────┐                                │            │
│    │    ...     │                                │            │
│    ├────────────┤   ┌──────────────────────────┐ │ ┌────────┐ │
│ ┌─▶│ PORT=8080  │◀─▶│ SOCK=10.0.0.10:8080:TCP  │◀┼▶│ SERVER │ │
│ │  ├────────────┤   └──────────────────────────┘ │ └────────┘ │
│ │  │    ...     │                                │            │
│ │  ├────────────┤   ┌──────────────────────────┐ │ ┌────────┐ │
│ └─▶│ PORT=50000 │◀─▶│ SOCK=10.0.0.10:50000:TCP │◀┼▶│ CLIENT │ │
│    ├────────────┤   └──────────────────────────┘ │ └────────┘ │
│    │    ...     │                                │            │
│    └────────────┘                                │            │
├──────────────────────────────────────────────────┼────────────┤
│                  Network Stack                   │   Stage    │
├──────────────────────────────────────────────────┴────────────┤
│                       Operating System                        │
└───────────────────────────────────────────────────────────────┘
For a process to communicate over a network with another local process or a process on another computer, it must first ask the OS to create a socket. A socket represents the communication interface between the process and the OS's network stack. It contains information about the IP, the port, and the communication protocol (e.g. TCP or UDP).
The OS thus creates a socket and the process gets a handle to the network stack. In order for the process to actually communicate with other services, it must send a "bind" request to the operating system, which binds the socket to a specific port number. This means that outgoing messages leave through this port and incoming messages from other services arrive on the same port.
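To make these two steps concrete, here is a minimal Python sketch using the standard `socket` module. The helper name `open_bound_socket`, the loopback address, and the default port `0` (which asks the OS to pick any free port) are assumptions chosen for illustration only.

```python
import socket

def open_bound_socket(host="127.0.0.1", port=0):
    """Ask the OS for a TCP socket, then bind it to (host, port).

    Hypothetical helper: port=0 lets the OS choose a free port, so the
    sketch does not collide with a service that is already running.
    """
    # Step 1: the OS creates the socket and hands back a handle.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # IPv4 + TCP
    # Step 2: the "bind" request attaches the socket to a port.
    sock.bind((host, port))
    return sock

if __name__ == "__main__":
    with open_bound_socket() as sock:
        ip, port = sock.getsockname()
        print(f"handle (file descriptor): {sock.fileno()}")
        print(f"bound to {ip}:{port} over TCP")
```

Note that nothing here calls `listen`: the socket is bound to a port, but it is not yet waiting for connections.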
This concept applies to both the server and the client. This is the part where application developers get confused, because higher level programming, especially over TCP, hides all these things, and people come to feel that bind is synonymous with listen and is exclusively for a server that waits for incoming connections. In fact, the client also needs a socket bound to a port to communicate with the server. The server usually chooses a known port over which it wants to communicate. The client can rely on the operating system, which is intelligent enough to automatically bind the client socket to one of the available random ports when it first connects or sends a message, so that the client does not have to perform this step explicitly.
Now that the server and the client are both attached to the network stack, each can send messages to the other, since the path between them is known. Such a unique communication path is identified by what is called a 5-tuple.
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
│ Local IP │ Remote IP │ Local Port │ Remote Port │ Protocol │
│ 10.0.0.10 │ 10.0.0.10 │ 8080 │ 50000 │ TCP │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
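The 5-tuple can also be read back from a live connection. The sketch below is only an illustration: `FiveTuple`, `five_tuple`, and `demo` are hypothetical names, and the connection runs over the loopback address with an OS-chosen port instead of the addresses in the table.

```python
import socket
from typing import NamedTuple

class FiveTuple(NamedTuple):
    local_ip: str
    remote_ip: str
    local_port: int
    remote_port: int
    protocol: str

def five_tuple(sock):
    """Build the 5-tuple of a connected IPv4 socket (hypothetical helper)."""
    local_ip, local_port = sock.getsockname()[:2]
    remote_ip, remote_port = sock.getpeername()[:2]
    protocol = "TCP" if sock.type == socket.SOCK_STREAM else "UDP"
    return FiveTuple(local_ip, remote_ip, local_port, remote_port, protocol)

def demo():
    """Connect a client to a local server and return both views."""
    listener = socket.create_server(("127.0.0.1", 0))  # bind + listen
    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    server_view, client_view = five_tuple(conn), five_tuple(client)
    for s in (conn, client, listener):
        s.close()
    return server_view, client_view

if __name__ == "__main__":
    server_view, client_view = demo()
    print("server:", server_view)
    print("client:", client_view)
```

The two views describe the same path with local and remote swapped, which is why the table above is written from one side's perspective.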
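With both TCP and UDP, the information about the sender and the receiver is placed at the beginning of the header of each message: the header starts with the source port and the destination port, while the IP addresses travel in the IP header one layer below. TCP and UDP are both OSI layer 4 (transport) protocols. They are similar and at the same time very different.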
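As a rough illustration of where the ports sit, the following Python sketch packs and parses a UDP header by hand. The layout (source port, destination port, length, checksum, 16 bits each) follows the UDP specification; the function names are hypothetical, and the checksum is simply left at 0 to keep the sketch short.

```python
import struct

# UDP header: source port, destination port, length, checksum.
# "!" = network byte order, "H" = unsigned 16-bit integer.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port, dst_port, payload):
    """Prefix the payload with a UDP header (checksum left at 0)."""
    length = UDP_HEADER.size + len(payload)  # header + data, in bytes
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def parse_ports(segment):
    """Read source and destination port from the first 4 bytes.

    TCP headers start with the same two 16-bit fields, so this
    also works on the beginning of a TCP segment.
    """
    return struct.unpack("!HH", segment[:4])

if __name__ == "__main__":
    datagram = build_udp_datagram(50000, 8080, b"hello")
    print(datagram.hex())
    print(parse_ports(datagram))
```

In everyday code the OS builds these headers for us; the sketch only shows what ends up in front of the payload.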
TCP is a robust and proven protocol that drives most online communication today. It is reliable: lost packets are retransmitted, so the data either arrives at its destination or the sender learns that the connection has failed. UDP, on the other hand, is very lightweight and has less overhead than TCP, but it is unreliable, and sent packets can be lost without the sender being informed. TCP is connection-oriented, meaning that the protocol establishes a stateful connection between two endpoints before it can start sending messages. UDP is connectionless, meaning that it simply starts sending messages without checking whether anyone is there to receive them.
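The difference between connection-oriented and connectionless can be seen when nobody is waiting on the other end. The Python sketch below is an assumption-laden illustration: the helper names are hypothetical, everything runs on the loopback address, and the exact error behavior may vary between operating systems.

```python
import socket

def udp_send_without_receiver():
    """Send a UDP datagram to a port that (most likely) nobody uses."""
    # Borrow a free port number from the OS, then release it again.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        probe.bind(("127.0.0.1", 0))
        port = probe.getsockname()[1]
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No handshake: sendto() returns the number of bytes handed to the OS.
        return sock.sendto(b"hello", ("127.0.0.1", port))

def tcp_connect_without_listener():
    """Try to open a TCP connection to a bound but non-listening socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as parked:
        parked.bind(("127.0.0.1", 0))  # bound, but listen() is never called
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(2.0)
            try:
                sock.connect(parked.getsockname())
            except OSError:  # typically ConnectionRefusedError
                return False
            return True

if __name__ == "__main__":
    print("UDP bytes sent:", udp_send_without_receiver())
    print("TCP connected:", tcp_connect_without_listener())
```

The UDP send reports success even though the datagram goes nowhere, while TCP refuses to proceed because no connection could be established.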
TCP provides ordered data transmission: it numbers the data it sends so that the receiving end processes it in the order in which it was sent. With UDP, packets can arrive out of order. TCP has many other features, such as error recovery and flow control, and is therefore popular for HTTP communication. UDP, on the other hand, is mainly used for video streaming or similar cases where it does not matter if a packet is lost.
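The ordering guarantee can be checked with a small Python sketch. It assumes a loopback connection and a hypothetical helper `tcp_roundtrip`; the chunks are tiny, so they fit into the socket buffers without blocking.

```python
import socket

def tcp_roundtrip(chunks):
    """Send chunks over a local TCP connection and return what arrives."""
    listener = socket.create_server(("127.0.0.1", 0))  # bind + listen
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()

    for chunk in chunks:
        sender.sendall(chunk)
    sender.close()  # closing signals the end of the stream

    received = b""
    while True:
        data = receiver.recv(4096)
        if not data:  # empty result: the sender has closed
            break
        received += data

    receiver.close()
    listener.close()
    return received

if __name__ == "__main__":
    print(tcp_roundtrip([b"one", b"two", b"three"]))
```

The bytes come out in exactly the order they went in, but as one continuous stream: TCP preserves order, not message boundaries, so the application has to mark where one message ends and the next begins.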
The main argument for using TCP is that we can trust it to deliver the data to the designated location. However, UDP can also be given similar properties with some additional programming. An example of this is QUIC, a transport protocol built on top of UDP that we mostly associate with HTTP/3, although it is intended for general use.