DEV Community

Chris White


Python Networking: TCP and UDP

In the last installment we looked at IP headers. One thing you might have noticed missing is port numbers, which are pretty important for making an internet connection. It turns out that IP tends to encapsulate other protocols (which is why the header has a protocol field), and the ports live in those. In this article we'll be looking at two popular protocols for internet traffic: TCP and UDP. Before we begin, install scapy, which we'll use to make things easier, and dnslib, which is used in the UDP section:

$ pip install scapy dnslib

Please note if you haven't read the previous 2 installments and are new to these concepts I highly suggest reading them. They contain foundational knowledge which will be utilized in this post.

TCP Flow

TCP, or Transmission Control Protocol, enables robustness in network communication. When a client sends data to a server, there's no guarantee that the server is alive or that the packet the server received actually came from the client. To make this easier to visualize I'll be using Wireshark, a graphical interface for packet sniffing. After installing, running, and configuring Wireshark I make a call out to google.com:

$ curl www.google.com

When I do so Wireshark is populated with quite a decent amount of data that I'll be dissecting here. So first off is the first four parts of the communication:

59468 → 80 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM TSval=554354431 TSecr=0 WS=128
80 → 59468 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1412 SACK_PERM TSval=2112863826 TSecr=554354431 WS=256
59468 → 80 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=554354458 TSecr=2112863826
HTTP    144 GET / HTTP/1.1 

So before any HTTP call is made, three separate segments are exchanged. The capture shows them as:

  • SYN
  • SYN, ACK
  • ACK

Think of this like someone knocking at the door:

  • Knock at the door (SYN)
  • Who is it? (SYN, ACK)
  • Package delivery (ACK)

Assuming you were expecting the package you'd open the door and the data transaction (accepting the package) would begin. Now the delivery person doesn't just stand there. They'll go on with other deliveries once they've dropped off all eligible packages. So another communication sequence is happening:

59468 → 80 [FIN, ACK] Seq=79 Ack=20198 Win=72320 Len=0 TSval=554354561 TSecr=2112863930
80 → 59468 [FIN, ACK] Seq=20198 Ack=80 Win=65536 Len=0 TSval=2112863955 TSecr=554354561
59468 → 80 [ACK] Seq=80 Ack=20199 Win=72320 Len=0 TSval=554354587 TSecr=2112863955
  • Signature for package(s) (FIN, ACK)
  • Delivery person confirms signature and thanks you (FIN, ACK)
  • You close the door (ACK)

The SYN/SYN+ACK/ACK chain serves the purpose of making sure that the other end of the connection can be communicated with. SYN+ACK tells the client the server can respond and ACK tells the server the client can respond. If there is no response to one of these segments after a set time, the connection attempt is dropped as a timeout. This system also leads to an attack vector known as a SYN flood. In this attack a malicious client sends SYN packets without ever responding to the SYN+ACK from the server. The server uses system resources to hold each half-open connection, and with enough of them it can become unable to process legitimate traffic.

The FIN communication is of a similar nature: each side uses it to tell the other that it has finished sending data, and the ACKs confirm that everything sent was received. This system is in place mainly because it's fairly common for a response to be sent in multiple chunks, so the receiver needs a clear signal that no more are coming. The acknowledgements also let the sender know that it doesn't need to resend any packets, as one of TCP's robustness features is the ability to retransmit data that might have been lost due to bad connectivity.
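The sequence and acknowledgement numbers in the two captures above follow a simple rule: a SYN or FIN flag consumes one sequence number, just like one byte of payload would. Below is a minimal sketch of that bookkeeping. The helper name `next_ack` is made up for illustration and is not part of any library, and the numbers are the relative values Wireshark displays (real initial sequence numbers are random 32 bit values).

```python
def next_ack(seq: int, payload_len: int = 0, syn: bool = False, fin: bool = False) -> int:
    """Return the acknowledgement number the peer is expected to send back.

    Every payload byte consumes one sequence number, and so do the SYN and
    FIN flags, even though they carry no data. Sequence numbers are 32 bit
    values, so the result wraps around at 2**32.
    """
    return (seq + payload_len + int(syn) + int(fin)) % 2**32


# Handshake: the client's SYN has Seq=0, so the server's SYN, ACK carries Ack=1.
print(next_ack(0, syn=True))        # 1
# Teardown: the client's FIN has Seq=79, so the server answers with Ack=80.
print(next_ack(79, fin=True))       # 80
# The server's FIN has Seq=20198, so the client's final ACK carries Ack=20199.
print(next_ack(20198, fin=True))    # 20199
```

The three printed values line up with the Ack fields in the Wireshark lines shown earlier.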

TCP Header Layout

Much like IP, the TCP protocol is dictated by a standard, RFC 9293. Wikipedia has a nice visual layout of how a TCP header looks:

Visual view of a TCP header

The layout is not far off from the IP header in terms of size. Data offset works much like the IHL of an IP header: it counts 32 bit words, with a minimum of five (20 bytes) plus any optional 32 bit option words. As the IP header already handles addressing and routing, the TCP header doesn't carry anything for that and can focus on ports, sequencing, and flags.

TCP Header Parsing

Now it's time to actually work through the TCP header with Python, and in particular struct.unpack. Looking over the header we have:

  • H: 16 bit source port
  • H: 16 bit destination port
  • L: 32 bit sequence number
  • L: 32 bit acknowledgement number
  • B: 4 bit data offset + 4 reserved bits
  • B: 8 bit collection of flags
  • H: 16 bit window size
  • H: 16 bit checksum
  • H: 16 bit urgent pointer

This is of course assuming the optional 32 bit options are absent. The packet I've brought in this time has an IHL of 5, so the IP header is the minimum 20 bytes and the TCP header starts at byte 20. I'll provide the packet data I use so you can follow along more easily:

>>> packet = b'E\x02\x01\xd8\x9b\xd9@\x00\x80\x06\x00\x00\xac\x12\x80\x01\xac\x12\x80\x01\xff\xfd"\xb8\x91S\xaf3\xc5\x10\xd8_P\x18\x04\xff\xac\xc9\x00\x00GET / HTTP/1.1\r\nHost: 172.18.128.1:8888\r\nConnection: keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: en-US,en;q=0.9\r\n\r\n'
>>> import struct
>>> ip_header = struct.unpack('!BBHHHBBH4s4s', packet[0:20])

Now time for header parsing:

>>> tcp_header = struct.unpack('!HHLLBBHHH', packet[20:40])
>>> tcp_header
(65533, 8888, 2438180659, 3306215519, 80, 24, 1279, 44233, 0)

The first 4 values can be used as is. The source port, 65533, is an ephemeral client port, and 8888 is the destination port. Given this flow it can be reasoned that this is a packet received by a server. The sequence number tells the server "here is where this packet fits in the total chain of data I'm going to send", and the acknowledgement number both lets the server know you received a previous set of data and that you're ready for the next set. Data offset works much like the IHL in the IP header: it's the number of 32 bit words that make up the header. As it sits in the upper 4 bits of its byte, a right shift gets us the value easily:

>>> tcp_header[4] >> 4
5

The value is 5 here so there are no TCP options (though I will discuss them a bit later). Next is a series of flags which are 1 bit values:

>>> format(tcp_header[5], '08b')
'00011000'
>>> cwr, ece, urg, ack, psh, rst, syn, fin = list(format(tcp_header[5], '08b'))
>>> cwr, ece, urg, ack, psh, rst, syn, fin
   ('0', '0', '0', '1', '1', '0', '0', '0')

I've altered the output a bit to make this slightly easier to read. Given that there's a decent amount going on in these flags, I'll touch on them in more detail in a later section. For right now ACK and PSH are set: ACK means the acknowledgement number is valid, and PSH asks the recipient to hand any buffered data to the application right away. Window size is the amount of data in bytes (technically window size units, but this is bytes in modern day usage) that the sender of this packet is currently willing to receive, which in this case:

>>> tcp_header[6]
1279

Is 1279 bytes. Note that due to how fast networks have gotten, the maximum value of 65535 bytes (just under 64 KiB) may seem a bit on the small side. To work around this, one of the TCP options allows for Window Scaling. It tells the other side to left shift the window size value by a given number of bits, multiplying it by a power of 2. This option can only be sent during the three way handshake and is ignored in all other cases. Next is the checksum:

>>> tcp_header[7]
44233

As it's a bit involved I'll be explaining this in a dedicated section. Just know for now that it's meant to ensure the data is not corrupted. Finally there's the urgent pointer, which is only valid if the URG flag is set (which it's not). This means the value is essentially 0 padding:

>>> format(tcp_header[8], '016b')
'0000000000000000'
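The field-by-field walk above can be collected into one small function. This is only a sketch: `TCPHeader` and `parse_tcp_header` are names invented here, options are not decoded, and the 20 bytes are copied from `packet[20:40]` of the packet used in this section.

```python
import struct
from typing import NamedTuple


class TCPHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int   # in bytes, derived from the 4 bit data offset
    flags: int        # raw 8 bit flag field
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TCPHeader:
    """Parse the fixed 20 byte part of a TCP header (options are not decoded)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    src, dst, seq, ack, offset_byte, flags, window, checksum, urgent = struct.unpack(
        "!HHLLBBHHH", segment[:20]
    )
    header_len = (offset_byte >> 4) * 4  # data offset counts 32 bit words
    return TCPHeader(src, dst, seq, ack, header_len, flags, window, checksum, urgent)


# The same 20 bytes as packet[20:40] in the session above.
raw = b'\xff\xfd"\xb8\x91S\xaf3\xc5\x10\xd8_P\x18\x04\xff\xac\xc9\x00\x00'
header = parse_tcp_header(raw)
print(header.src_port, header.dst_port, header.header_len, header.window)
# 65533 8888 20 1279
```

Returning a named tuple rather than a bare tuple means later code can say `header.window` instead of remembering that the window lives at index 6.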

Now the way TCP options work is rather interesting. I went ahead and parsed a different packet that had TCP options (data offset 8). That's 3 more 32 bit words than the minimum of 5, meaning 12 bytes of options after the TCP header's minimum size of 20 bytes:

>>> tcp_options = packet[40:52] 

Now the way this works is you read in a single byte, which is the Option-Kind:

>>> tcp_options[0]
1

Now looking over the table:

TCP options table

It's a No-Operation (NOP), which has no Option-Length or Option-Data. So we move along to the next byte:

>>> tcp_options[1]
1

Same thing here, so on to the next byte:

>>> tcp_options[2]
8

This indicates a Timestamp option according to the table, where the next byte is Option-Length. There are also two 4 byte values containing timestamps:

>>> option_length, timestamp1, timestamp2 = struct.unpack('!BLL', tcp_options[3:])
>>> option_length, timestamp1, timestamp2
(10, 3008363081, 3008363081)

Now the two 1 byte NOPs plus the Option-Kind and Option-Length came out to 32 bits, and the two timestamps each take up 32 bits. Essentially the NOPs acted as padding so everything fit cleanly into three 32 bit segments. If you want to learn more about how the timestamps work (they're not actually Unix timestamps as you may think) check out this article.
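The byte-by-byte reading above generalizes into a small loop: read the kind, and unless it is one of the two single byte kinds, read a length and skip that far. Here is a sketch of that idea. `parse_tcp_options` is a hypothetical helper, and since the packet with options isn't reproduced in this article, the example bytes are rebuilt from the values printed above.

```python
import struct


def parse_tcp_options(options: bytes) -> list[tuple[int, bytes]]:
    """Walk a TCP options area and return (kind, data) pairs.

    Kind 0 (End of Option List) and kind 1 (No-Operation) are single bytes.
    Every other kind is followed by a length byte that counts the kind byte,
    the length byte itself, and the option data.
    """
    parsed = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:          # End of Option List: stop reading
            break
        if kind == 1:          # No-Operation: padding only
            parsed.append((kind, b""))
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("option is missing its length byte")
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            raise ValueError("malformed option length")
        parsed.append((kind, options[i + 2:i + length]))
        i += length
    return parsed


# Rebuild the 12 option bytes described above: NOP, NOP, Timestamp (kind 8, length 10).
example = struct.pack("!BBBBLL", 1, 1, 8, 10, 3008363081, 3008363081)
for kind, data in parse_tcp_options(example):
    if kind == 8:  # Timestamp: two 32 bit values
        print(kind, struct.unpack("!LL", data))
    else:
        print(kind, data)
```

The length checks matter in practice: option bytes come straight off the wire, so a bogus length of 0 would otherwise loop forever.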

Now that the options are explained we'll go back to the packet with no options and get the data, which starts at 40 bytes:

>>> print(packet[40:].decode('utf-8'))
GET / HTTP/1.1
Host: 172.18.128.1:8888
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9


>>>

So this is a request to a quick python HTTP server I put together for the root page. In order to print it nicely I've decoded the bytes as UTF-8 so they print out cleanly. I'll be going over the HTTP protocol in more detail in a later article.
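Going back to the Window Scaling option mentioned earlier in this section, the arithmetic is a single left shift. In the curl capture at the start of the article the client advertised WS=128 in its SYN, which is a shift count of 7, and Wireshark then displays the already scaled window of 64256 on the client's ACK. The sketch below assumes the raw 16 bit field in that ACK was 502 (inferred from 64256 / 128, not shown in the capture), and `effective_window` is a name made up for illustration.

```python
def effective_window(raw_window: int, shift_count: int) -> int:
    """Scale the 16 bit window field by the shift count from the Window Scale option."""
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("the window field is 16 bits")
    if not 0 <= shift_count <= 14:   # RFC 7323 caps the shift count at 14
        raise ValueError("shift count must be between 0 and 14")
    return raw_window << shift_count


print(effective_window(502, 7))       # 64256, as displayed for the client's ACK
print(effective_window(0xFFFF, 14))   # 1073725440, the largest possible window (just under 1 GiB)
```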

TCP Flags

Now we'll look at the TCP flags in a bit more detail.

Network Congestion

The CWR and ECE flags deal with the network congestion feature of TCP. Traditionally, the normal way to handle network congestion was to drop packets. The congestion feature instead lets networking equipment mark packets as having experienced congestion. Senders then reduce their transmission rate to help reduce their effect on the congestion. The CWR and ECE flags establish whether or not both sides support this feature and whether congestion is occurring. When the feature was originally standardized as RFC 3168 in 2001, some outdated or faulty hardware would see the flags, not know what they were, and drop the packet. Thankfully this has improved since then.

Data Reception

The URG flag indicates that a later field in the header, the urgent pointer, should be analyzed. This feature indicates that some part of the data should be processed with higher priority. ACK indicates that the acknowledgement number is valid; on its own with no data it completes the SYN/ACK handshake (also called the three way handshake) and responds to a FIN+ACK to close a connection. PSH tells the recipient to pass any buffered data to the application right away.

Connection Dropping

RST is a way to force a connection to drop, and it can be injected at any point in the network. While the usage is meant for firewalls to drop unwanted packets or servers to refuse connections, it's also had an interesting history. Comcast was ordered by the FCC (PDF) to end its injection of RST packets into peer-to-peer traffic from its customers in order to disrupt the flow. This can further be used for censorship purposes. Given all this, research has been done to attempt to track such forced injection.

Starting/Stopping Connections

The last flags are for general handshake usage. SYN+ACK sent by the server is essentially a combination of SYN and ACK flags being set. The same goes for FIN and ACK. Note that SYN and SYN+ACK are meant to simply establish connections and will appear without any TCP data attached to them.
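The flags covered in this section each occupy one bit of the flag byte, so bit masks are an alternative to splitting a binary string as done during the header parsing. The following is a sketch using the standard library's `enum.IntFlag`. `TCPFlag` and `flag_names` are names chosen for this example, and the mask values follow the bit order shown earlier (CWR is the highest bit, FIN the lowest).

```python
import enum


class TCPFlag(enum.IntFlag):
    FIN = 0x01
    SYN = 0x02
    RST = 0x04
    PSH = 0x08
    ACK = 0x10
    URG = 0x20
    ECE = 0x40
    CWR = 0x80


def flag_names(flag_byte: int) -> list[str]:
    """Return the names of the flags set in the 8 bit flag field, lowest bit first."""
    return [flag.name for flag in TCPFlag if flag_byte & flag]


print(flag_names(24))     # ['PSH', 'ACK'], the GET packet parsed earlier
print(flag_names(0x12))   # ['SYN', 'ACK'], the server's handshake reply
print(flag_names(0x11))   # ['FIN', 'ACK'], a connection teardown segment
```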

TCP Checksum

Now it's time to see how the checksum works. Don't worry it's really simple:

"The checksum field is the 16-bit ones' complement of the ones' complement sum of all 16-bit words in the header and text. The checksum computation needs to ensure the 16-bit alignment of the data being summed. If a segment contains an odd number of header and text octets, alignment can be achieved by padding the last octet with zeros on its right to form a 16-bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros."

Female calculating complex math

Now I could seclude myself in the mountains to meditate enough for enlightenment to explain this to you, but instead I'll do what we call a pro-grammer move. That's right we're going to borrow code! So we'll go ahead and import that:

from scapy.utils import checksum as witchcraft

Witchcraft

This is so hilariously weird that there's an entire RFC (RFC 1071) dedicated to checksum calculation. Thanks 1980s... Now if things weren't bad enough we actually have to create a "pseudo-header" made from some parts of the IP header, and checksum it together with the TCP header (with the checksum set to 0) and the TCP data:

Visual representation of a pseudo-header

To not mess with the original packet data I'll do a deep copy:

>>> from copy import deepcopy
>>> pseudo_packet = bytearray(deepcopy(packet[20:]))

The bytearray cast is needed because packet is of type bytes, which is immutable, whereas bytearray is mostly the same thing but mutable. The checksum only covers the TCP header and data, so we just grab that part (skipping the 20 byte IP header). Now unpacking and repacking would be tedious so we're going to go with simple array manipulation of the packet bytes. The checksum sits at an offset of 16 bytes into the TCP header and is 2 bytes long, so we can use slicing to retrieve the value:

>>> struct.unpack('!H', pseudo_packet[16:18])[0] == tcp_header[7]
True

So now we can just set it to 0:

>>> pseudo_packet[16] = 0x0
>>> pseudo_packet[17] = 0x0

Now it's time to make the pseudo header:

>>> import socket
>>> pseudo_header = struct.pack('!4s4sHH', ip_header[8], ip_header[9], socket.IPPROTO_TCP, len(pseudo_packet))

It's a combination of the source IP address, destination IP address, a zero byte followed by the numeric value of the protocol (6 for TCP, packed here together as a single 16 bit value), and the length of just the TCP part of the packet in bytes. Now that everything is prepared it's time to get the actual checksum:

>>> checksum = witchcraft(pseudo_header + pseudo_packet)
>>> checksum
44233
>>> tcp_header[7]
44233

So the checksum matches and we can be assured that the data is solid.
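For anyone curious what the borrowed function is doing, the quoted definition fits in a few lines of pure Python. This is a sketch of the algorithm as the RFC text describes it, not scapy's actual implementation, and the names `internet_checksum` and `build_pseudo_header` are invented for this example. The test bytes are the worked example from RFC 1071.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16 bit ones' complement of the ones' complement sum of all 16 bit words."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input on the right
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_pseudo_header(src_ip: bytes, dst_ip: bytes, protocol: int, length: int) -> bytes:
    """IPv4 pseudo-header: source, destination, a zero byte, protocol, segment length."""
    return struct.pack("!4s4sBBH", src_ip, dst_ip, 0, protocol, length)


# Worked example from RFC 1071: the words sum to 0xddf2, so the checksum is 0x220d.
print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))   # 0x220d
```

Under those assumptions, `internet_checksum(build_pseudo_header(ip_header[8], ip_header[9], 6, len(pseudo_packet)) + pseudo_packet)` would be the pure Python counterpart of the `witchcraft` call above. A handy property of this checksum is that running it over data that still contains a correct checksum yields 0, which is how receivers usually verify it.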

UDP Flow

Next up is UDP or User Datagram Protocol. In this case I'll make a DNS query (DNS runs primarily on UDP):

$ dig mozilla.org

Looking at the Wireshark traffic:

556 1587.220073 172.18.139.193  172.18.128.1    DNS 94  Standard query 0x0748 A mozilla.org OPT
557 1587.236137 172.18.128.1    172.18.139.193  DNS 130 Standard query response 0x0748 A mozilla.org A 44.236.72.93 A 44.236.48.31 A 44.235.246.155

That's everything. This would be similar to if the delivery driver simply knocked and left the package on your doorstep. It makes the delivery person's work faster but there's no confirmation (let's pretend a delivery service didn't email/text you) you actually got the package and you may not have been home to accept it. This also means until you get the package someone could simply steal it off your doorstep (packet loss).

UDP Header Layout

Now UDP is meant to be very very simple. In fact the standard that defines it, RFC 768, is probably one of the shortest RFCs I've ever seen. The header format itself is ridiculously short:

Visual display of UDP headers

Unlike TCP there's no flow control, no sequencing, and no three way handshake of any sort. It's a "fire and forget" protocol.

UDP Header Parsing

First I'll provide the bytes for the packet I used in this:

>>> packet = b'E\x00\x00t\n\x1a\x00\x00\x80\x11\xccw\xac\x12\x80\x01\xac\x12\x8b\xc1\x005\xc9\xe4\x00`Q\xf07\xa0\x81 \x00\x01\x00\x03\x00\x00\x00\x00\x07mozilla\x03org\x00\x00\x01\x00\x01\x07mozilla\x03org\x00\x00\x01\x00\x01\x00\x00\x00\x00\x00\x04,\xecH]\xc0\x1d\x00\x01\x00\x01\x00\x00\x00\x00\x00\x04,\xeb\xf6\x9b\xc0\x1d\x00\x01\x00\x01\x00\x00\x00\x00\x00\x04,\xec0\x1f'
>>> ip_header = struct.unpack('!BBHHHBBH4s4s', packet[0:20])

One nice thing about UDP parsing is you don't have to worry about 32 bit options. The header will always be a fixed width of 8 bytes total (64 bit). This means the parsing is as simple as:

>>> udp_header = struct.unpack('!HHHH', packet[20:28])
>>> udp_header
(53, 51684, 96, 20976)

Here 53 is the source port and 51684 is the destination port. Since 53 is the source port, this is a reply coming back from a DNS server to the local port 51684. Next is the length in bytes, which is 96 and covers both the UDP header and the data (yes, no more 32 bit word counts):

>>> len(packet[20:])
96
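Since the header is a fixed size, splitting a datagram into header and payload takes very little code. The sketch below uses the invented names `UDPHeader` and `parse_udp`, and also checks the length field against the data actually received. To keep it short, the example reuses the 8 header bytes from `packet[20:28]` and stands in 88 zero bytes for the DNS payload.

```python
import struct
from typing import NamedTuple


class UDPHeader(NamedTuple):
    src_port: int
    dst_port: int
    length: int      # header plus payload, in bytes
    checksum: int


def parse_udp(datagram: bytes) -> tuple[UDPHeader, bytes]:
    """Split a UDP datagram into its fixed 8 byte header and its payload."""
    if len(datagram) < 8:
        raise ValueError("a UDP header is always 8 bytes")
    header = UDPHeader(*struct.unpack("!HHHH", datagram[:8]))
    if not 8 <= header.length <= len(datagram):
        raise ValueError("length field does not fit the received data")
    return header, datagram[8:header.length]


# The 8 header bytes from packet[20:28] above, followed by 88 placeholder payload bytes.
header, payload = parse_udp(b'\x005\xc9\xe4\x00`Q\xf0' + bytes(88))
print(header, len(payload))
# UDPHeader(src_port=53, dst_port=51684, length=96, checksum=20976) 88
```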

The checksum is actually optional for UDP over IPv4 (a value of 0 means it was not computed), but is calculated much like the TCP version:

>>> udp_header
(53, 51684, 96, 20976)
>>> pseudo_packet = bytearray(deepcopy(packet[20:]))
>>> pseudo_packet[6] = 0x0
>>> pseudo_packet[7] = 0x0
>>> pseudo_header = struct.pack('!4s4sHH', ip_header[8], ip_header[9], socket.IPPROTO_UDP, len(pseudo_packet))
>>> checksum = witchcraft(pseudo_header + pseudo_packet)
>>> checksum
20976
>>>

After all the TCP work I hope it's understandable why someone made protocol parsing this simple. Now looking at the rest of the data, which starts after the IP header (20 bytes) and UDP header (8 bytes):

>>> packet[28:]
b'7\xa0\x81 \x00\x01\x00\x03\x00\x00\x00\x00\x07mozilla\x03org\x00\x00\x01\x00\x01\x07mozilla\x03org\x00\x00\x01\x00\x01\x00\x00\x00\x00\x00\x04,\xecH]\xc0\x1d\x00\x01\x00\x01\x00\x00\x00\x00\x00\x04,\xeb\xf6\x9b\xc0\x1d\x00\x01\x00\x01\x00\x00\x00\x00\x00\x04,\xec0\x1f'

It's a lot of binary data, though mozilla and org show in a few places. To handle this we'll use dnslib to parse the result:

>>> from dnslib import DNSRecord
>>> DNSRecord.parse(packet[28:])
<DNS Header: id=0x37a0 type=RESPONSE opcode=QUERY flags=RD,AD rcode='NOERROR' q=1 a=3 ns=0 ar=0>
<DNS Question: 'mozilla.org.' qtype=A qclass=IN>
<DNS RR: 'mozilla.org.' rtype=A rclass=IN ttl=0 rdata='44.236.72.93'>
<DNS RR: 'mozilla.org.' rtype=A rclass=IN ttl=0 rdata='44.235.246.155'>
<DNS RR: 'mozilla.org.' rtype=A rclass=IN ttl=0 rdata='44.236.48.31'>

Looking at the results we see a response to a query for mozilla.org has returned and provided 3 IP addresses in the result. Given how crucial DNS is to making requests, the compactness of both the UDP header and the binary format keeps each lookup small and cheap: the whole answer here fits in a single 116 byte IP packet.
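To give a feel for how compact that binary format is, the first line of the dnslib output can be reproduced with struct alone. This is only a sketch of the fixed 12 byte DNS header (six 16 bit values), using the first 12 bytes of `packet[28:]`. The variable names are chosen for this example, and the questions and records that follow the header are left to dnslib.

```python
import struct

# First 12 bytes of packet[28:] above: the fixed-size DNS header.
dns_header = b'7\xa0\x81 \x00\x01\x00\x03\x00\x00\x00\x00'
ident, flags, questions, answers, authority, additional = struct.unpack("!HHHHHH", dns_header)

is_response = bool(flags >> 15)            # QR bit: 1 for a response
recursion_desired = bool(flags & 0x0100)   # RD bit
rcode = flags & 0x000F                     # 0 means NOERROR

print(hex(ident), is_response, recursion_desired, rcode, questions, answers)
# 0x37a0 True True 0 1 3
```

The values agree with what dnslib reported: id 0x37a0, a response, one question, and three answers.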

TCP or UDP?

In terms of comparing protocols, both are frequently used on the internet. TCP excels when you want to make sure you obtained all the data. Web pages need the full HTML for rendering. If it was sent over UDP a single lost packet could mean your paragraph tag goes missing. The same goes for file transfers, where a few missing or corrupted bytes could ruin the entire file.

UDP on the other hand excels when low latency matters. DNS has already been mentioned but this is also important for VoIP and online gaming. VoIP can show a few artifacts or maybe some stuttering if packet loss happens. Waiting for the three way handshake and retransmissions however would be too expensive. If you want to learn more on how UDP is utilized in online gaming I highly recommend Glenn Fiedler's series on the topic. Again, you don't want to be waiting around for the three way handshake for your game action to register with the server.

One more interesting feature is that UDP is able to utilize multicast to reach multiple clients at once. TCP is unable to do this as the three way handshake forces a 1:1 connection. Multicast is useful for network device discovery protocols such as Simple Service Discovery Protocol (SSDP). There is also Real-Time Publish-Subscribe (RTPS), a protocol from the Object Management Group (yes, the organization is actually called OMG), which tries to strike a balance between working around UDP's robustness issues and providing multicast functionality. This write up on their forums is a good resource.

Conclusion

This concludes the rather long look into TCP and UDP. If there's one thing you should definitely get out of this, it's that while all this is interesting learning wise, you're better off using something like scapy for real world usage, or Wireshark if you just want to see the packet data. It's veryyyy easy to make mistakes (which this article spanning several days should attest to, especially the checksum calculation). In the next planned installment we'll be looking at how servers work.
