Introduction
I recently started taking a proper class in networking, and while I was deeply fascinated by the subject, I found TCP (Transmission Control Protocol) a bit difficult to understand.
A couple of basic things we will be doing are:
- Opening a network socket that allows us to send TCP packets
- Sending an HTTP GET request to google.com
- Getting and reading the reply
Also note that proper error handling was not taken into account here.
TCP handshake
The first thing we need to do is complete a handshake with Google. Here's how a TCP handshake works:
Assume we have a two-syllable word, index, which breaks down into IN-DEX.
The user sending the HTTP request goes first and sends: IN (the SYN packet)
Google, which is accepting this request, replies with: INDEX (the SYN-ACK packet)
And I, the user, finish with: DEX (the ACK packet)
In simple code, using scapy, this looks like:
from scapy.all import IP, TCP, send, sr1

# My local network IP
src_ip = "192.168.0.11"
# Google's IP
dest_ip = "96.127.250.29"
# A large random port number for myself
src_port = 59333
# IP header: this is coming from me, and going to Google
ip_header = IP(dst=dest_ip, src=src_ip)
# Port 80 for Google. The "S" flag means this is
# a SYN packet
syn = TCP(dport=80, sport=src_port,
          seq=0, ack=0, flags="S")
# Send the SYN packet to Google and wait for one reply
# scapy uses '/' to combine packets with headers
# (sr1 works at the IP layer; srp expects an Ethernet frame)
response = sr1(ip_header / syn)
# Acknowledge Google's sequence number plus one,
# because the SYN itself uses up one sequence number
ack = TCP(dport=80, sport=src_port,
          seq=response[TCP].ack,
          ack=response[TCP].seq + 1, flags="A")
# Reply with the ACK (no answer is expected, so just send it)
send(ip_header / ack)
What are sequence numbers?
The idea of TCP is to make sure we can resend packets when some go missing. Sequence numbers are a way to check whether we have missed any. Say Google sends us 3 packets with sizes of 100, 200 and 300 bytes, and assume the initial sequence number is 0. These packets will then have sequence numbers 0, 100 and 300, and the next packet after them would start at 600.
A TCP sequence number is a 4-byte field in the TCP header that identifies the first byte of data in the outgoing segment. It also keeps track of how much data has been transferred and received. The sequence number field is always set.
For example, say the sequence number of a packet is X and its length is Y. If the packet is transferred to the other side successfully, then the sequence number of the next packet is X+Y.
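As a rough illustration of the X+Y rule, here is a minimal Python sketch that works out the sequence number of each packet from the packet sizes. The function name `sequence_numbers` is made up for this example and is not part of any library.

```python
def sequence_numbers(initial_seq, sizes):
    """Return (sequence number of each packet, next expected sequence number)."""
    seqs = []
    seq = initial_seq
    for size in sizes:
        seqs.append(seq)
        # The field is 4 bytes wide, so sequence numbers wrap around at 2**32
        seq = (seq + size) % 2**32
    return seqs, seq


seqs, next_seq = sequence_numbers(0, [100, 200, 300])
print(seqs)      # [0, 100, 300]
print(next_seq)  # 600
```

Each packet starts where the previous one ended, which is why a gap in the numbers tells the receiver that something went missing.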
When packets go missing, how does Google know it needs to resend them? Well, every time we receive a packet from Google, we need to send back an ACK saying we got the packet with that sequence number. (An ACK packet is any TCP packet that acknowledges receiving a message or series of packets.) If the server notices that a packet hasn't been ACKed, it will resend it.
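The resend decision can be sketched in a few lines. This is a simplified model and not how a real TCP implementation tracks its state: it assumes cumulative ACKs, where an ACK number N means "I have received every byte before N", and it ignores sequence number wraparound. The name `segments_to_resend` is hypothetical.

```python
def segments_to_resend(segments, highest_ack):
    """segments: list of (seq, length) pairs that were sent.
    highest_ack: the largest ACK number received so far.
    Returns the segments that are not yet fully acknowledged."""
    return [(seq, length) for seq, length in segments
            if seq + length > highest_ack]


sent = [(0, 100), (100, 200), (300, 300)]
# Only the first two packets (bytes 0 to 299) have been ACKed
print(segments_to_resend(sent, 300))  # [(300, 300)]
```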
What happens when you have a TCP stack
If you run the above code, you'll notice it doesn't work: instead of our handshake completing, a different packet (a reset) gets sent. What is happening is:
Python program: IN (SYN)
Google: INDEX (SYN-ACK)
Kernel: I didn't ask for this! (sends a reset)
Python program: ..
So how do we get around the kernel? One way is ARP spoofing, which lets us act as though we have a different IP address, 192.168.0.129.
The exchange now looks like this:
me: tells the router to send packets for 192.168.0.129 to my MAC address
router: goes along with it
my Python program: IN (from 192.168.0.129)
google: INDEX
kernel: this isn't my IP address! <ignore>
my Python program: DEX
This works: we can now send packets and get responses without the kernel getting in the way.
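To make the difference between the two exchanges concrete, here is a toy model of the kernel's decision. It is only an illustration under stated assumptions, not real kernel logic: it assumes IP forwarding is off, and the function `kernel_reaction` and its parameters are invented for this example.

```python
def kernel_reaction(dst_ip, kernel_ips, has_matching_socket):
    """Toy model of what the kernel does with an incoming TCP packet."""
    if dst_ip not in kernel_ips:
        # Not addressed to this machine's IP: the kernel ignores it
        return "ignore"
    if not has_matching_socket:
        # Addressed to us, but the kernel never opened this connection
        return "RST"
    return "deliver"


# Reply sent to our real IP: the kernel resets the connection
print(kernel_reaction("192.168.0.11", {"192.168.0.11"}, False))   # RST
# Reply sent to the spoofed IP: the kernel stays out of the way
print(kernel_reaction("192.168.0.129", {"192.168.0.11"}, False))  # ignore
```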
How to get a webpage
To get Google to send us the HTML for google.com, we need to take care of the following:
- Putting together a packet with an HTTP GET request
- Making sure we can listen for many packets in reply, not just a single one
- Fixing bugs with sequence numbers
- Closing the connection properly
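For the first item in the list, the bytes of the GET request can be built with plain Python before they are attached to a TCP packet. This is a minimal sketch: the helper name `build_get_request` is made up, and the `Connection: close` header is an assumption added here so the server closes the connection after replying.

```python
def build_get_request(host, path="/"):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close the connection when done
        "",                   # the two empty entries produce the blank line
        "",                   # that ends the request headers
    ]
    return "\r\n".join(lines).encode("ascii")


payload = build_get_request("google.com")
print(payload)
# Our next sequence number advances by the number of bytes we sent
print(len(payload))  # 55
```

The length matters because of the sequence number rule: after sending this payload, the next packet we send must start `len(payload)` bytes further along.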
The Python set-back
Once everything was working, looking at the packets in Wireshark showed something like this:
User/google: <tcp handshake>
User: GET google.com
google: 100 packets
User: 3 ACKs
google: <starts resending packets>
User: a few more ACKs
google: <reset connection>
In the above scenario, Google sends packets faster than the Python program can handle them and send ACKs. The Google server then assumes network problems are causing the user not to ACK the packets.
It eventually resets the connection because it decides there are connection problems. But we know the connection is fine and the program was responding correctly. The issue was simply that the Python program was too slow to acknowledge the packets.
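A back-of-the-envelope model shows how quickly a slow receiver falls behind. The numbers below (50 ms per ACK, a 200 ms timeout) are invented for illustration and were not measured, and the model assumes all packets arrive at once and are ACKed one at a time. The function `count_late_acks` is hypothetical.

```python
def count_late_acks(num_packets, ms_per_ack, timeout_ms):
    """Count packets whose ACK goes out after the sender's timeout.

    Assumes every packet arrives at time 0 and the receiver ACKs them
    one after another, taking ms_per_ack milliseconds for each."""
    late = 0
    for i in range(num_packets):
        ack_sent_at = (i + 1) * ms_per_ack
        if ack_sent_at > timeout_ms:
            late += 1
    return late


# 100 packets, 50 ms to ACK each one, sender gives up waiting after 200 ms
print(count_late_acks(100, 50, 200))  # 96
```

Under these assumptions only the first 4 ACKs arrive in time, so the sender would treat the other 96 packets as lost even though nothing was wrong with the network.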
Conclusion
One of the setbacks we ran into was how slow the Python program was. It is also important to properly understand the main concepts behind TCP, how it works and how requests are handled, as this will help you understand in depth what TCP entails and how to solve bugs when you encounter them.