Network Scanning with Scapy in Python

Zhang Zeyu (zeyu2001) ・6 min read

What is Network Scanning?

Network scanning is the process of gathering information about the devices on a computer network by using network protocols. It is especially relevant for network administrators, penetration testers and, well… hackers.

Prerequisites

You should know basic Python. Other than that, not much! I will cover some basic network theory before getting into the actual code, so if you already know this stuff, feel free to skip ahead!

All code for this tutorial can be found at my GitHub repository.

zeyu2001 / network-scanner: Simple network scanner built with Scapy for Python

Protocols, Protocols, Protocols

Communications over networks use what we call a protocol stack — building higher-level, more sophisticated conversations on top of simpler, more rudimentary conversations. Network layers can be represented by the OSI model and the TCP/IP model. Each network layer corresponds to a group of layer-specific network protocols.

For the purposes of this tutorial, we will only concern ourselves with two protocols: ARP and TCP.

Address Resolution Protocol (ARP)

ARP maps IP addresses to MAC addresses. IP addresses are logical addresses, while MAC addresses are physical addresses. When computers communicate with each other over the network, they will specify a target IP address. However, switches (which act as packet forwarders) don’t understand IP addresses — they can only make forwarding decisions based on MAC addresses. Hence, computers need to determine the MAC address of their intended recipient before sending out a packet.

If a computer wishes to send a packet to 192.168.52.2, it will first send an ARP request, asking all devices in the network “who is IP address 192.168.52.2?” The computer with IP address 192.168.52.2 will respond with “Hi, I am 192.168.52.2. My MAC address is 03-CA-4B-2C-13-8A.”

As you might have noticed, ARP has no authentication: anyone on the local network can broadcast an ARP request at any time and learn about other devices from their ARP replies. We will use Scapy to scan the network using ARP requests and create a list of IP address to MAC address mappings.
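To make the "who is 192.168.52.2?" question more concrete, here is a rough sketch of the 28-byte ARP request payload, packed by hand with the standard library only. Scapy builds this for us later, so this is purely illustrative. The function name build_arp_request and the sender IP 192.168.52.1 are assumptions made up for the example.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request payload (without the Ethernet header)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,             # hardware type: Ethernet
        0x0800,        # protocol type: IPv4
        6,             # hardware address length (a MAC address is 6 bytes)
        4,             # protocol address length (an IPv4 address is 4 bytes)
        1,             # operation: 1 = request, 2 = reply
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,   # target MAC is what we are asking for, so it is zeroed
        socket.inet_aton(target_ip),
    )


# Hypothetical sender: the MAC and IP here are placeholders for illustration.
payload = build_arp_request(bytes.fromhex("d0817ab0bb0c"), "192.168.52.1", "192.168.52.2")
print(len(payload))         # total payload size in bytes
print(payload[6:8].hex())   # the operation field
```

The request is then wrapped in an Ethernet frame addressed to the broadcast MAC address, which is why every device on the local network sees it.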

Transmission Control Protocol (TCP)

TCP is a transport layer protocol that most services run on. It is a connection-oriented protocol, meaning that two devices will need to set up a TCP connection before exchanging data. This is achieved using a 3-way handshake.

TCP uses port numbers to differentiate between different applications on the same device. For example, if I am running both Firefox and Chrome on my computer, the OS uses port numbers to distinguish between the two applications so that webpages meant for Chrome don’t show up on Firefox.

When Host P wishes to connect to Host Q, it will send a SYN packet to Host Q. If Host Q is listening on the target port and willing to accept a new connection, it will reply with a SYN+ACK packet. To establish the connection, Host P sends a final ACK packet.

Using Scapy, we will send SYN packets to a range of port numbers, listen for SYN+ACK replies, and hence determine which ports are open.
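As a small sketch of what those packet names mean on the wire: each TCP flag is a single bit in the TCP header, and Scapy abbreviates each flag with one letter. The helper names encode_flags and decode_flags below are made up for this illustration and are not part of Scapy.

```python
# Bit values of the classic TCP flags, keyed by the one-letter abbreviations.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08, "A": 0x10, "U": 0x20}


def encode_flags(letters: str) -> int:
    """Combine flag letters such as "SA" into the numeric flags value."""
    value = 0
    for letter in letters:
        value |= FLAG_BITS[letter]
    return value


def decode_flags(value: int) -> str:
    """Turn a numeric flags value back into its letters."""
    return "".join(letter for letter, bit in FLAG_BITS.items() if value & bit)


# The three packets of the handshake between Host P and Host Q.
handshake = [("P -> Q", "S"), ("Q -> P", "SA"), ("P -> Q", "A")]
for direction, flags in handshake:
    print(f"{direction}: {flags:<2} = 0x{encode_flags(flags):02x}")
```

Our scanner only ever sends the first packet and looks at the flags of whatever comes back.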

Scapy

Scapy is a packet manipulation tool written in Python. If you haven’t already, you need to install Scapy with pip.

$ pip install scapy

Now, we can start trying out the basic features of Scapy.

$ python3
...
>>> from scapy.all import *

We can create a packet like so:

>>> Ether()
<Ether  |>

We have created an Ethernet frame. This corresponds to the data link layer (Layer 2) of the OSI model. If we don’t pass in any parameters, default values will be used.

We can specify parameters, such as the destination MAC address:

>>> p = Ether(dst='ff:ff:ff:ff:ff:ff')
>>> p.show()
###[ Ethernet ]###
  dst       = ff:ff:ff:ff:ff:ff
  src       = d0:81:7a:b0:bb:0c
  type      = 0x9000

Here, we specified dst='ff:ff:ff:ff:ff:ff', which is the broadcast address. The packet is addressed to all devices within the local network.

We can stack higher-layer protocols on top of the Ethernet protocol, like so:

>>> p = Ether() / IP()
>>> p.show()
###[ Ethernet ]###
  dst       = ff:ff:ff:ff:ff:ff
  src       = 00:00:00:00:00:00
  type      = IPv4
###[ IP ]###
     version   = 4
     ihl       = None
     tos       = 0x0
     len       = None
     id        = 1
     flags     =
     frag      = 0
     ttl       = 64
     proto     = ip
     chksum    = None
     src       = 127.0.0.1
     dst       = 127.0.0.1
     \options   \

Here, we stacked IP, a network layer (Layer 3) protocol on top of Ethernet, a data link layer (Layer 2) protocol.

ARP Scanner

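from scapy.all import ARP, Ether, srp

def arp_scan(ip):
    # Broadcast an ARP request for the target address(es)
    request = Ether(dst="ff:ff:ff:ff:ff:ff") / ARP(pdst=ip)

    # Send at Layer 2; wait up to 2 seconds and retry once
    ans, unans = srp(request, timeout=2, retry=1)
    result = []

    # Each answered entry is a (sent, received) pair
    for sent, received in ans:
        result.append({'IP': received.psrc, 'MAC': received.hwsrc})

    return result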

Creating the ARP Request

Wow, what did we do here? First, we stacked ARP on top of Ethernet, and set the Ethernet destination address to the broadcast address so that all devices on the local network receive this ARP request.
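The ip argument goes straight into ARP(pdst=ip). Besides a single address, Scapy also accepts a CIDR range such as 192.168.52.0/24 there and generates one request per address. As a rough sketch of what such a range covers, here is the standard library's ipaddress module (not Scapy itself; how Scapy expands the range internally may differ slightly, for example in whether it includes the network and broadcast addresses).

```python
import ipaddress

# The /24 network used as an example range throughout this sketch.
network = ipaddress.ip_network("192.168.52.0/24")

# hosts() skips the network address (.0) and the broadcast address (.255).
hosts = [str(host) for host in network.hosts()]

print(network.num_addresses)   # every address in the range
print(len(hosts))              # usable host addresses
print(hosts[0], hosts[-1])     # first and last host
```

So a call like arp_scan("192.168.52.0/24") would ask about a couple of hundred addresses in one go.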

Sending and Receiving

Next, we called srp() to send the ARP request, and wait for a response. We specified the timeout to be 2 seconds, and if we do not receive an ARP reply within 2 seconds, we retry 1 time before giving up.

Analysing the Results

srp() returns two lists: the answered requests and the unanswered requests. Each entry in the answered list is a pair: (<sent_packet>, <received_packet>).

Information contained within the packets is stored as attributes. We use received.psrc to get the source IP address of the reply and received.hwsrc to get the source MAC address of the reply.
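Once we have the list of dictionaries, printing it nicely is plain Python. Here is one possible sketch; format_hosts is a hypothetical helper that is not part of the scanner above, and the sample addresses are made up so that it runs without touching the network.

```python
def format_hosts(hosts):
    """Render arp_scan()-style results as an aligned two-column table."""
    lines = [f"{'IP':<16}MAC"]
    for host in hosts:
        lines.append(f"{host['IP']:<16}{host['MAC']}")
    return "\n".join(lines)


# Sample data in the same shape that arp_scan() returns (made-up addresses).
sample = [
    {"IP": "192.168.52.1", "MAC": "d0:81:7a:b0:bb:0c"},
    {"IP": "192.168.52.2", "MAC": "02:ca:4b:2c:13:8a"},
]
print(format_hosts(sample))
```

In a real run, we would pass the return value of arp_scan() instead of the sample list.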

TCP Scanner

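import socket

from scapy.all import IP, TCP, sr

def tcp_scan(ip, ports):
    try:
        # One SYN packet is generated per destination port
        syn = IP(dst=ip) / TCP(dport=ports, flags="S")
    except socket.gaierror:
        raise ValueError('Hostname {} could not be resolved.'.format(ip))

    # Send at Layer 3; wait up to 2 seconds and retry once
    ans, unans = sr(syn, timeout=2, retry=1)
    result = []

    for sent, received in ans:
        # Only a SYN+ACK reply means the port is open
        if received[TCP].flags == "SA":
            result.append(received[TCP].sport)

    return result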

Creating the SYN Packet

Here, we create an IP packet and specify the destination IP address, then stack TCP on top of it, specifying the destination ports and setting the SYN flag.

Note that dport can be either a single port or a range of ports.

  • If dport=80, the TCP packet will only be sent to port 80 (HTTP).

  • If dport=[80, 443], the TCP packet will be sent to both port 80 (HTTP) and port 443 (HTTPS)

  • If dport=(0, 1000), the TCP packet will be sent to all ports from port 0 to port 1000.

Hence, the ports parameter of our function can be either an integer, a list or a tuple.
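To spell out how those three forms are interpreted, here is a small sketch in plain Python. expand_ports is a made-up helper that mirrors the behaviour described above; Scapy does this expansion for us, so the scanner itself does not need it.

```python
def expand_ports(ports):
    """Return the list of ports that a dport-style value stands for."""
    if isinstance(ports, int):
        return [ports]                        # a single port
    if isinstance(ports, tuple):
        first, last = ports
        return list(range(first, last + 1))   # a range, both ends included
    return list(ports)                        # an explicit list of ports


print(expand_ports(80))
print(expand_ports([80, 443]))
print(len(expand_ports((0, 1000))))
```

Note that the tuple form includes both ends, so (0, 1000) stands for 1001 ports.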

flags="S" sets the SYN flag in the TCP packet. If the receiving port is open, it should reply with a packet with flags set to "SA" (for SYN+ACK).

socket.gaierror

socket.gaierror is raised when either the IP address provided is invalid, or a hostname provided could not be resolved by the DNS service. We catch this exception and raise a more meaningful exception to the user instead.
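The same catch-and-re-raise pattern can be tried out without Scapy. The sketch below uses the standard library's socket.gethostbyname(), which raises socket.gaierror when resolution fails. resolve_target is a hypothetical helper for illustration and is not part of the scanner.

```python
import socket


def resolve_target(target: str) -> str:
    """Return an IPv4 address for target, or raise ValueError if it cannot be resolved."""
    try:
        return socket.gethostbyname(target)
    except socket.gaierror:
        # Replace the low-level error with a message the user can act on.
        raise ValueError('Hostname {} could not be resolved.'.format(target))


# An address that is already numeric is returned as-is, with no DNS lookup.
print(resolve_target("127.0.0.1"))
```

Calling resolve_target() with a hostname that does not exist should end in the ValueError instead of a raw socket.gaierror.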

Sending and Receiving

Again, we send the packet and wait for a response. We use sr() instead of srp() because we are dealing with a Layer 3 packet. Both functions return results in the same form, but srp() sends and receives at Layer 2, so it expects a packet that starts with a Layer 2 header such as Ether.

Analysing the Results

Not all replies are SYN+ACK packets! If a port is not open, the target device may respond with an RST (reset) packet to inform our OS that it does not want to establish a TCP connection.

We use if received[TCP].flags == "SA" to check the TCP flags of the received packet. Note that we can use <packet>[<protocol>] to access the protocol-specific information of the packet. Again, information is stored as attributes and we use received[TCP].flags to obtain the flags of the received packet. received[TCP].sport is the source port of the received packet.
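Our scanner only reports open ports, but the same check can be extended to label the other outcomes. Here is a rough sketch that works on flag strings such as "SA"; classify_reply is a made-up helper, and treating "no reply" as "filtered" is a common convention rather than something Scapy decides for us. With Scapy, the string could come from str(received[TCP].flags).

```python
def classify_reply(flags):
    """Map the TCP flags of a reply (or None for no reply) to a port state."""
    if flags is None:
        return "filtered"   # no answer: dropped by a firewall, or the host is down
    if "S" in flags and "A" in flags:
        return "open"       # SYN+ACK: something is listening on the port
    if "R" in flags:
        return "closed"     # RST (often sent as RST+ACK): nothing is listening
    return "unknown"


for flags in ("SA", "RA", None):
    print(flags, "->", classify_reply(flags))
```

The unanswered list returned by sr() is where the "no reply" cases end up.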

Full Code

I used argparse to turn our scanner into a command-line application, and added some documentation. You can find the full code at my GitHub repository.
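For readers who have not used argparse before, here is a minimal sketch of how such a command-line interface could be laid out. This is not the code from the repository: the subcommand names arp and tcp and the argument names are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per scan type (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Simple network scanner (sketch)")
    subparsers = parser.add_subparsers(dest="command", required=True)

    arp = subparsers.add_parser("arp", help="run an ARP scan")
    arp.add_argument("ip", help="target IP address")

    tcp = subparsers.add_parser("tcp", help="run a TCP SYN scan")
    tcp.add_argument("ip", help="target IP address or hostname")
    tcp.add_argument("ports", nargs="+", type=int, help="ports to scan")
    return parser


# Parse a fixed argument list here so the sketch runs without a real command line.
args = build_parser().parse_args(["tcp", "192.168.52.2", "80", "443"])
print(args.command, args.ip, args.ports)
```

From there, the parsed values would simply be handed to arp_scan() or tcp_scan().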

Conclusion

That’s all! I hope that you’ve enjoyed reading this as much as I have enjoyed writing it. If you have any questions, please feel free to let me know in the comments.
