Shivakumar

Networking Tools: netcat, tcpdump, dig, nmap

1. Netcat (nc): The Swiss Army Knife

Netcat reads and writes data across network connections using TCP or UDP. It adds no protocol of its own: whatever you type or pipe in goes straight onto the socket.

Core Modes

  • Client Mode (Connect): Acts like Telnet. Used to test if a port is open and accepting traffic.
nc -vz 192.168.1.5 80
# -v: Verbose (tells you what happened)
# -z: Zero-I/O mode (scans for listening daemons, doesn't send data)

  • Server Mode (Listen): Creates a temporary server. Great for testing firewall rules (e.g., "Can Server A reach Server B on port 9090?").
# On Server B (Receiver):
nc -l 9090
# -l: Listen mode (traditional/GNU netcat takes the port via -p: nc -l -p 9090)

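  • File Transfer (The "Hack"): If scp or rsync aren't available, you can pipe files through raw sockets. The transfer is unencrypted and unauthenticated, so keep it to trusted networks. Start the receiver first.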
# Receiver:
nc -l 9090 > received_file.txt
# Sender:
nc [Receiver_IP] 9090 < original_file.txt

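Raw netcat gives you no integrity check, so it can be worth comparing checksums on both ends after the transfer. A minimal sketch, assuming sha256sum (GNU coreutils) is installed on both machines; file_digest is a hypothetical helper name, not part of netcat:

```shell
# Hypothetical helper: print only the SHA-256 digest of a file (no filename),
# so the sender's and receiver's output can be compared directly.
# Assumes sha256sum from GNU coreutils is available.
file_digest() {
    sha256sum "$1" | awk '{print $1}'
}

# Sender:   file_digest original_file.txt
# Receiver: file_digest received_file.txt
# If the two digests differ, the file was truncated or corrupted in transit.
```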

2. Tcpdump: The CLI Microscope

When you can't use Wireshark (because there is no GUI), you use tcpdump. It captures packets directly from the kernel.

Key Flags to Memorize

  • -i eth0: Listen on interface eth0 (or any for all interfaces).
  • -n: Crucial. Don't convert addresses and port numbers to names (shows 1.2.3.4.80 instead of google.com.http). This avoids slow reverse-DNS lookups during the capture. On some builds you need -nn to keep ports numeric as well.
  • -w capture.pcap: Write output to a file (so you can open it in Wireshark later).
  • -v: Verbose (show more header details like TTL, ID).

The Filter Syntax (BPF)

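tcpdump filters use BPF (pcap-filter) syntax. This is the same language as Wireshark's capture filters, but not its display filters (e.g., tcp.port == 80).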

# Capture only traffic from a specific IP on port 80
sudo tcpdump -i eth0 -n src 192.168.1.5 and dst port 80

# Capture everything EXCEPT SSH (so your own SSH session doesn't flood the capture)
sudo tcpdump -i eth0 -n not port 22

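On a busy server an unbounded capture fills the disk quickly, so it helps to combine the flags above with -c (stop after N packets). The sketch below only prints the command so you can review it before running it; capture_cmd and the default of 100 packets are hypothetical choices for illustration:

```shell
# Hypothetical helper: print (not run) a tcpdump command that captures at most
# COUNT packets for one host/port pair into capture.pcap.
# -c is tcpdump's "exit after receiving count packets" flag.
capture_cmd() {
    local iface=$1 host=$2 port=$3 count=${4:-100}
    printf 'sudo tcpdump -i %s -n -c %s -w capture.pcap host %s and port %s\n' \
        "$iface" "$count" "$host" "$port"
}

# Usage: capture_cmd eth0 192.168.1.5 80
# Read the file back later with: tcpdump -n -r capture.pcap
```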

3. Dig (dig): The DNS Scalpel

nslookup is the older tool and hides most of the response. dig (Domain Information Groper) is the standard choice for debugging because it shows the exact query and response structure.

Understanding the Output

Running dig google.com gives you:

  1. HEADER: Status (e.g., NOERROR or NXDOMAIN). If you see NXDOMAIN, the domain doesn't exist.
  2. QUESTION SECTION: What you asked for.
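  3. ANSWER SECTION: The result (A/AAAA records with their TTLs, or CNAMEs).
  4. AUTHORITY SECTION: The nameservers that are authoritative for the zone (NS records).
  5. ADDITIONAL SECTION: Extra records, typically the IPs of those nameservers.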

Power User Commands

  • Trace the Recursion: See the full path from Root(.) to TLD(.com) to Auth Server.
dig +trace google.com

  • Short Mode: Great for scripting. Prints only the record data (usually the IPs, plus any CNAME targets along the way).
dig +short google.com

  • Direct Query: Bypass your local DNS and ask a specific server (e.g., ask Google's 8.8.8.8 directly).
dig @8.8.8.8 google.com

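Combining +short with a direct query makes it easy to spot resolvers that disagree (stale caches, split-horizon DNS). A rough sketch; compare_resolvers is a hypothetical helper, and the resolver IPs are whatever you pass in:

```shell
# Hypothetical helper: ask several resolvers for the same name and print
# one "resolver<TAB>answers" line per resolver.
compare_resolvers() {
    local name=$1 resolver answers
    shift
    for resolver in "$@"; do
        # Sort so that round-robin ordering does not look like a difference.
        answers=$(dig +short "@$resolver" "$name" | sort | paste -sd, -)
        printf '%s\t%s\n' "$resolver" "${answers:-<no answer>}"
    done
}

# Usage: compare_resolvers google.com 8.8.8.8 1.1.1.1
```

If the lines differ, compare TTLs with a full dig against each resolver before blaming the application.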

4. Nmap: The Cartographer

Nmap scans a network to map "live" hosts and open ports. It works by sending packets and analyzing the subtle differences in responses.

Scan Types

  • SYN Scan (-sS): The "Stealth" scan. It sends a SYN packet. If the server replies SYN-ACK, Nmap knows the port is open but sends a RST (Reset) immediately. It never completes the 3-way handshake, so it often doesn't show up in application logs (firewalls and IDS can still see it).
    • Requires root privileges (sudo), because Nmap has to craft raw packets.

  • Version Detection (-sV): Connects to each open port, reads the "Banner" and sends service probes to guess the software version (e.g., "Apache 2.4.41").

  • OS Detection (-O): Fingerprints the TCP/IP stack (initial TTL, TCP window size, TCP options) to guess the Operating System (Linux, Windows, etc.). Also requires root.

# The "Aggressive" Scan (OS detection, Version detection, Script scanning, Traceroute)
nmap -A 192.168.1.5

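For scripting, Nmap's grepable output (-oG -) is easier to post-process than the default report. The sketch below pulls out the open TCP ports; open_tcp_ports is a hypothetical helper, and it assumes the usual "Host: ... Ports: port/state/protocol/..." layout of grepable lines:

```shell
# Hypothetical helper: read nmap grepable output (-oG -) on stdin and print
# "host port" for every open TCP port.
open_tcp_ports() {
    awk '/Ports:/ {
        host = $2
        # Keep only the comma-separated port list that follows "Ports: ".
        sub(/^.*Ports: /, "")
        sub(/\t.*$/, "")
        n = split($0, entries, ", ")
        for (i = 1; i <= n; i++) {
            # Each entry looks like port/state/protocol/owner/service/rpc/version/
            split(entries[i], f, "/")
            if (f[2] == "open" && f[3] == "tcp") print host, f[1]
        }
    }'
}

# Usage: nmap -sV -oG - 192.168.1.5 | open_tcp_ports
```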

5. Debugging: Latency vs. Bandwidth

In DevOps, "The network is slow" is a vague complaint. You must distinguish between two completely different bottlenecks.

A. Latency (The "Distance")

  • Definition: The time it takes for a single packet to travel from Source to Destination.
  • Analogy: The length of the road. Even if the road is empty, it takes time to drive from New York to Los Angeles.
  • The Cause: Physical distance (fiber optic length), number of router hops, congested queues.
  • Tools:
    • ping: Measures RTT (Round Trip Time).
    • mtr (My Traceroute): Combines ping and traceroute. Shows packet loss and latency at each hop.
    • Tip: If loss starts at Hop 3 and continues to the final hop, Hop 3 (or the link into it) is the likely problem. If loss appears only at Hop 3 and Hop 4 shows 0%, Hop 3 is just rate-limiting its own ICMP replies, which is harmless.
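To track latency over time you usually want just the average RTT from ping's summary line. A small sketch, assuming the Linux (iputils) or BSD/macOS summary format that contains "min/avg/max"; avg_rtt_ms is a hypothetical helper:

```shell
# Hypothetical helper: read ping output on stdin and print the average RTT in ms.
# Matches the summary line, e.g.
#   rtt min/avg/max/mdev = 0.321/0.402/0.512/0.071 ms
avg_rtt_ms() {
    awk -F' = ' '/min\/avg\/max/ { split($2, v, "/"); print v[2] }'
}

# Usage: ping -c 5 db.prod.internal | avg_rtt_ms
```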

B. Bandwidth (The "Width")

  • Definition: The maximum amount of data that can be transmitted in a fixed amount of time.
  • Analogy: The number of lanes on the highway.
  • The Cause: Link capacity (1Gbps cable vs 100Mbps cable).
  • Tools:
    • iperf3: The gold standard. Requires installation on both ends (client and server). It saturates the link with test traffic to measure achievable throughput.
# Server side
iperf3 -s
# Client side
iperf3 -c [Server_IP]


C. The Hidden Trap: Throughput & Window Size

You can have huge Bandwidth (10Gbps) and low Throughput if Latency is high.

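  • TCP Window Size: A TCP sender may have only one window's worth of unacknowledged data in flight. Once the window is full, it must wait for an acknowledgment (ACK) before sending more. If the Latency (RTT) is high, the sender spends most of its time waiting, not sending.
  • Bandwidth-Delay Product (BDP): In "Long Fat Networks" (High Bandwidth + High Latency, like Trans-Atlantic cables), the TCP window must be at least as large as the BDP to keep the pipe full.
  • Formula: BDP = Bandwidth × RTT (for example, 1 Gbps × 100 ms = 100 Mbit ≈ 12.5 MB in flight). Maximum throughput per connection ≈ Window Size / RTT.
  • DevOps Fix: Confirm window scaling is enabled (net.ipv4.tcp_window_scaling, on by default in modern kernels) and raise the socket buffer limits (net.ipv4.tcp_rmem, net.ipv4.tcp_wmem, net.core.rmem_max, net.core.wmem_max).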
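The arithmetic is easy to sanity-check in the shell. This sketch uses the 10Gbps figure from above and an assumed 80 ms RTT (an illustrative number, not a measurement); bdp_bytes and max_throughput_kbps are hypothetical helper names:

```shell
# BDP in bytes = bandwidth (bits/s) * RTT (s) / 8.
# With Mbit/s and ms as inputs: mbps * 1e6 * (rtt_ms / 1000) / 8 = mbps * rtt_ms * 125.
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

# Upper bound on single-connection throughput for a given window:
# window_bytes * 8 bits / rtt_ms = bits per millisecond = kbit/s (integer division).
max_throughput_kbps() {
    echo $(( $1 * 8 / $2 ))
}

bdp_bytes 10000 80            # 10 Gbps link, 80 ms RTT   -> 100000000 (100 MB)
max_throughput_kbps 65535 80  # unscaled 64 KiB window    -> 6553 (about 6.5 Mbps)
```

In other words, without window scaling a single connection on that path tops out around 6.5 Mbps no matter how fat the link is.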

Here is a Real-World Troubleshooting Cheat Sheet.

The Scenario:
You are a DevOps Engineer. A developer complains: "The Web App can't connect to the Database (PostgreSQL), or it's extremely slow."

Your Mission: Isolate the root cause using the tools we just discussed.


Step 1: The "Is it Alive?" Check (Layer 3 - Network)

Goal: Determine if the Database server is reachable network-wise.

Tool: mtr (or ping)
Run this from the Web Server:

mtr -r -c 10 db.prod.internal


Analyze the Output:

  • Scenario A (Good): 0% Packet Loss, Low Latency (<1ms for LAN).
  • Verdict: Network path is fine. Proceed to Step 2.

  • Scenario B (Bad - 100% Loss): "Destination Host Unreachable."

  • Verdict: The server is down, or there is no route (Routing Table issue).

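  • Scenario C (Bad - High Loss): Loss starts at Hop 2 and persists through the final hop.

  • Verdict: The router or link at Hop 2 is likely dropping traffic. If the later hops show 0% loss, Hop 2 is only rate-limiting ICMP and the path is fine.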
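Since what matters is whether loss reaches the destination, a script can simply look at the last hop of the report. A minimal sketch, assuming mtr's report layout where hop lines start with "N.|--" and the third column is Loss%; final_hop_loss is a hypothetical helper:

```shell
# Hypothetical helper: read an `mtr -r` report on stdin and print the
# loss percentage of the final hop (the destination).
final_hop_loss() {
    awk '/^ *[0-9]+\.\|--/ { loss = $3 }
         END { sub(/%/, "", loss); print loss }'
}

# Usage: mtr -r -c 10 db.prod.internal | final_hop_loss
# 0.0 at the final hop means mid-path loss is most likely ICMP rate limiting.
```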


Step 2: The "Address Book" Check (Layer 7 - DNS)

Goal: Ensure the application is trying to connect to the correct IP address.

Tool: dig

dig +short db.prod.internal


Analyze the Output:

  • Output: 10.0.1.50
  • Action: Compare this IP with your AWS Console/Inventory. Is it the correct DB server?
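  • Trap: Sometimes a developer hardcodes an old IP in /etc/hosts. dig queries DNS directly and never reads that file, so check it too (getent hosts db.prod.internal shows what applications actually resolve).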
  • Trap: If you get NXDOMAIN, the DNS record is missing entirely.
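Both traps can be checked in one go by comparing DNS with the system resolver. This is a sketch under a few assumptions: a glibc system (getent ahostsv4), a name with a single A record, and a hypothetical helper name check_name:

```shell
# Hypothetical helper: compare what DNS returns with what the system resolver
# (which also consults /etc/hosts) hands to applications.
# Assumes glibc getent and a name with a single A record.
check_name() {
    local name=$1 dns sys
    dns=$(dig +short "$name" | tail -n 1)
    sys=$(getent ahostsv4 "$name" | awk '{print $1; exit}')
    if [ -z "$dns" ] && [ -z "$sys" ]; then
        echo "unresolved"
    elif [ "$dns" = "$sys" ]; then
        echo "match $sys"
    else
        echo "mismatch dns=${dns:-none} system=${sys:-none}"
    fi
}

# Usage: check_name db.prod.internal
```

A mismatch usually points at an /etc/hosts override or a stale local cache.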

Step 3: The "Is the Door Open?" Check (Layer 4 - Transport)

Goal: The server is up, and the IP is right. Is the Database software listening on Port 5432, or is a Firewall blocking us?

Tool: nc (Netcat) or telnet

nc -zv 10.0.1.50 5432


Analyze the Output:

  • Scenario A (Success): Connection to 10.0.1.50 5432 port [tcp/postgresql] succeeded!
  • Verdict: Firewall is open, DB is listening. The issue is likely Application Layer (wrong password, DB overload).

  • Scenario B (Connection Refused): Ncat: Connection refused.

  • Verdict: Packet reached the server, but the Server said "Go Away." The DB service is likely crashed/stopped.

  • Scenario C (Timeout): It hangs until the connect attempt times out (add -w 3 to give up after 3 seconds).

  • Verdict: Firewall Drop. The packet hit a black hole (Security Group/UFW). It never got a reply.
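The three outcomes can be told apart in a script without parsing nc's messages, which differ between netcat variants. The sketch below swaps in bash's built-in /dev/tcp redirection plus coreutils timeout instead of nc; probe_port and the 3-second default are hypothetical choices:

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a TCP port as open / refused / timeout.
# Uses bash /dev/tcp redirection and coreutils timeout (exit code 124 on timeout).
probe_port() {
    local host=$1 port=$2 wait=${3:-3}
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    case $? in
        0)   echo "open" ;;
        124) echo "timeout" ;;  # no reply at all: packets are probably dropped
        *)   echo "refused" ;;  # immediate failure: RST received (or the name lookup failed)
    esac
}

# Usage: probe_port 10.0.1.50 5432
```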


Step 4: The "Deep Dive" (Packet Analysis)

Goal: The connection is "flaky" or "slow," but netcat works intermittently. We need to see the handshake.

Tool: tcpdump
Run this on the Web Server while triggering the database connection:

# Capture traffic to the DB IP on port 5432, don't resolve names (-n)
sudo tcpdump -i eth0 -n host 10.0.1.50 and port 5432


Analyze the Output:

  • Case 1: The "Unanswered SYN" (Firewall/Packet Loss)
12:01:01 IP WebServer > DBServer: Flags [S], seq 123...
12:01:02 IP WebServer > DBServer: Flags [S], seq 123... (Retransmission)
12:01:04 IP WebServer > DBServer: Flags [S], seq 123... (Retransmission)

  • Diagnosis: You see only [S] (SYN) packets going out, but no reply. The other side is ignoring you. Confirm Firewall/Security Groups.

  • Case 2: The "Reset" (Service Down)

12:01:01 IP WebServer > DBServer: Flags [S]
12:01:01 IP DBServer > WebServer: Flags [R.], seq 0

  • Diagnosis: You see an [R] (RST) flag immediately. The server OS received the request but no application was bound to that port to handle it. Check if Postgres Service is running.

  • Case 3: The "Zero Window" (Overload)

12:01:01 IP DBServer > WebServer: Flags [.], win 0

  • Diagnosis: win 0 means the Database Server is screaming "STOP! My buffer is full." Its receive buffer is full because the Postgres process is not reading from the socket fast enough. The DB is usually CPU, memory, or disk I/O starved.
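The three cases can be recognised mechanically from the Flags and win fields. A rough sketch that reads tcpdump text output on stdin; classify_capture and its labels are hypothetical, and it only knows the three patterns described above:

```shell
# Hypothetical helper: read tcpdump text output on stdin and name the pattern.
classify_capture() {
    awk '
        /win 0(,|$)/    { zero = 1 }   # receiver advertised a zero window
        /Flags \[R/     { rst = 1 }    # RST seen
        /Flags \[S\]/   { syn++ }      # bare SYN (no ACK)
        /Flags \[S\.\]/ { synack++ }   # SYN-ACK reply
        END {
            if (zero)                    print "zero-window"
            else if (rst)                print "reset"
            else if (syn > 1 && !synack) print "syn-retransmit"
            else                         print "no-known-pattern"
        }'
}

# Usage: sudo tcpdump -i eth0 -n -c 50 host 10.0.1.50 and port 5432 | classify_capture
```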

Summary Checklist

| Symptom | Tool to Use | Likely Cause |
| --- | --- | --- |
| "Host Unreachable" | ping / mtr | Network down, Routing issue. |
| "NXDOMAIN" | dig | DNS typo or missing record. |
| "Connection Refused" | nc -zv | Service (Postgres) is stopped. |
| "Connection Timed Out" | nc -zv | Firewall (AWS Security Group) Dropping packets. |
| "Connection Reset" | tcpdump | Service crashed or misconfigured Proxy. |
| "Slow / Stalling" | tcpdump | Packet Loss (Retransmissions) or Server Overload (Zero Window). |
