Pen Testing Tools Explained: Nessus, Burp Suite, Nmap, Metasploit — What They Do and What They Miss

#cybersecurity #burpsuite #nmap #metasploit

TLDR: Nessus, Burp Suite, Nmap, Metasploit, ZAP — these are the tools in every pen tester's arsenal. You've probably heard of most of them. Your DevOps team may already run some of them. But here's what most vendors won't tell you: every single one of these tools has a hard boundary where it stops working — and a human takes over. Understanding that boundary is the difference between a security programme that checks boxes and one that actually finds what an attacker would find.

Tools Don't Hack. Testers Do.

There's a narrative in the security industry — reinforced by vendors, marketing decks, and compliance frameworks — that the right tool equals the right result. Run Nessus. Get a report. Fix the findings. Done.

I've been doing this long enough to know that story is comfortable but incomplete. Tools are how a pen tester starts. They are never how a pen tester finishes. Here's an honest breakdown of what each major tool actually does — and where each one hands off to human judgment.

Nmap — The First Thing We Run

What it does: Nmap is a network scanner. It discovers what's alive on a network, which ports are open on each host, what services are running on those ports, and — in many cases — what software versions are running. It's the reconnaissance layer. Before a tester touches anything else, they need a map.

A typical Nmap scan tells us: there's a web server on port 443, an SSH service on port 22, a database port that really shouldn't be internet-facing, and a forgotten dev server running an outdated version of nginx.

Where it stops: Nmap tells you what exists. It has no opinion on whether any of it is exploitable, misconfigured, or logically flawed. An open port is just an open port until a human decides what to do with it.
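
For readers who have never looked under the hood, here is a minimal sketch of the simplest technique Nmap offers, a TCP connect scan, written in Python. The function name `scan_ports` and the default timeout are my own choices for illustration; real Nmap adds SYN scanning, service and version detection, timing controls, and a great deal more.

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return the ports in `ports` that accept a TCP connection on `host`.

    Illustrative only: a plain connect() scan with no service detection.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Calling `scan_ports("127.0.0.1", range(1, 1025))` against your own machine returns the ports that accepted a connection. Notice that the output is only a list of numbers, which is exactly the limit described above.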

Nessus / Qualys / OpenVAS — CVE Matching at Scale

What they do: These are vulnerability scanners. They take an inventory like the one Nmap builds and match it against massive databases of known CVEs (Common Vulnerabilities and Exposures). They identify unpatched software, weak cipher suites, default credentials still in place, deprecated protocols, and configuration issues measured against hardening standards such as the CIS Benchmarks.

This is genuinely valuable. If you're running Apache 2.4.49 and it's October 2021, Nessus will tell you that it's vulnerable to CVE-2021-41773, a path traversal flaw (with remote code execution in some configurations) that was already being exploited in the wild when it was disclosed. That's not a trivial finding.

Where they stop: Nessus knows nothing about your application. It doesn't know that the /admin panel requires authentication — or that the authentication can be bypassed with a specific header. It doesn't know that your payment flow has a race condition. It matches signatures. Anything that requires understanding context, intent, or business logic is outside its scope.
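
Stripped down to its core, the matching step looks something like the sketch below. The `KNOWN_VULNS` table and the `match_known_vulns` function are hypothetical stand-ins, not how Nessus is implemented; real scanners ship many thousands of plugins with far richer checks than a version string comparison.

```python
# Hypothetical, hand-written signature table for illustration only.
KNOWN_VULNS = {
    ("apache httpd", "2.4.49"): ["CVE-2021-41773"],
}


def match_known_vulns(inventory):
    """Match discovered services against the signature table.

    `inventory` is a list of dicts with host, port, product, and version keys.
    Returns a list of (host, port, cve) tuples.
    """
    findings = []
    for service in inventory:
        key = (service["product"].lower(), service["version"])
        for cve in KNOWN_VULNS.get(key, []):
            findings.append((service["host"], service["port"], cve))
    return findings
```

A service that is not in the table, such as your own in-house application, simply produces no finding. That silence says nothing about whether it is safe.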

As we wrote in our post on scanner limitations, these tools find the known and miss the novel. Attackers aren't limited to the known.

OWASP ZAP / Nikto — Web Application Baseline

What they do: Where Nessus scans infrastructure, ZAP and Nikto target the web layer specifically. ZAP crawls your app and submits inputs with payloads designed to trigger common vulnerabilities such as reflected XSS, basic SQL injection, and open redirects, and it flags missing security headers along the way. Nikto requests a long list of known-risky files, default pages, and outdated server software. Both report what fires.

ZAP in particular has an active scanner mode that is genuinely useful for catching low-hanging fruit quickly. If your application is missing a Content-Security-Policy header or has a reflected XSS in a search field, ZAP will find it.

Where they stop: They crawl what they can see. Anything that requires authentication state, multi-step flows, or understanding how parts of the application interact with each other is largely invisible to an automated crawler. They also produce enough false positives that someone still has to review and validate every finding by hand.
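
One of the simplest checks in this category, the missing-header check, can be sketched in a few lines. The header list and the function name are assumptions for illustration; ZAP's own rule set is much larger and more nuanced.

```python
# A small, illustrative subset of headers a baseline scan might expect.
EXPECTED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
)


def missing_security_headers(response_headers):
    """Return the expected headers absent from a response.

    `response_headers` is a mapping of header name to value.
    HTTP header names are case-insensitive, so compare in lowercase.
    """
    present = {name.lower() for name in response_headers}
    return [h for h in EXPECTED_HEADERS if h.lower() not in present]
```

A check like this is cheap and worth running on every build, but it only inspects the response it is handed. It has no idea whether the page behind the login form returns different headers.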

Burp Suite — Where Things Get Interesting

What it does: Burp Suite is the tool I spend the most time in on any engagement, and it's worth explaining why — because it operates differently from everything above.

Burp sits as a proxy between the tester's browser and the target application. Every request and response passes through it. That means the tester sees exactly what data is being sent, exactly what the server responds with, and can intercept, modify, and replay any request in real time.

In scanner mode, Burp can run automated checks similar to ZAP. But that's not where its value lies.

The manual mode is where it becomes a different category of tool entirely. A tester who is actively working through Burp — replaying authentication requests with modified parameters, changing user IDs in API calls, testing how the application responds to unexpected inputs in multi-step flows — is doing something fundamentally different from running a scanner. They are thinking adversarially, in real time, based on what the application reveals about itself as they probe it.

This is the gap between a tool that finds and a tester that tests. Burp is the interface through which that testing happens.
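
To show what "changing user IDs in API calls" means in practice, here is a toy sketch. There is no Burp API involved; the in-memory `DOCUMENTS` store, both handlers, and `replay_with_other_ids` are all hypothetical stand-ins for a real target and for a tester replaying requests by hand.

```python
# Toy stand-in for a target API: documents keyed by ID, each with an owner.
DOCUMENTS = {
    101: {"owner": "alice", "body": "alice's invoice"},
    102: {"owner": "bob", "body": "bob's invoice"},
}


def vulnerable_get_document(session_user, doc_id):
    """Checks that the caller is logged in but never checks ownership."""
    if session_user is None:
        return 401, None
    doc = DOCUMENTS.get(doc_id)
    return (200, doc["body"]) if doc else (404, None)


def fixed_get_document(session_user, doc_id):
    """Same endpoint with an ownership check added."""
    if session_user is None:
        return 401, None
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc["owner"] != session_user:
        return 404, None
    return 200, doc["body"]


def replay_with_other_ids(handler, session_user, own_id, candidate_ids):
    """Replay the same request with other IDs; return the IDs that leak data."""
    leaked = []
    for doc_id in candidate_ids:
        if doc_id == own_id:
            continue
        status, _body = handler(session_user, doc_id)
        if status == 200:
            leaked.append(doc_id)
    return leaked
```

Logged in as alice, replaying the request against the vulnerable handler with IDs 101 to 103 leaks document 102, while the fixed handler leaks nothing. Both versions return a perfectly valid 200 for alice's own document, which is why a signature-based scanner sees nothing wrong with either.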

Metasploit — Exploit Validation, Not Discovery

What it does: Metasploit is a framework for running known exploits against known vulnerabilities. If Nessus tells you that a host is running a service vulnerable to CVE-2024-XXXX, Metasploit may have a module that can confirm whether that vulnerability is actually exploitable in your specific environment. Module coverage is far from complete, so plenty of CVEs have no module at all.

It's also an excellent framework for post-exploitation — simulating what an attacker does after they gain initial access. Can they move laterally? Escalate privileges? Reach your database server? Metasploit is how testers answer those questions with controlled, validated techniques.

Where it stops: Metasploit does not discover vulnerabilities. It exploits the ones that are already known and catalogued. The most significant vulnerabilities in most modern applications — business logic flaws, broken access control, chained exploits — have no Metasploit module because they're unique to your codebase. You can't automate exploiting a vulnerability that's specific to the way your application handles a password reset.

The Human Judgment Layer

Here's how these tools actually fit into a real engagement.

We start with Nmap — building the map. Then automated scanning with Nessus or equivalents — clearing the known CVE landscape quickly. Then Burp and ZAP for web application baseline coverage. All of this happens in the first few hours.

Then the tools move into the background and the actual work begins.

The manual phase is where a tester asks: what does this application actually do? What happens if I call this API endpoint as User A but with User B's resource ID? What happens if I complete step 1 of this checkout flow and then skip directly to step 3? What happens if I submit a negative quantity? What happens if I call the password reset endpoint 200 times in 60 seconds?

None of those questions have tool answers. They have human answers — arrived at through curiosity, pattern recognition, and a working understanding of how developers make mistakes under deadline pressure.
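
To make one of those questions concrete, here is a toy version of the negative quantity problem. Everything in it is hypothetical: the `PRICES` table, the SKUs, and both functions exist only to illustrate the class of bug. The check itself is trivial to write once someone has thought of the question. Knowing to ask it is the part that takes a human.

```python
# Hypothetical price list for a toy checkout.
PRICES = {"widget": 25, "gadget": 100}


def naive_order_total(items):
    """Sum price * quantity with no validation, as a rushed developer might."""
    return sum(PRICES[sku] * quantity for sku, quantity in items)


def validated_order_total(items):
    """Same calculation, but reject quantities that make no business sense."""
    for sku, quantity in items:
        if quantity <= 0:
            raise ValueError(f"invalid quantity for {sku}: {quantity}")
    return naive_order_total(items)
```

With the naive version, an order of one gadget and minus four widgets totals zero, so the gadget ships for free. The validated version raises `ValueError` on the same input.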

If you want to understand what a full manual engagement covers, this post walks through our complete checklist. And if you've never commissioned a pen test before and aren't sure what to expect, start here.

The Honest Conclusion

Every tool in this list is genuinely useful. We use all of them. But they are starting points — ways of quickly eliminating the obvious so we can focus on the interesting.

The interesting is where your actual risk lives. It's in the places an automated tool can't reach because reaching there requires understanding your application, your architecture, and your users well enough to ask the questions an attacker would ask.

Tools find what they were built to find. Attackers find everything else.

Are you running any of these tools in your own pipeline? Curious whether your team treats them as a starting point or a final answer — drop a comment.

At Kuboid Secure Layer, manual assessment is the core of every engagement — tools included, human judgment first. See what a full assessment covers or book a free consultation to talk through what your application needs.