DEV Community

Ganesh hari

Building a Fast Automated Web Security Scanner Using Python and Open-Source Linux Tools

In modern web security testing, automation plays a critical role in quickly identifying potential vulnerabilities and misconfigurations. Instead of manually running multiple tools one by one, I built a Python-based automated scanner that integrates widely used Linux security tools into a single workflow.

This project combines the capabilities of Nmap, WhatWeb, and Nikto to analyze a target website efficiently and present results in both technical and human-readable formats.

Objective of the Project

The goal of this project is to:

  • Automate web security reconnaissance
  • Reduce manual effort in running multiple tools
  • Provide a simplified explanation of technical results
  • Generate structured output for further analysis

This approach is particularly useful for beginners, students, and developers who want to understand web security without getting overwhelmed by raw command-line outputs.

How the Scanner Works

The system is designed using Python as the orchestration layer. It interacts with external security tools using the subprocess module and processes their outputs programmatically.

Workflow Overview

  1. The user enters a target hostname (the script expects it to start with www.)
  2. Python resolves the domain to an IP address
  3. WhatWeb, Nmap, and Nikto are executed sequentially
  4. Outputs are parsed using regular expressions
  5. Results are summarized and explained
  6. An optional JSON report is generated
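
Steps 1 and 2 can be sketched as follows. This is a minimal illustration rather than the scanner's own input handling: the helper names extract_host and resolve are hypothetical, and accepting full URLs is an assumption that goes beyond what the script at the end of this post does.

```python
import socket
from typing import Optional
from urllib.parse import urlsplit


def extract_host(target: str) -> str:
    """Return the lowercase hostname from a URL or bare host (hypothetical helper)."""
    target = target.strip()
    # urlsplit only fills in the host when a scheme or a leading '//' is present.
    if "://" not in target:
        target = "//" + target
    host = urlsplit(target).hostname
    if not host:
        raise ValueError(f"no hostname found in {target!r}")
    return host


def resolve(host: str) -> Optional[str]:
    """Resolve a hostname to an IPv4 address, or return None if the lookup fails."""
    try:
        return socket.gethostbyname(host)
    except (OSError, UnicodeError):  # socket.gaierror is a subclass of OSError
        return None
```

For example, extract_host("https://www.example.com/login") returns "www.example.com", which can then be passed to resolve() before any tool is started.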

Tool Integration and Use Cases

1. Nmap – Port and Service Discovery

Nmap is used to identify open ports and exposed services on the target system.

Purpose:

  • Detect open TCP ports
  • Identify network exposure
  • Infer HTTPS availability from whether port 443 is open

Implementation Highlights:

  • Fast scan mode (-F), which scans fewer ports than the default scan
  • Aggressive timing template (-T4)
  • Shows only open ports (--open)

This ensures faster execution while still providing meaningful results.
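
A minimal sketch of how these flags and the port parsing fit together. The sample text is hand-written in Nmap's normal output layout, not captured from a real scan, and build_nmap_command and parse_open_tcp_ports are hypothetical helper names. Unlike the script at the end of this post, this version returns ports as integers.

```python
import re


def build_nmap_command(host):
    """Argument list for a fast scan (hypothetical helper name)."""
    return [
        "nmap",
        "-F",      # fast mode: scan fewer ports than the default scan
        "-T4",     # "aggressive" timing template
        "--open",  # only show open (or possibly open) ports
        host,
    ]


def parse_open_tcp_ports(output):
    """Return open TCP port numbers as integers, in order of appearance."""
    # Requiring whitespace after "open" skips states such as "open|filtered".
    pattern = r"^(\d+)/tcp\s+open\s"
    return [int(port) for port in re.findall(pattern, output, re.MULTILINE)]


# Hand-written sample in Nmap's normal output layout (not a real scan result).
SAMPLE = """\
PORT    STATE SERVICE
22/tcp  open  ssh
80/tcp  open  http
443/tcp open  https
"""

ports = parse_open_tcp_ports(SAMPLE)
print(ports)         # [22, 80, 443]
print(443 in ports)  # True
```

Anchoring the pattern to the start of each line keeps it from matching port-like text elsewhere in the output.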

2. WhatWeb – Technology Detection

WhatWeb helps identify the technologies used by a website.

Purpose:

  • Detect web server (Apache, Nginx, etc.)
  • Identify programming languages (PHP)
  • Detect CMS platforms like WordPress

Parsing Strategy:

The output is analyzed using regular expressions to extract:

  • Server type
  • PHP version
  • CMS presence

This enables structured reporting instead of raw text analysis.
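
The parsing strategy can be illustrated on a single line of output. The sample below is hand-written in WhatWeb's brief one-line format (it is not real scan output), and first_bracket_value and parse_technologies are hypothetical helper names.

```python
import re

# Hand-written sample in WhatWeb's brief format (not real scan output).
SAMPLE = (
    "http://www.example.com [200 OK] HTTPServer[nginx/1.18.0], "
    "IP[192.0.2.10], PHP[7.4.3], Title[Example], WordPress[5.8]"
)


def first_bracket_value(plugin, output):
    """Return the first [...] value reported for a plugin name, or None."""
    match = re.search(rf"\b{re.escape(plugin)}\[([^\]]*)\]", output)
    return match.group(1) if match else None


def parse_technologies(output):
    """Extract server, PHP version, and WordPress presence from WhatWeb text."""
    return {
        "server": first_bracket_value("HTTPServer", output) or "Unknown",
        "php": first_bracket_value("PHP", output) or "Unknown",
        "wordpress": re.search(r"\bWordPress\b", output) is not None,
    }


print(parse_technologies(SAMPLE))
# {'server': 'nginx/1.18.0', 'php': '7.4.3', 'wordpress': True}
```

WhatWeb can print more than one bracketed value for a plugin. This sketch keeps only the first one, so a line that reports an operating system before the server name would need extra handling.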

3. Nikto – Vulnerability Scanning

Nikto is used to detect common web server vulnerabilities and misconfigurations.

Purpose:

  • Identify outdated software
  • Detect exposed files
  • Highlight security issues

Optimization:

  • Limited scan scope using -Tuning (the script passes 123bde)
  • Maximum execution time capped at 120 seconds (-maxtime 120)

This balances speed and effectiveness.
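
The 120-second cap is enforced by Nikto itself. As a hedged sketch, a second safety net can be added on the Python side with the timeout argument of subprocess.run. The helper run_with_timeout is hypothetical, and the demo uses the Python interpreter as a stand-in for a slow tool so that it runs without Nikto installed.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Return cmd's stdout, or "" if it cannot start or exceeds timeout_s."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return ""
    except OSError:
        # The executable is missing or could not be started.
        return ""
    return result.stdout


# Stand-ins for a slow and a fast scanner: child Python processes.
slow = [sys.executable, "-c", "import time; time.sleep(5); print('done')"]
fast = [sys.executable, "-c", "print('done')"]

print(repr(run_with_timeout(slow, timeout_s=0.5)))  # ''
print(repr(run_with_timeout(fast, timeout_s=10)))   # 'done\n'
```

With Nikto, a Python-side timeout somewhat larger than the -maxtime value would give the tool room to stop on its own first.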

Intelligent Result Processing

One of the key features of this project is not just running scans, but making the results understandable.

Example Enhancements:

  • Open ports are interpreted as “network access points”
  • HTTPS detection is explained in terms of user data protection
  • CMS detection includes security advice
  • The findings count is converted into a risk level:
  1. Low (fewer than 5 findings)
  2. Medium (5 to 14 findings)
  3. High (15 or more findings)

This transforms technical output into meaningful insights.
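
The risk mapping is simple enough to isolate in one function. The sketch below uses the same thresholds as the script at the end of this post (fewer than 5, fewer than 15, otherwise high); the function name risk_level is hypothetical, and the count is a rough heuristic rather than a severity score.

```python
def risk_level(findings):
    """Map a findings count to a coarse risk label."""
    if findings < 0:
        raise ValueError("findings must be non-negative")
    if findings < 5:
        return "LOW"
    if findings < 15:
        return "MEDIUM"
    return "HIGH"


for count in (0, 4, 5, 14, 15):
    print(count, risk_level(count))
# 0 LOW
# 4 LOW
# 5 MEDIUM
# 14 MEDIUM
# 15 HIGH
```

Keeping the thresholds in one place means the console report and the JSON export cannot drift apart.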

Human-Friendly Report Generation

Instead of overwhelming users with raw logs, the scanner produces a structured explanation:

  1. Web server details
  2. Programming language usage
  3. CMS detection
  4. Open ports and their significance
  5. Security findings and risk level
  6. Practical security recommendations

This makes the tool useful not only for professionals but also for learners.

JSON Export Feature

The scanner includes an optional feature to export results in JSON format.

Benefits:

  • Easy storage of scan results
  • Integration with other systems
  • Future dashboard visualization
  • API-based extensions

Each scan is saved with:

  • Target details
  • Detected technologies
  • Open ports
  • Risk level
  • Timestamp
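
A minimal sketch of how such a record could be assembled and serialized. The function name build_report and the sample values are illustrative, and the ISO 8601 UTC timestamp is an assumption; the script at the end of this post stores a local YYYYMMDD_HHMMSS string instead.

```python
import json
from datetime import datetime, timezone


def build_report(target, ip, tech, ports, findings):
    """Assemble one scan record as a plain dict ready for json.dump."""
    return {
        "target": target,
        "ip": ip,
        "server": tech.get("server", "Unknown"),
        "php": tech.get("php", "Unknown"),
        "wordpress": bool(tech.get("wordpress", False)),
        "ports": list(ports),
        "https": "443" in ports,  # ports are strings here, as in the script
        "findings_count": findings,
        "risk_level": "LOW" if findings < 5 else "MEDIUM" if findings < 15 else "HIGH",
        # Assumption: ISO 8601 in UTC, which sorts and parses easily.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }


# Illustrative values only (documentation hostname and address range).
report = build_report(
    "www.example.com",
    "192.0.2.10",
    {"server": "nginx", "php": "Unknown", "wordpress": False},
    ["80", "443"],
    3,
)
text = json.dumps(report, indent=2)
print(text)
```

Because the record contains only strings, numbers, booleans, and lists, it round-trips through json.loads without custom encoders.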

Performance Considerations

To improve efficiency, the following optimizations were implemented:

  • Fast scanning modes for all tools
  • Limited scan scope where possible
  • Reduced unnecessary checks
  • Silent subprocess execution

Despite using multiple tools, the scanner maintains a balance between speed and accuracy.

Challenges Faced

During development, several practical challenges were encountered:

  • Parsing unstructured CLI output
  • Managing scan execution time
  • Handling tool dependencies in Linux
  • Ensuring stable subprocess execution

These challenges were addressed through controlled command execution and output parsing strategies.
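
For the dependency problem in particular, one option is to check that every tool is on PATH before scanning. This is a sketch using the standard library's shutil.which; the helper name missing_tools is hypothetical and is not part of the script at the end of this post.

```python
import shutil

REQUIRED_TOOLS = ("nmap", "whatweb", "nikto")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools are installed.")
```

Failing early with a clear message is easier to debug than an empty report caused by a tool that never ran.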

Future Improvements

This project can be extended into a full-scale security platform by adding:

  • Parallel execution using threading or multiprocessing
  • Integration with tools like Gobuster and OWASP ZAP
  • Web-based dashboard using Flask
  • Database storage for scan history
  • Authentication system for multiple users
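
As a rough sketch of the first item, independent commands could be run concurrently with a thread pool. The names run and run_parallel are illustrative, and the demo uses the Python interpreter as a stand-in so that it runs without the security tools installed. In the current script, Nikto's protocol choice depends on Nmap's result, so only WhatWeb and Nmap are independent enough to run side by side.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run(cmd):
    """Run one command and return its stdout ("" if it cannot be started)."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True).stdout
    except OSError:
        return ""


def run_parallel(commands):
    """Run named commands concurrently and return {name: stdout}."""
    # max_workers must be at least 1, even for an empty mapping.
    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        futures = {name: pool.submit(run, cmd) for name, cmd in commands.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-ins for two independent tools.
demo = {
    "first": [sys.executable, "-c", "print('a')"],
    "second": [sys.executable, "-c", "print('b')"],
}
print(run_parallel(demo))  # {'first': 'a\n', 'second': 'b\n'}
```

Threads are a reasonable fit here because each worker spends its time waiting on an external process rather than executing Python code.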

Conclusion

This project demonstrates how traditional command-line security tools can be transformed into a programmable and automated security solution using Python.

By combining tools like Nmap, WhatWeb, and Nikto, and adding intelligent parsing and reporting, we can build a system that is both technically powerful and user-friendly.

Automation in cybersecurity is not just about speed—it’s about making complex data accessible, actionable, and scalable.

Final Note

This blog presents the concept, workflow, and output of my automated scanner. The implementation includes Python-based subprocess execution, result parsing, and structured reporting.

In the next step, I will extend this project by comparing:

  1. Manual Nmap execution
  2. Python subprocess-based execution

to further improve automation and performance.

import subprocess
import socket
import re
import json
from datetime import datetime


# --------------------------------------------------
# RUN COMMAND (FAST + SILENT)
# --------------------------------------------------
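def run(cmd):
    """Run an external tool quietly and return its standard output as text."""
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,  # keep tool output off the console
            text=True
        )
        return result.stdout
    except OSError as e:
        # Raised when the tool is missing or cannot be started. Return an
        # empty string so the parsers never treat an error message as output.
        print(f"⚠️ Could not run {cmd[0]}: {e}")
        return ""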


# --------------------------------------------------
# PARSE WHATWEB OUTPUT
# --------------------------------------------------
def parse_whatweb(output):

    server = re.search(r"HTTPServer\[(.*?)\]", output)
    php = re.search(r"PHP\[(.*?)\]", output)
    wordpress = "WordPress" in output

    return {
        "server": server.group(1) if server else "Unknown",
        "php": php.group(1) if php else "Unknown",
        "wordpress": wordpress
    }


# --------------------------------------------------
# PARSE NMAP OUTPUT
# --------------------------------------------------
def parse_nmap(output):
    ports = re.findall(r"(\d+)/tcp\s+open", output)
    return ports


# --------------------------------------------------
# PARSE NIKTO OUTPUT
# --------------------------------------------------
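def parse_nikto(output):
    # Nikto prefixes each report line with "+ ". Count those lines instead of
    # every "+" character, and skip banner lines that are not findings.
    # Assumption: these banner prefixes match the installed Nikto version.
    banner = ("+ Target", "+ Start Time", "+ End Time", "+ Server:", "+ SSL Info")
    issues = 0
    for line in output.splitlines():
        if not line.startswith("+ "):
            continue
        if line.startswith(banner) or "host(s) tested" in line:
            continue
        issues += 1
    return issues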


# --------------------------------------------------
# HUMAN FRIENDLY EXPLANATION
# --------------------------------------------------
def explain_results(tech, ports, issues):

    print("\n==============================")
    print("   EASY EXPLANATION REPORT")
    print("==============================\n")

    # SERVER
    print("🌐 Website Server")
    print("------------------")
    print(f"The website runs on '{tech['server']}' server software.")
    print("A web server is responsible for sending webpages to visitors.")
    print("This is normal for all websites.\n")

    # PHP
    print("⚙️ Website Programming Language")
    print("--------------------------------")
    if tech['php'] != "Unknown":
        print(f"The website uses PHP version {tech['php']}.")
        print("PHP helps websites handle logins, forms, and dynamic content.")
        print("If not updated regularly, older versions may have risks.\n")
    else:
        print("Programming language could not be detected.\n")

    # CMS
    print("🧩 Website Platform (CMS)")
    print("--------------------------")
    if tech['wordpress']:
        print("The website is built using WordPress.")
        print("WordPress is popular and easy to manage.")
        print("Plugins and themes must be updated for security.\n")
    else:
        print("No common CMS platform detected.\n")

    # PORTS
    print("🔌 Network Access (Open Doors)")
    print("-------------------------------")
    print(f"Open ports detected: {', '.join(ports) if ports else 'None'}")
    print("Ports are like doors allowing communication with the server.")

    # An open port 443 suggests HTTPS is offered; it does not validate the certificate.
    if "443" in ports:
        print("✅ Port 443 is open, so HTTPS appears to be available.")
        print("This protects user data while browsing.\n")
    else:
        print("⚠️ Secure HTTPS was NOT detected.\n")

    # SECURITY LEVEL
    print("🛡️ Security Findings")
    print("--------------------")

    if issues < 5:
        level = "LOW 🟢"
        message = "Only minor observations were found."
    elif issues < 15:
        level = "MEDIUM 🟡"
        message = "Some improvements are recommended."
    else:
        level = "HIGH 🔴"
        message = "Multiple potential risks detected."

    print(f"Risk Level: {level}")
    print(message)
    print("\nThis does NOT mean the website is hacked.")
    print("It only shows possible improvements.\n")

    # ADVICE
    print("✅ Simple Advice")
    print("----------------")
    print("• Keep website software updated")
    print("• Hide version information")
    print("• Always use HTTPS")
    print("• Perform regular security scans")

    print("\n==============================\n")


# --------------------------------------------------
# MAIN PROGRAM
# --------------------------------------------------
def main():

    print("\n===================================")
    print("      FAST AUTOMATED WEB SCANNER")
    print("      Nmap | WhatWeb | Nikto")
    print("===================================\n")

    url = input("Enter target URL (www.*): ").strip()

    if not url.startswith("www."):
        print("❌ URL must start with www.")
        return

    # Strip only the leading "www." (str.replace would remove every occurrence).
    domain = url[len("www."):]

    # Resolve IP
    try:
        ip = socket.gethostbyname(domain)
        print(f"\nTarget IP Address: {ip}")
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError
        print("❌ Could not resolve domain.")
        return

    # ---------------- WHATWEB ----------------
    print("\n[+] Detecting website technology...")
    whatweb_output = run([
        "whatweb",
        "-a", "1",
        url
    ])

    tech = parse_whatweb(whatweb_output)

    # ---------------- NMAP ----------------
    print("[+] Performing fast port scan...")
    nmap_output = run([
        "nmap",
        "-F",
        "--open",
        "-T4",
        domain
    ])

    ports = parse_nmap(nmap_output)

    # ---------------- NIKTO ----------------
    print("[+] Running quick vulnerability scan (max 120 sec)...")

    protocol = "https" if "443" in ports else "http"

    nikto_output = run([
        "nikto",
        "-h", f"{protocol}://{url}",
        "-Tuning", "123bde",
        "-maxtime", "120"
    ])

    issues = parse_nikto(nikto_output)

    # ---------------- SUMMARY ----------------
    print("\n==============================")
    print("        SCAN SUMMARY")
    print("==============================\n")

    print(f"Target Website : {url}")
    print(f"IP Address     : {ip}")
    print(f"Web Server     : {tech['server']}")
    print(f"PHP Version    : {tech['php']}")
    print(f"CMS Detected   : {'WordPress' if tech['wordpress'] else 'Not Detected'}")
    print(f"Open Ports     : {', '.join(ports) if ports else 'None'}")

    if "443" in ports:
        print("HTTPS Status   : Enabled ✅")
    else:
        print("HTTPS Status   : Not Enabled ⚠️")

    print(f"Findings Count : {issues}")

    # HUMAN REPORT
    explain_results(tech, ports, issues)

    print("✅ Scan Completed Successfully\n")

    # ---------------- JSON EXPORT ----------------
    save_choice = input("Save the results to a local JSON file? (y/n): ").strip().lower()
    if save_choice == "y":
        now = datetime.now().strftime("%Y%m%d_%H%M%S")
        safe_target = re.sub(r"[^a-zA-Z0-9_-]", "_", url)
        filename = f"scan_result_{safe_target}_{now}.json"

        result_data = {
            "target": url,
            "domain": domain,
            "ip": ip,
            "server": tech["server"],
            "php": tech["php"],
            "wordpress": tech["wordpress"],
            "ports": ports,
            "https": "443" in ports,
            "findings_count": issues,
            "risk_level": "LOW" if issues < 5 else "MEDIUM" if issues < 15 else "HIGH",
            "timestamp": now
        }

        try:
            with open(filename, "w", encoding="utf-8") as f:
                json.dump(result_data, f, indent=2)
            print(f"📄 JSON result saved: {filename}\n")
        except Exception as e:
            print(f"❌ Could not save JSON file: {e}\n")
    else:
        print("ℹ️ JSON export skipped by user.\n")


# --------------------------------------------------
if __name__ == "__main__":
    main()