Automate Website Security Audits with Technology Detection in Python

#python #security #webdev #api

Knowing what technologies a website runs is the first step in any security assessment. Outdated CMS versions, exposed server headers, and legacy JavaScript libraries are all potential attack vectors, and all of them are detectable.

In this tutorial, we'll build a Python tool that scans any website, identifies its technology stack, and flags potential security concerns based on what it finds.

Why Technology Detection Matters for Security

Most security audits start with reconnaissance. Before testing for vulnerabilities, you need to know what's running. A site using WordPress 5.x has a different risk profile than one running Next.js on Vercel.

Manually checking this is tedious. Browser extensions work for one site at a time but don't scale. We'll automate it with an API that detects 141+ technologies programmatically.

Setup

We'll use the Technology Detection API on RapidAPI.

Subscribe to get your API key, then install dependencies:

pip install requests

Step 1: Detect Technologies on a Target

import requests

RAPIDAPI_KEY = "YOUR_RAPIDAPI_KEY"

def detect_technologies(url):
    resp = requests.get(
        "https://technology-detection-api.p.rapidapi.com/detect",
        params={"url": url},
        headers={
            "x-rapidapi-host": "technology-detection-api.p.rapidapi.com",
            "x-rapidapi-key": RAPIDAPI_KEY,
        },
        timeout=30,  # avoid hanging indefinitely on a slow or unresponsive endpoint
    )
    resp.raise_for_status()
    return resp.json()

result = detect_technologies("https://example.com")
technologies = result.get("technologies", [])
print(f"Detected {len(technologies)} technologies")
for tech in technologies:
    name = tech.get("name") or tech.get("technology", "Unknown")
    category = tech.get("category", "Unknown")
    version = tech.get("version", "")
    ver_str = f" v{version}" if version else ""
    print(f"  [{category}] {name}{ver_str}")

This gives you a structured inventory of what the API can detect on the target: CMS, server software, JavaScript libraries, analytics, CDN, and more.
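To read that inventory at a glance, it can help to group the results by category. The following is a minimal sketch, assuming the response shape used above (a list of dicts with `name`, `category`, and an optional `version`). The helper name `group_by_category` and the sample data are illustrative, not part of the API.

```python
from collections import defaultdict

def group_by_category(technologies):
    """Group detected technologies by category for a quick inventory view."""
    grouped = defaultdict(list)
    for tech in technologies:
        # Same field fallbacks as the loop above; the exact schema is an assumption.
        name = tech.get("name") or tech.get("technology") or "Unknown"
        category = tech.get("category") or "Unknown"
        version = tech.get("version") or ""
        grouped[category].append(f"{name} {version}".strip())
    return dict(grouped)

# Hypothetical sample data, not real API output
sample = [
    {"name": "WordPress", "category": "CMS", "version": "6.4"},
    {"name": "jQuery", "category": "JavaScript libraries", "version": "3.6.0"},
    {"name": "Nginx", "category": "Web servers"},
]
inventory = group_by_category(sample)
for category, names in sorted(inventory.items()):
    print(f"{category}: {', '.join(names)}")
```

With a real scan, you would pass `result.get("technologies", [])` instead of the sample list.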

Step 2: Define Security Rules

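Now let's build a rules engine that flags technologies with known security implications. Each rule pairs a match function with a severity and a message template. Rules are independent, so one technology can trigger several findings: WordPress, for instance, matches both its own rule and the generic CMS rule when the API reports its category as CMS.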

# Technologies and patterns that warrant attention in a security audit
SECURITY_RULES = [
    {
        "match": lambda t: t.get("name", "").lower() == "wordpress",
        "severity": "medium",
        "message": "WordPress detected. Verify the version is current and plugins are updated; WordPress sites are a frequent target for automated attacks.",
    },
    {
        "match": lambda t: t.get("name", "").lower() == "jquery"
            and t.get("version", "").startswith(("1.", "2.")),
        "severity": "high",
        "message": "Outdated jQuery version detected ({version}). jQuery < 3.5.0 has known XSS vulnerabilities (CVE-2020-11022).",
    },
    {
        "match": lambda t: t.get("name", "").lower() == "jquery"
            and t.get("version", "").startswith("3."),
        "severity": "low",
        "message": "jQuery 3.x detected. Verify it's 3.5.0+ to avoid known XSS issues.",
    },
    {
        "match": lambda t: t.get("category", "").lower() == "cms",
        "severity": "medium",
        "message": "CMS detected: {name}. Ensure it's running the latest version with security patches applied.",
    },
    {
        "match": lambda t: t.get("name", "").lower() in ("php", "apache", "nginx")
            and t.get("version"),
        "severity": "medium",
        "message": "Server software version exposed: {name} {version}. Consider hiding version headers to reduce information leakage.",
    },
    {
        "match": lambda t: t.get("name", "").lower() in (
            "google analytics", "facebook pixel", "hotjar", "mixpanel"
        ),
        "severity": "info",
        "message": "Third-party tracker detected: {name}. Verify it's included in your privacy policy and cookie consent.",
    },
    {
        "match": lambda t: t.get("name", "").lower() == "bootstrap"
            and t.get("version", "").startswith(("2.", "3.")),
        "severity": "low",
        "message": "Outdated Bootstrap version ({version}). Older versions have known XSS vulnerabilities in tooltips/popovers.",
    },
    {
        "match": lambda t: t.get("name", "").lower() == "angular"
            and t.get("version", "").startswith("1."),
        "severity": "high",
        "message": "AngularJS 1.x detected. This version is end-of-life and no longer receives security patches.",
    },
]
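The rules above match versions by prefix, so the jQuery 3.x rule cannot tell 3.4.1 from 3.6.0. One possible refinement is to compare parsed version numbers instead. This is a sketch only: `parse_version`, `is_older_than`, and `JQUERY_XSS_RULE` are hypothetical names, and it assumes the API returns dotted numeric versions such as "3.4.1".

```python
import re

def parse_version(version):
    """Turn a version string like '3.4.1' into a padded tuple of ints: (3, 4, 1)."""
    parts = [int(p) for p in re.findall(r"\d+", version or "")[:3]]
    if not parts:
        return ()
    # Pad to three components so that '3.5' compares equal to '3.5.0'.
    return tuple(parts + [0] * (3 - len(parts)))

def is_older_than(version, minimum):
    """Return True only if `version` is known and lower than `minimum`."""
    parsed = parse_version(version)
    if not parsed:
        return False  # unknown version: leave it to a separate rule
    return parsed < parse_version(minimum)

# A rule in the same shape as SECURITY_RULES, using a numeric comparison
JQUERY_XSS_RULE = {
    "match": lambda t: (t.get("name") or "").lower() == "jquery"
        and is_older_than(t.get("version"), "3.5.0"),
    "severity": "high",
    "message": "jQuery {version} is older than 3.5.0 (CVE-2020-11022).",
}

print(is_older_than("3.4.1", "3.5.0"))
print(is_older_than("3.6.0", "3.5.0"))
```

A rule like this could stand in for the two prefix-based jQuery rules. Pre-release suffixes and non-numeric version schemes would need extra handling.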

Step 3: Run the Audit

Apply the rules against detected technologies and generate findings:

def run_security_audit(url):
    print(f"\n{'='*60}")
    print(f"  SECURITY AUDIT: {url}")
    print(f"{'='*60}\n")

    data = detect_technologies(url)
    technologies = data.get("technologies", [])

    if not technologies:
        print("  No technologies detected.")
        return []

    print(f"  Detected {len(technologies)} technologies\n")

    findings = []
    for tech in technologies:
        for rule in SECURITY_RULES:
            if rule["match"](tech):
                name = tech.get("name") or tech.get("technology", "Unknown")
                version = tech.get("version") or "N/A"  # also covers empty or null versions
                message = rule["message"].format(
                    name=name, version=version
                )
                finding = {
                    "severity": rule["severity"],
                    "technology": name,
                    "version": version,
                    "message": message,
                }
                findings.append(finding)

    # Sort by severity
    severity_order = {"high": 0, "medium": 1, "low": 2, "info": 3}
    findings.sort(key=lambda f: severity_order.get(f["severity"], 99))

    if findings:
        for f in findings:
            icon = {
                "high": "[HIGH]",
                "medium": "[MED] ",
                "low": "[LOW] ",
                "info": "[INFO]",
            }.get(f["severity"], "[???]")
            print(f"  {icon} {f['message']}")
    else:
        print("  No security findings based on detected technologies.")

    print(f"\n  Total findings: {len(findings)}")
    return findings

findings = run_security_audit("https://example.com")
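Because `run_security_audit` calls the API and prints as it goes, the rules are hard to check in isolation. One option is to split the matching logic into a pure function that takes a technology list and a rule list. The sketch below assumes the same rule and technology shapes as above; `evaluate_rules`, `demo_rules`, and `demo_stack` are illustrative names with made-up data.

```python
def evaluate_rules(technologies, rules):
    """Apply every rule to every technology and return findings sorted by severity."""
    severity_order = {"high": 0, "medium": 1, "low": 2, "info": 3}
    findings = []
    for tech in technologies:
        name = tech.get("name") or tech.get("technology") or "Unknown"
        version = tech.get("version") or "N/A"
        for rule in rules:
            if rule["match"](tech):
                findings.append({
                    "severity": rule["severity"],
                    "technology": name,
                    "version": version,
                    "message": rule["message"].format(name=name, version=version),
                })
    findings.sort(key=lambda f: severity_order.get(f["severity"], 99))
    return findings

# Two illustrative rules in the same shape as SECURITY_RULES
demo_rules = [
    {
        "match": lambda t: (t.get("category") or "").lower() == "cms",
        "severity": "medium",
        "message": "CMS detected: {name}.",
    },
    {
        "match": lambda t: (t.get("name") or "").lower() == "angular"
            and (t.get("version") or "").startswith("1."),
        "severity": "high",
        "message": "AngularJS {version} is end-of-life.",
    },
]

# Hypothetical detection result, not real API output
demo_stack = [
    {"name": "WordPress", "category": "CMS", "version": "6.4"},
    {"name": "Angular", "category": "JavaScript frameworks", "version": "1.8.2"},
]

demo_findings = evaluate_rules(demo_stack, demo_rules)
for f in demo_findings:
    print(f"[{f['severity']}] {f['message']}")
```

Keeping the network call separate from rule evaluation means new rules can be tried against saved scan results without spending API quota.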

Step 4: Batch Audit Multiple Sites

If you manage multiple websites or are auditing a client's portfolio:

import time

def batch_audit(urls):
    all_findings = {}
    for url in urls:
        try:
            findings = run_security_audit(url)
            all_findings[url] = findings
        except Exception as e:
            print(f"\n  Error scanning {url}: {e}")
            all_findings[url] = []
        time.sleep(1)

    # Summary
    print(f"\n{'='*60}")
    print("  BATCH AUDIT SUMMARY")
    print(f"{'='*60}")
    for url, findings in all_findings.items():
        high = sum(1 for f in findings if f["severity"] == "high")
        medium = sum(1 for f in findings if f["severity"] == "medium")
        domain = url.replace("https://", "").rstrip("/")
        print(f"  {domain}: {high} high, {medium} medium, "
              f"{len(findings)} total")

    return all_findings

sites = [
    "https://example.com",
    "https://yoursite.com",
    "https://clientsite.com",
]
batch_audit(sites)
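For a larger portfolio, it may be more useful to rank sites by severity than to read the summary line by line. Here is a small sketch using `collections.Counter`, assuming the `{url: findings}` dictionary returned by `batch_audit`. The helper names and the sample data are hypothetical.

```python
from collections import Counter

def summarize(all_findings):
    """Return {url: Counter of severities} for a batch result."""
    return {
        url: Counter(f["severity"] for f in findings)
        for url, findings in all_findings.items()
    }

def worst_sites(all_findings, severity="high"):
    """Sort URLs by how many findings of the given severity they have, worst first."""
    summary = summarize(all_findings)
    return sorted(summary, key=lambda url: summary[url][severity], reverse=True)

# Hypothetical batch result in the shape batch_audit returns
batch_result = {
    "https://a.example": [{"severity": "medium"}],
    "https://b.example": [
        {"severity": "high"},
        {"severity": "high"},
        {"severity": "low"},
    ],
}

summary = summarize(batch_result)
ranked = worst_sites(batch_result)
for url in ranked:
    print(url, dict(summary[url]))
```

A `Counter` returns 0 for severities that never occurred, so sites with no high findings sort to the end without special-casing.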

Step 5: Export as JSON Report

For documentation or integration with other security tools:

import json
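from datetime import datetime, timezone

def export_report(url, findings, filename=None):
    report = {
        "target": url,
        # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12
        "scan_date": datetime.now(timezone.utc).isoformat(),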
        "total_findings": len(findings),
        "severity_counts": {
            "high": sum(1 for f in findings if f["severity"] == "high"),
            "medium": sum(1 for f in findings if f["severity"] == "medium"),
            "low": sum(1 for f in findings if f["severity"] == "low"),
            "info": sum(1 for f in findings if f["severity"] == "info"),
        },
        "findings": findings,
    }

    if filename is None:
        domain = url.replace("https://", "").replace("/", "_").rstrip("_")
        filename = f"audit_{domain}_{datetime.now().strftime('%Y%m%d')}.json"

    with open(filename, "w") as f:
        json.dump(report, f, indent=2)

    print(f"\n  Report exported to {filename}")
    return filename

export_report("https://example.com", findings)
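The default filename above strips only an `https://` prefix, so an `http://` URL or one with a port or query string would produce an awkward or invalid filename. A possible alternative, sketched with the standard library's `urllib.parse`, is shown below. The function name `report_filename` is hypothetical.

```python
import re
from datetime import date
from urllib.parse import urlparse

def report_filename(url, day=None):
    """Build a filesystem-safe report name from a URL and a date."""
    # Accept bare domains by assuming https when no scheme is given.
    parsed = urlparse(url if "://" in url else f"https://{url}")
    target = f"{parsed.netloc}{parsed.path}".strip("/")
    # Replace anything other than letters, digits, dots, and hyphens.
    slug = re.sub(r"[^A-Za-z0-9.-]+", "_", target).strip("_") or "unknown"
    day = day or date.today()
    return f"audit_{slug}_{day.strftime('%Y%m%d')}.json"

print(report_filename("http://example.com/blog/", date(2024, 1, 15)))
print(report_filename("example.com:8080", date(2024, 1, 15)))
```

The result could be passed to `export_report` through its `filename` argument.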

Extending the Audit

This foundation can be extended in several practical ways:

Version database — maintain a mapping of technologies to their latest versions and flag anything outdated automatically
CVE lookup — cross-reference detected technology versions against the NIST NVD API for known vulnerabilities
Scheduled scans — run audits on a cron schedule and alert when a site's technology stack changes unexpectedly
CI/CD integration — add a post-deploy step that scans your own site and fails the pipeline if high-severity findings appear
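Two of these ideas, detecting stack changes between scheduled scans and failing a pipeline on high-severity findings, can be sketched with a few pure functions. This is an outline under assumptions rather than a finished integration: `snapshot`, `diff_stack`, and `exit_code_for` are hypothetical helpers, and how snapshots are stored between runs is left open.

```python
def snapshot(technologies):
    """Reduce a technology list to a {name: version} mapping for comparison."""
    return {
        (t.get("name") or t.get("technology") or "Unknown"): (t.get("version") or "")
        for t in technologies
    }

def diff_stack(previous, current):
    """Compare two snapshots and report added, removed, and changed technologies."""
    added = {n: v for n, v in current.items() if n not in previous}
    removed = {n: v for n, v in previous.items() if n not in current}
    changed = {
        n: (previous[n], current[n])
        for n in current
        if n in previous and previous[n] != current[n]
    }
    return {"added": added, "removed": removed, "changed": changed}

def exit_code_for(findings, fail_on=("high",)):
    """Return 1 if any finding has a severity listed in fail_on, else 0."""
    return 1 if any(f["severity"] in fail_on for f in findings) else 0

# Hypothetical snapshots from two scheduled scans
last_week = snapshot([{"name": "jQuery", "version": "3.6.0"}, {"name": "Nginx"}])
today = snapshot([{"name": "jQuery", "version": "1.12.4"}, {"name": "Hotjar"}])

changes = diff_stack(last_week, today)
print(changes)
print(exit_code_for([{"severity": "high"}]))
```

In a post-deploy step, calling `sys.exit(exit_code_for(findings))` would make the job fail whenever a high-severity finding appears.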

Wrapping Up

Technology detection is a practical starting point for security reconnaissance. By combining the Technology Detection API with a simple rules engine, you get an automated audit tool that flags real concerns in seconds.

This doesn't replace a full penetration test, but it's a fast way to surface the low-hanging fruit — outdated libraries, exposed server versions, and forgotten third-party scripts.

Subscribe on RapidAPI and start scanning.

What security checks would you add to the rules engine? Share your ideas in the comments.