MikeL

Originally published at detectzestack.com

Build a Website Tech Stack Scanner in Python (Under 50 Lines)

Ever wonder what tech stack a website is running? Maybe you're scoping out a competitor, enriching leads, or checking for outdated frameworks with known vulnerabilities.

Here's a Python script that does it in under 50 lines. It calls a tech detection API, parses the response, and gives you a clean report.

The Full Script

import requests
import sys

API_URL = "https://detectzestack.p.rapidapi.com/analyze"
HEADERS = {
    "X-RapidAPI-Key": "YOUR_API_KEY",
    "X-RapidAPI-Host": "detectzestack.p.rapidapi.com"
}

def scan(url):
    # requests has no default timeout, so set one to avoid hanging on a slow scan
    resp = requests.get(API_URL, headers=HEADERS, params={"url": url}, timeout=30)
    resp.raise_for_status()
    return resp.json()

def report(data):
    print(f"\n{'=' * 50}")
    print(f"  {data['domain']}  (HTTP {data['status_code']})")
    print(f"  Scanned in {data['detection_time_ms']}ms")
    print(f"{'=' * 50}")

    # Group by category
    by_cat = {}
    for tech in data["technologies"]:
        cat = tech["category"]
        by_cat.setdefault(cat, []).append(tech)

    for cat, techs in sorted(by_cat.items()):
        print(f"\n  {cat}:")
        for t in techs:
            ver = f" v{t['version']}" if t.get("version") else ""
            src = f"[{t['source']}]"
            cpe = f"\n      CPE: {t['cpe']}" if t.get("cpe") else ""
            print(f"    - {t['name']}{ver} {src}{cpe}")

    print(f"\n  Total: {len(data['technologies'])} technologies detected")

if __name__ == "__main__":
    urls = sys.argv[1:] or ["stripe.com"]
    for url in urls:
        report(scan(url))

Save this as scan.py and run it:

python scan.py stripe.com github.com shopify.com
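
One thing to watch when you pass several URLs: `raise_for_status()` raises on the first failed request, which stops the whole run. Here's a minimal sketch of one way around that. The `scan_all` helper is my own illustration, not part of the API or the script above, and it takes the scan function as a parameter so it stays independent of `requests`.

```python
def scan_all(urls, scan_fn):
    """Run scan_fn on each URL, keeping successes and failures apart."""
    results, errors = {}, {}
    for url in urls:
        try:
            results[url] = scan_fn(url)
        except Exception as exc:
            # In scan.py this would usually be a requests.RequestException.
            errors[url] = str(exc)
    return results, errors
```

In scan.py you could call `scan_all(urls, scan)`, pass each successful result to `report`, and print the errors at the end.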

Sample Output

Here's what you get for stripe.com:

==================================================
  stripe.com  (HTTP 200)
  Scanned in 847ms
==================================================

  Analytics:
    - Google Analytics [wappalyzer]

  CDN:
    - Cloudflare [dns]

  JavaScript frameworks:
    - React [wappalyzer]

  SSL/TLS certificate authorities:
    - Let's Encrypt [tls]

  Web servers:
    - Nginx v1.25.3 [wappalyzer]
      CPE: cpe:2.3:a:f5:nginx:1.25.3:*:*:*:*:*:*:*

  Total: 5 technologies detected

What's Happening Under the Hood

The API uses four detection layers:

1. HTTP Header Fingerprinting
Headers like Server: nginx/1.25.3 and X-Powered-By: Express reveal the web server and framework. Fast, but many production sites strip these.
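
To make this layer concrete, here's a rough sketch of header fingerprinting. It is not the API's actual implementation: the `fingerprint_headers` helper and its two-header rule set are hypothetical, and a real engine checks far more headers.

```python
def fingerprint_headers(headers):
    """Return (header, product, version) tuples found in response headers."""
    found = []
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("server", "x-powered-by"):
        parts = (lowered.get(name) or "").split()
        if not parts:
            continue  # header missing or stripped, as on many production sites
        # The first token looks like "nginx/1.25.3" or just "Express".
        product, _, version = parts[0].partition("/")
        found.append((name, product, version or None))
    return found


print(fingerprint_headers({"Server": "nginx/1.25.3", "X-Powered-By": "Express"}))
# [('server', 'nginx', '1.25.3'), ('x-powered-by', 'Express', None)]
```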

2. HTML/DOM Pattern Matching
The API scans page source for known patterns: <meta name="generator" content="WordPress">, script URLs containing react.production.min.js, CSS class patterns like Tailwind's utilities. This is the core engine — over 7,200 technology signatures via wappalyzergo.
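
Here's a toy version of that idea, using two regexes based on the patterns mentioned above. The `SIGNATURES` table and `match_html` function are illustrative names of mine; the real signature set is far larger and its rule format is different.

```python
import re

# Two illustrative signatures only; a real engine ships thousands.
SIGNATURES = {
    "WordPress": re.compile(
        r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']WordPress', re.I
    ),
    "React": re.compile(
        r'<script[^>]+src=["\'][^"\']*react(?:\.production)?(?:\.min)?\.js', re.I
    ),
}


def match_html(html):
    """Return the sorted names of technologies whose pattern appears in the HTML."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern.search(html))


sample = """
<meta name="generator" content="WordPress 6.4">
<script src="/static/react.production.min.js"></script>
"""
print(match_html(sample))  # ['React', 'WordPress']
```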

3. DNS CNAME Analysis
A DNS lookup reveals infrastructure that HTTP can't see. For example, a record such as example.com CNAME d1234.cloudfront.net points to Amazon CloudFront. The API checks against 111 CDN/hosting provider signatures.
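
The matching step can be sketched as a suffix lookup. The three-entry table below is a hypothetical stand-in for the API's provider list, and the DNS query itself is left out; a resolver library such as dnspython could supply the CNAME target.

```python
# Hypothetical suffix table for illustration; not the API's real provider list.
CNAME_SUFFIXES = {
    ".cloudfront.net": "Amazon CloudFront",
    ".fastly.net": "Fastly",
    ".github.io": "GitHub Pages",
}


def provider_from_cname(cname):
    """Map a CNAME target to a provider name, or None if no suffix matches."""
    target = cname.rstrip(".").lower()  # DNS answers may carry a trailing dot
    for suffix, provider in CNAME_SUFFIXES.items():
        if target.endswith(suffix):
            return provider
    return None


print(provider_from_cname("d1234.cloudfront.net."))  # Amazon CloudFront
```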

4. TLS Certificate Inspection
The issuing certificate authority often hints at the infrastructure: Let's Encrypt suggests a self-hosted server, Cloudflare Inc a Cloudflare proxy, and Amazon AWS. Treat these as correlations rather than proof.
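
If you want to try this layer yourself, Python's standard `ssl` module can fetch the certificate. The sketch below assumes the three hints listed above; `ISSUER_HINTS`, `issuer_org`, `hint_for`, and `fetch_cert` are names I made up for the example, not part of the API.

```python
import socket
import ssl

# Heuristic hints from the mapping above; correlations, not proof.
ISSUER_HINTS = {
    "Let's Encrypt": "self-hosted",
    "Cloudflare": "Cloudflare proxy",
    "Amazon": "AWS",
}


def issuer_org(cert):
    """Extract organizationName from the dict returned by SSLSocket.getpeercert()."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def hint_for(org):
    """Return the hint whose key is a prefix of the issuer organization, if any."""
    for prefix, hint in ISSUER_HINTS.items():
        if org and org.startswith(prefix):
            return hint
    return None


def fetch_cert(host, port=443, timeout=5):
    """Open a TLS connection and return the peer certificate as a dict (network call)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Usage would look like `hint_for(issuer_org(fetch_cert("example.com")))`.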

Each technology in the response includes a source field (wappalyzer, dns, tls, or headers) so you know exactly how it was detected.
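
A quick way to see which layers did the work on a given site is to count detections per source. This is a small sketch against the response shape used in scan.py; `count_by_source` is a hypothetical helper, and the sample list mirrors the stripe.com output shown earlier.

```python
from collections import Counter


def count_by_source(technologies):
    """Count detections per source field, labelling missing ones 'unknown'."""
    return Counter(t.get("source", "unknown") for t in technologies)


sample = [
    {"name": "Cloudflare", "source": "dns"},
    {"name": "React", "source": "wappalyzer"},
    {"name": "Nginx", "source": "wappalyzer"},
    {"name": "Google Analytics", "source": "wappalyzer"},
    {"name": "Let's Encrypt", "source": "tls"},
]
print(count_by_source(sample).most_common(1))  # [('wappalyzer', 3)]
```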

Extending It: Batch Analysis

Need to scan multiple URLs efficiently? The API has a batch endpoint:

import requests

urls = ["stripe.com", "shopify.com", "github.com",
        "notion.so", "vercel.com"]

resp = requests.post(
    "https://detectzestack.p.rapidapi.com/batch",
    headers={
        "X-RapidAPI-Key": "YOUR_API_KEY",
        "X-RapidAPI-Host": "detectzestack.p.rapidapi.com",
        "Content-Type": "application/json"
    },
    json={"urls": urls},
    timeout=60,  # requests has no default timeout
)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body

for result in resp.json()["results"]:
    techs = [t["name"] for t in result["technologies"]]
    print(f"{result['domain']}: {', '.join(techs)}")

This sends one HTTP request instead of five.
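
If the batch endpoint caps how many URLs one request may carry, a small helper can split a longer list into batches. I'm not assuming any particular limit here, so check the API docs. The `chunked` function and the size of 2 in the demo are purely illustrative.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


urls = ["stripe.com", "shopify.com", "github.com", "notion.so", "vercel.com"]
for batch in chunked(urls, 2):
    print(batch)  # each batch would become one POST to the batch endpoint
```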

Extending It: Security Scanning with CPE

The cpe field in the response is a CPE (Common Platform Enumeration) name, the identifier the NVD (National Vulnerability Database) uses to index affected products. You can cross-reference it to check for known CVEs:

# `data` is the dict returned by scan() in scan.py
for tech in data["technologies"]:
    if tech.get("cpe"):
        print(f"  {tech['name']}: {tech['cpe']}")
        # Query NVD: https://services.nvd.nist.gov/rest/json/cves/2.0?cpeName={cpe}

This is useful for security audits and compliance checks.
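
Here's a hedged sketch of that NVD lookup using only the standard library. It assumes the NVD CVE API 2.0 response keeps its CVE records under a `vulnerabilities` list, with each record's ID at `cve.id`. The helper names are mine, and the network call in `lookup` is subject to NVD rate limits for clients without an API key.

```python
import json
import urllib.parse
import urllib.request

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def nvd_query_url(cpe):
    """Build the NVD query URL for a CPE 2.3 name, percent-encoding ':' and '*'."""
    return f"{NVD_URL}?{urllib.parse.urlencode({'cpeName': cpe})}"


def cve_ids(payload):
    """Pull CVE identifiers out of a parsed NVD API 2.0 response."""
    return [item["cve"]["id"] for item in payload.get("vulnerabilities", [])]


def lookup(cpe, timeout=30):
    """Fetch CVE IDs for one CPE name (makes a network call)."""
    with urllib.request.urlopen(nvd_query_url(cpe), timeout=timeout) as resp:
        return cve_ids(json.load(resp))
```

Keeping `nvd_query_url` and `cve_ids` separate from the network call makes them easy to test offline.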

Getting an API Key

The API is on RapidAPI. The free tier gives you 100 requests/month — no credit card required. Enough to build and test with.

Paid plans start at $9/month for 1,000 requests if you need more volume.
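
Once you have a key, it's safer to keep it out of the source file. A minimal sketch, assuming you export it as an environment variable; the name `RAPIDAPI_KEY` and the `build_headers` helper are my own choices, not something RapidAPI requires.

```python
import os


def build_headers(env=os.environ):
    """Build the RapidAPI headers, reading the key from RAPIDAPI_KEY (assumed name)."""
    key = env.get("RAPIDAPI_KEY")
    if not key:
        raise RuntimeError("Set RAPIDAPI_KEY before running the scanner")
    return {
        "X-RapidAPI-Key": key,
        "X-RapidAPI-Host": "detectzestack.p.rapidapi.com",
    }
```

In scan.py, `HEADERS = build_headers()` would replace the hard-coded dictionary.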


What are you building with tech detection? Competitive analysis dashboards? Lead enrichment pipelines? Security scanners? Drop a comment — I'd love to hear your use case.
