Dex

Build a Tech Stack Lead Enrichment Pipeline in Under 50 Lines of Python

Knowing whether a prospect runs Shopify or Magento changes your pitch. Here's how to enrich a whole lead list with tech stack data in minutes.


Why tech stack data matters for sales

Before you send that outreach email, you should know:

  • Is this company a Shopify store or a custom-built platform? (Different pitch entirely)
  • Are they already using HubSpot, or do they need CRM tooling too?
  • Running WordPress or Webflow? That changes what integrations you can offer
  • Is their site on AWS, Vercel, or Heroku? Signals engineering maturity

This data used to require a Wappalyzer enterprise license ($250+/mo) or manual inspection of each lead. Now you can enrich a whole lead list through a batch API, up to 50 URLs per call.


The setup: TechStackDetect API

TechStackDetect is a REST API that analyzes any public website and returns the full technology stack:

  • 5,000+ technology signatures across 48 categories
  • Batch mode — analyze up to 50 URLs in one request
  • Confidence scores — know when a detection is definitive vs likely
  • Free tier — 100 requests/day, no credit card required

Step 1: Single URL test

import requests

API_KEY = "YOUR_RAPIDAPI_KEY"
HEADERS = {
    "X-RapidAPI-Key": API_KEY,
    "X-RapidAPI-Host": "techstackdetect1.p.rapidapi.com",
}

def detect(url: str) -> list[dict]:
    r = requests.post(
        "https://techstackdetect1.p.rapidapi.com/detect",
        json={"url": url},
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()
    return r.json().get("technologies", [])

techs = detect("https://woocommerce.com")
for t in techs[:5]:
    print(f"{t['name']} ({t['category']}) {t['confidence']:.0%}")

Output:

WordPress (cms) 95%
WooCommerce (ecommerce) 95%
Nginx (server) 95%
jQuery (javascript) 95%
Yoast SEO Premium (seo) 75%
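
The loop above assumes every entry in `technologies` carries `name`, `category`, and a `confidence` between 0 and 1. That shape is inferred from the printed output, not from API documentation, so a slightly more defensive formatter may be worth having. This is a minimal sketch; `format_tech` is a hypothetical helper, and the fallback labels are placeholders.

```python
def format_tech(tech: dict) -> str:
    """Format one detected technology as 'Name (category) NN%'.

    Assumes the response shape inferred from the sample output:
    {"name": str, "category": str, "confidence": float in [0, 1]}.
    Missing fields fall back to placeholder labels instead of raising.
    """
    name = tech.get("name", "unknown")
    category = tech.get("category", "uncategorized")
    confidence = tech.get("confidence")
    if confidence is None:
        # No score available: print the entry without a percentage.
        return f"{name} ({category})"
    return f"{name} ({category}) {confidence:.0%}"


sample = {"name": "WordPress", "category": "cms", "confidence": 0.95}
print(format_tech(sample))
```

Swapping this into the Step 1 loop keeps the same output for complete entries and avoids a `KeyError` on partial ones.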

Step 2: Batch enrichment — 50 URLs at once

def detect_batch(urls: list[str]) -> list[dict]:
    r = requests.post(
        "https://techstackdetect1.p.rapidapi.com/batch",
        json={"urls": urls},
        headers=HEADERS,
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["results"]

lead_urls = [
    "https://woocommerce.com",
    "https://shopify.com",
    "https://bigcommerce.com",
]

results = detect_batch(lead_urls)
for r in results:
    # Use .get so an entry without a "technologies" list does not raise KeyError.
    ecom = [t["name"] for t in r.get("technologies", []) if t["category"] == "ecommerce"]
    print(f"{r['url']}: {ecom[0] if ecom else 'Unknown'}")

Step 3: Full CSV enrichment pipeline

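import csv
import time

def enrich_leads(input_csv: str, output_csv: str):
    # Read all leads up front; each row needs a "website" column.
    with open(input_csv, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        # An empty file would otherwise fail at rows[0] below.
        print("No leads found; nothing to enrich.")
        return

    urls = [row["website"] for row in rows]
    results_map = {}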

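    # Send URLs in chunks of 50, the batch limit per request.
    for i in range(0, len(urls), 50):
        chunk = urls[i:i + 50]
        results = detect_batch(chunk)
        for r in results:
            # Key by the URL the API echoes back so rows can be matched later.
            results_map[r["url"]] = r.get("technologies", [])
        # Pause briefly between batches, except after the last one.
        if i + 50 < len(urls):
            time.sleep(1)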

    fieldnames = list(rows[0].keys()) + [
        "detected_cms", "detected_hosting", "detected_analytics",
        "detected_ecommerce", "tech_count"
    ]

    with open(output_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            techs = results_map.get(row["website"], [])
            # Keep only the first technology reported for each category.
            by_cat = {}
            for t in techs:
                by_cat.setdefault(t["category"], t["name"])
            row.update({
                "detected_cms": by_cat.get("cms", ""),
                "detected_hosting": by_cat.get("hosting", ""),
                "detected_analytics": by_cat.get("analytics", ""),
                "detected_ecommerce": by_cat.get("ecommerce", ""),
                "tech_count": len(techs),
            })
            writer.writerow(row)

    print(f"Enriched {len(rows)} leads → {output_csv}")

enrich_leads("leads.csv", "leads_enriched.csv")

Before:

name,company,website
Acme Corp,acme.com,https://acme.com

After:

name,company,website,detected_cms,detected_hosting,detected_analytics,detected_ecommerce,tech_count
Acme Corp,acme.com,https://acme.com,WordPress,WP Engine,Google Analytics,,8
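
One caveat with the pipeline above: it matches results to rows by exact URL string, so it relies on the API echoing each URL back unchanged. Whether the API normalizes URLs (adding a trailing slash, dropping `www.`, lowercasing the host) isn't stated here. If it does, the lookup silently misses and the row gets empty columns. A defensive sketch, using a hypothetical `normalize_url` helper that is not part of the API:

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparable key.

    Drops the scheme, lowercases the host, strips a leading 'www.',
    and removes a trailing slash from the path. Query strings and
    fragments are ignored, which is an assumption that suits
    homepage-style lead URLs.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return f"{host}{path}"


print(normalize_url("https://WWW.Acme.com/"))
```

To use it, key the map with `normalize_url(r["url"])` when storing results and look rows up with `normalize_url(row["website"])`.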

Step 4: Filter by confidence

def get_high_confidence(url: str, min_confidence: float = 0.90) -> list[str]:
    return [
        t["name"] for t in detect(url)
        if t["confidence"] >= min_confidence
    ]

print(get_high_confidence("https://stripe.com"))
# ['AWS', 'Nginx', 'HSTS', 'Next.js']

Score thresholds: 0.95 and above is definitive, 0.90 is strong, and 0.75 is probable. For lead scoring, use a cutoff of 0.85 or higher (the function above defaults to 0.90).
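
The thresholds can be turned into labels for a CRM field. This sketch treats each listed score as the lower bound of a band, which is an interpretation rather than something the API defines. The "weak" label for anything under 0.75 and the `confidence_tier` and `passes_lead_scoring` names are hypothetical.

```python
def confidence_tier(score: float) -> str:
    """Map a confidence score to a label.

    Assumption: 0.95, 0.90, and 0.75 are lower bounds of their bands.
    """
    if score >= 0.95:
        return "definitive"
    if score >= 0.90:
        return "strong"
    if score >= 0.75:
        return "probable"
    return "weak"  # hypothetical label; not named in the thresholds above


def passes_lead_scoring(score: float, cutoff: float = 0.85) -> bool:
    """True when a detection meets the suggested lead-scoring cutoff."""
    return score >= cutoff


for s in (0.95, 0.90, 0.75, 0.50):
    print(s, confidence_tier(s), passes_lead_scoring(s))
```

Note that a "probable" detection at 0.75 falls below the 0.85 cutoff, so it would be labeled but not counted toward a lead score.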


Step 5: Segment by platform for personalized outreach

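def segment_leads(results):
    segments = {"shopify": [], "woocommerce": [], "magento": [], "custom": []}
    for r in results:
        # First ecommerce platform detected for this lead, lowercased, or None.
        ecom = next(
            (t["name"].lower() for t in r.get("technologies", []) if t["category"] == "ecommerce"),
            None
        )
        # A platform without its own segment (or no platform at all) counts as custom.
        key = ecom if ecom in segments else "custom"
        segments[key].append(r["url"])
    return segments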

Now your Shopify leads get the Shopify app pitch and WooCommerce leads get the WordPress plugin angle. Leads with no detected platform, or with one that has no segment of its own, land in custom and get the enterprise message.
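
As a rough sketch of the last step, the segments can be mapped to pitch labels. The `PITCHES` table and `assign_pitches` function are hypothetical. No Magento pitch is named above, so this sketch falls back to the custom message for any segment without an entry; that fallback is an assumption.

```python
# Pitch labels taken from the text above; real copy would replace these strings.
PITCHES = {
    "shopify": "Shopify app pitch",
    "woocommerce": "WordPress plugin angle",
    "custom": "enterprise message",
}


def assign_pitches(segments: dict[str, list[str]]) -> dict[str, str]:
    """Map each lead URL to the pitch label for its segment."""
    assignments = {}
    for segment, urls in segments.items():
        # Assumption: segments without their own pitch use the custom one.
        pitch = PITCHES.get(segment, PITCHES["custom"])
        for url in urls:
            assignments[url] = pitch
    return assignments


example = {
    "shopify": ["https://a.example"],
    "woocommerce": [],
    "magento": ["https://b.example"],
    "custom": ["https://c.example"],
}
print(assign_pitches(example))
```

The input has the same shape as the dictionary returned by `segment_leads`, so the two can be chained directly.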


Pricing

Plan    Price     Request limit
Free    $0/mo     100 req/day
Pro     $19/mo    5,000 req/day
Ultra   $79/mo    Unlimited

If each URL in a batch counts as one request, the free tier's 100 requests/day covers 2 batches of 50 leads. A list of 500 leads is 10 batches, which Pro ($19/mo) handles easily.
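
The plan arithmetic can be checked with a few lines. This sketch assumes every URL consumes one request from the daily quota, which is how the figures above read but is not confirmed here. `batches_needed` and `fits_daily_quota` are hypothetical helpers.

```python
import math

BATCH_SIZE = 50  # maximum URLs per batch request, per the API description


def batches_needed(lead_count: int, batch_size: int = BATCH_SIZE) -> int:
    """Number of batch calls required to cover lead_count URLs."""
    return math.ceil(lead_count / batch_size)


def fits_daily_quota(lead_count: int, daily_requests: int) -> bool:
    """Assumption: each URL counts as one request against the quota."""
    return lead_count <= daily_requests


print(batches_needed(500))          # batches for a 500-lead list
print(fits_daily_quota(500, 100))   # free tier
print(fits_daily_quota(500, 5000))  # Pro tier
```

If batches turn out to be billed as a single request each, the quota check should compare `batches_needed(lead_count)` against the daily limit instead.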

→ Get your free API key on RapidAPI


Summary

With 50 lines of Python and a free API key:

  1. Enrich any lead list with CMS, hosting, analytics, ecommerce data
  2. Process 50 leads per batch (parallel, fast)
  3. Filter by confidence for precision
  4. Segment for personalized outreach

Start free →
