If you sell software, the technology your prospects use matters more than almost any other qualifying signal. A company running Shopify is a categorically different buyer than one running a custom Rails app. A team using HubSpot is further down the sales maturity curve than one using a spreadsheet.
This is called technographic data — knowing what technology stack a prospect uses — and it's how companies like Clearbit and ZoomInfo justify their price tags. The insight is simple: technology choices predict buying behavior, budget range, and fit for your product better than company size or industry alone.
This post shows you how to build a basic version of this yourself, using the Technology Detection API and about 100 lines of Python.
## The Use Case: Qualifying Leads by Tech Stack
Let's make this concrete. Say you've built a Shopify plugin — maybe it handles subscription billing, or adds a custom reviews widget, or integrates with a specific 3PL. Your ideal customer is any store running Shopify.
The problem is that most lead lists don't come pre-tagged with "runs Shopify." You get a CSV of domains from a scrape, a trade show contact list, a LinkedIn export — and you have to figure out which ones are actually Shopify stores.
Manually checking 500 URLs is an afternoon of tedious work. Writing your own detector means maintaining fingerprint patterns as Shopify updates its platform. An API call per URL solves the problem cleanly.
Here's the basic pattern:
```python
from techdetect import TechDetectClient

client = TechDetectClient(api_key="your_rapidapi_key")

def is_shopify(url: str) -> bool:
    result = client.detect(url)
    return any(
        t.name == "Shopify" and t.confidence >= 80
        for t in result.technologies
    )

print(is_shopify("https://allbirds.com"))   # True
print(is_shopify("https://techcrunch.com")) # False
```
That's it. Now let's scale it up.
## Bulk Scanning a Lead List
Assume you have a CSV of prospect domains — call it leads.csv — with a domain column. Here's a script that scans all of them, tags each with their detected tech stack, and writes results to a new CSV.
```python
import csv
import time

from techdetect import TechDetectClient

client = TechDetectClient(api_key="your_rapidapi_key")

TARGET_TECH = "Shopify"
CONFIDENCE_THRESHOLD = 80

def detect_with_retry(url: str, retries: int = 2):
    for attempt in range(retries + 1):
        try:
            return client.detect(url)
        except Exception:
            if attempt < retries:
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, ...
            else:
                raise

input_file = "leads.csv"
output_file = "leads_tagged.csv"

with open(input_file, newline="") as infile, open(output_file, "w", newline="") as outfile:
    reader = csv.DictReader(infile)
    fieldnames = reader.fieldnames + ["is_target", "detected_technologies", "confidence"]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()

    for row in reader:
        domain = row["domain"].strip()
        url = domain if domain.startswith("http") else f"https://{domain}"
        try:
            result = detect_with_retry(url)
            target_tech = next(
                (t for t in result.technologies if t.name == TARGET_TECH),
                None,
            )
            row["is_target"] = "yes" if (target_tech and target_tech.confidence >= CONFIDENCE_THRESHOLD) else "no"
            row["detected_technologies"] = ", ".join(t.name for t in result.technologies)
            row["confidence"] = target_tech.confidence if target_tech else 0
        except Exception as e:
            row["is_target"] = "error"
            row["detected_technologies"] = str(e)
            row["confidence"] = 0
        writer.writerow(row)
        print(f"{domain}: {row['is_target']}")
```
Run this against a 500-row lead list overnight, and you wake up to a pre-qualified CSV you can import directly into your CRM.
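If your plan enforces a per-minute rate limit, it's worth throttling the loop so an overnight batch doesn't trip rate-limit errors. A minimal sketch using only the standard library (the one-second interval is an assumption; check your plan's actual limit):

```python
import time

def throttle(min_interval: float):
    """Decorator that enforces a minimum delay between successive calls,
    so a long batch run stays under a per-minute API rate limit."""
    def wrap(fn):
        last_call = [0.0]  # mutable cell holding the previous call's timestamp
        def inner(*args, **kwargs):
            wait = min_interval - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
            last_call[0] = time.monotonic()
            return fn(*args, **kwargs)
        return inner
    return wrap

# e.g. wrap the detector so calls are at least 1 second apart:
# detect_with_retry = throttle(1.0)(detect_with_retry)
```

Combined with the exponential backoff already in `detect_with_retry`, this keeps the scan polite even when a few domains are slow or flaky.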
## Building a Lead Scoring System
Binary filtering (Shopify / not Shopify) is just the start. The more interesting approach is lead scoring — ranking leads by how well their tech stack matches your ideal customer profile.
The intuition: a Shopify store using Stripe, Klaviyo, and Google Analytics is more sophisticated — and likely higher-revenue — than one using only basic Shopify with no third-party tooling.
Here's a simple scoring model:
```python
from techdetect import TechDetectClient

client = TechDetectClient(api_key="your_rapidapi_key")

# Scoring rules: (technology_name, points, reason)
SCORE_RULES = [
    ("Shopify", 30, "core platform match"),
    ("Shopify Plus", 25, "high-revenue indicator"),
    ("Stripe", 10, "payment sophistication"),
    ("Klaviyo", 10, "email marketing investment"),
    ("Google Analytics", 5, "tracking maturity"),
    ("Hotjar", 8, "CRO investment"),
    ("Yotpo", 8, "reviews platform — growing brand"),
    ("Recharge", 15, "subscription billing — recurring revenue"),
    ("Gorgias", 10, "customer support investment"),
    ("Loop Returns", 10, "returns management — scale indicator"),
]

def score_lead(url: str) -> dict:
    result = client.detect(url)
    tech_names = {t.name for t in result.technologies}

    score = 0
    matched_signals = []
    for tech_name, points, reason in SCORE_RULES:
        if tech_name in tech_names:
            score += points
            matched_signals.append(f"+{points} {tech_name} ({reason})")

    tier = (
        "hot" if score >= 60 else
        "warm" if score >= 35 else
        "cold" if score >= 15 else
        "disqualified"
    )

    return {
        "url": url,
        "score": score,
        "tier": tier,
        "signals": matched_signals,
        "all_technologies": [t.name for t in result.technologies],
    }

# Example
result = score_lead("https://somestore.com")
print(f"Score: {result['score']} — {result['tier'].upper()}")
for signal in result["signals"]:
    print(f"  {signal}")
A store scoring 60+ (Shopify Plus + Recharge + Klaviyo + Gorgias) is a high-intent, high-budget lead. One scoring 15 (bare Shopify, nothing else) might be too early-stage to convert.
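Once every lead is scored, the natural next step is ranking the whole list so reps work the hot tier first. A sketch, assuming the result dicts produced by `score_lead` above:

```python
TIER_ORDER = {"hot": 0, "warm": 1, "cold": 2, "disqualified": 3}

def rank_leads(scored_leads: list[dict]) -> list[dict]:
    """Sort scored leads: hot tier first, then by descending score."""
    return sorted(
        scored_leads,
        key=lambda lead: (TIER_ORDER[lead["tier"]], -lead["score"]),
    )

leads = [
    {"url": "a.com", "score": 20, "tier": "cold"},
    {"url": "b.com", "score": 75, "tier": "hot"},
    {"url": "c.com", "score": 40, "tier": "warm"},
]
for lead in rank_leads(leads):
    print(lead["url"], lead["tier"], lead["score"])
# b.com hot 75
# c.com warm 40
# a.com cold 20
```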
## The Cost Math
This kind of enrichment used to require either a Clearbit subscription ($X,XXX/month) or a ZoomInfo seat (similar ballpark). For a developer at a small SaaS company, those price points are hard to justify before you've validated the channel.
Here's what this approach costs with the Technology Detection API:
| Volume | Plan | Cost |
|---|---|---|
| Up to 100 URLs/month | Free tier | $0 |
| Up to 2,000 URLs/month | Pro | $9/month |
| Up to 10,000 URLs/month | Ultra | ~$29/month |
Scanning 2,000 leads per month for $9 is a reasonable budget for almost any growth experiment. If you close even one deal from a batch of enriched leads, the ROI is obvious.
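The break-even arithmetic is easy to sanity-check. A quick sketch (the 10% hit rate is an illustrative assumption; substitute your own numbers):

```python
def cost_per_qualified_lead(monthly_cost: float, urls_scanned: int, hit_rate: float) -> float:
    """Cost of each lead that actually matches your target stack."""
    return monthly_cost / (urls_scanned * hit_rate)

# Pro plan, 2,000 scans, assuming 10% of domains turn out to run Shopify:
print(round(cost_per_qualified_lead(9, 2000, 0.10), 3))  # 0.045
```

At roughly 4.5 cents per qualified lead, a single closed deal covers years of API cost.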
## Next Steps
The script above is a starting point. A few directions worth extending:
- Add CRM integration — POST scored leads directly to a HubSpot or Pipedrive custom property instead of writing a CSV
- Schedule it as a cron job — Run the scan weekly against new leads as they enter your pipeline
- Expand the scoring model — Add negative signals (e.g., "uses WooCommerce" scores negative if you only support Shopify)
- Filter by geography — Combine with a WHOIS or IP geolocation lookup to target specific markets
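The negative-signal idea above is a small change to the scoring loop. A sketch, assuming the same `(name, points, reason)` rule shape as `SCORE_RULES` earlier (the point values are illustrative):

```python
# Competing platforms subtract points; a Shopify-only product should
# never rank a WooCommerce store as a warm lead.
NEGATIVE_RULES = [
    ("WooCommerce", -50, "competing platform"),
    ("Magento", -50, "competing platform"),
]

def score_tech_stack(tech_names: set[str], rules: list[tuple]) -> int:
    """Sum the points for every rule whose technology is present,
    clamping at zero so negatives fully disqualify a lead."""
    score = sum(points for name, points, _ in rules if name in tech_names)
    return max(score, 0)

rules = [("Shopify", 30, "core platform match")] + NEGATIVE_RULES
print(score_tech_stack({"Shopify", "Stripe"}, rules))      # 30
print(score_tech_stack({"WooCommerce", "Stripe"}, rules))  # 0
```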
The Python client and full source: github.com/dapdevsoftware/techdetect-python
Get an API key (free, no credit card required): Technology Detection API on RapidAPI
If you build something with this — especially if you adapt the scoring model for a specific niche — I'd be curious to hear what signals turned out to be most predictive.