DEV Community

NexGenData
How B2B SaaS Teams Use Tech Stack Data to 10x Their Cold Outbound Reply Rate

Here is the uncomfortable arithmetic of cold outbound in 2026.

A senior SDR at a Series B SaaS company sends 80 emails per day. Industry-standard reply rate for unsegmented, generic outbound is 1.0–1.5%. That is roughly one reply per business day per SDR. Not one meeting — one reply. Many of those replies are "unsubscribe."

Replies that mention a tool by name — "we already use HubSpot" or "we just moved off Marketo" — convert to meetings at roughly 18%. Replies to generic emails convert at roughly 6%. The difference between a sequenced SDR who knows what their prospect runs and one who doesn't is not 2x. It is closer to 8–12x in booked meetings per email sent, once you compound reply rate against meeting-conversion-rate.

This post is the operational playbook for closing that gap. It assumes you run a cold outbound motion of 1,000–50,000 emails per month, that you have a CRM or sales engagement platform (HubSpot, Salesforce, Outreach, Apollo, Salesloft; pick your poison), and that you are tired of paying $295/month for BuiltWith Pro to learn that your prospect runs WordPress.

Why "what tech do they run" is the highest-leverage personalization signal

Personalization is a spectrum. At the bottom are "Hi {first_name}" and "I see you work at {company}." These are negative personalization: they tell the prospect that you bought a list and ran a mail merge. At the top is "I noticed you posted on r/dataengineering last week about Snowflake costs; here is exactly how we cut Snowflake spend by 40% for {similar_company}." That is high personalization, but it does not scale past ~15 emails per SDR per day.

Tech stack sits in the sweet spot. It is automatable at list-build time, it is observed directly on the prospect's site (not inferred from a noisy LinkedIn parse), and it maps directly to the buying-intent questions a B2B vendor cares about: do they have the problem we solve, do they own a competing tool, and what is the swap cost?

Three concrete moves the data unlocks:

  1. "We integrate with X" cold opens. If you sell a Stripe Tax product and you can prove the prospect runs Stripe (Stripe.js loads in the page, the Stripe pricing page is in their navigation, a Stripe-hosted checkout is referenced), the cold open writes itself. "Saw you're on Stripe — most Stripe customers we work with were paying TaxJar $4-6k/year before they consolidated onto our product. Want me to send the math?" That email gets replied to.
  2. "We replace Y" cold opens. If you sell a Segment competitor and you can detect Segment's analytics.js on the prospect's site, you have not only a target — you have the migration story already half-written. "We see Segment loaded on yourdomain.com. Three of our last five customers came from Segment after the renewal hit $50k. Open to a 15-minute migration cost compare?"
  3. Stack-based ICP filtering. Rather than firing a Stripe-integration pitch at every prospect, you only sequence prospects who actually run Stripe. List quality goes up by 5–20x, list size goes down accordingly, and reply rate climbs because every email is relevant.

This third move is where the compounding really happens. Most outbound teams over-target on firmographics (industry, employee count, geography) and under-target on technographics (what they run). Firmographics are necessary but coarse. Technographics are precise.

Where the data comes from

Wappalyzer-style fingerprinting detects what is loaded on a website by inspecting four signal classes:

  • HTTP response headers. Server: Apache, X-Powered-By: Express, X-Shopify-Stage: production, cf-ray: for Cloudflare.
  • HTML patterns. <meta name="generator" content="WordPress 6.4">, link tags pointing to /wp-content/, script tags loading analytics.tiktok.com/i18n/pixel/events.js.
  • JavaScript globals after page load. window.Shopify, window.dataLayer, window.__NEXT_DATA__, window.Intercom.
  • Resource URLs and DOM patterns. A <link rel="stylesheet" href="https://cdn.shopify.com/..."> is dispositive of Shopify; class="hs-cta-wrapper" is dispositive of HubSpot CTAs.

Doing this well requires a real headless browser (Puppeteer or Playwright), a maintained fingerprint ruleset (the tunetheweb/wappalyzer fork has 251 fingerprints as of Q2 2026), and infrastructure to handle anti-bot defenses. Doing it badly — curl + grep — gets you maybe 30% of the actual stack on a JS-heavy site.
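To make the signal classes concrete, here is a minimal sketch of the static half of fingerprinting: matching response headers and raw HTML against a ruleset. The `RULES` dict and the `detect_static` function are hypothetical names invented for illustration, and the four rules are a toy subset built from the examples in this section. JavaScript globals are deliberately left out because they need a headless browser, so a matcher like this only sees part of the stack.

```python
import re

# Hypothetical mini-ruleset for illustration only. A maintained ruleset has far
# more entries and also checks JavaScript globals inside a real browser.
RULES = {
    "WordPress": {"html": [r'<meta name="generator" content="WordPress', r"/wp-content/"]},
    "Shopify": {"headers": {"x-shopify-stage": r".*"}, "html": [r"cdn\.shopify\.com"]},
    "Cloudflare": {"headers": {"cf-ray": r".*"}},
    "Express": {"headers": {"x-powered-by": r"Express"}},
}

def detect_static(headers: dict[str, str], html: str) -> list[str]:
    """Return technology names whose header or HTML patterns match."""
    # Header names are case-insensitive, so compare on lowercase keys
    lowered = {k.lower(): v for k, v in headers.items()}
    found = []
    for name, rule in RULES.items():
        header_hit = any(
            key in lowered and re.search(pattern, lowered[key], re.IGNORECASE)
            for key, pattern in rule.get("headers", {}).items()
        )
        html_hit = any(re.search(p, html, re.IGNORECASE) for p in rule.get("html", []))
        if header_hit or html_hit:
            found.append(name)
    return found
```

Calling `detect_static` with the headers and body of a single HTTP response is roughly what the curl + grep approach amounts to, which is why it misses anything injected by JavaScript after load.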

The economically interesting alternatives:

| Source | Cost per 10k domains | Coverage | JS execution | Notes |
| --- | --- | --- | --- | --- |
| BuiltWith Pro | $295/month flat (limited lookups) | ~70k fingerprints | Limited | Best historical data, expensive at volume |
| Wappalyzer Enterprise | $250–5,000/month | ~2.7k fingerprints | Yes | Closed-source since 2023 |
| HG Insights | Enterprise contract ($50k+/yr) | Proprietary | Yes | Optimized for ABM, gated |
| Clearbit Enrichment | $99–$999/month + per-call | ~1k fingerprints | Limited | Bundled with firmographics |
| Self-hosted Wappalyzer fork | ~$200 of compute | 251 OSS fingerprints | DIY (your problem) | Free if you have a Chrome fleet |
| wappalyzer-replacement actor | $100 ($0.01 × 10k) | 251 OSS fingerprints + custom | Yes (Puppeteer) | Pay per detection, no contract |

For an SDR team running 5,000–20,000 prospects a month, the per-detection model wins on every axis except historical data — and the SDR use case rarely needs historical data. You want now, not what they ran in 2021.
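The per-detection versus flat-fee comparison comes down to a break-even volume. The sketch below works it out from the two prices quoted in this post ($0.01 per detection and $295/month flat). The function names are made up for illustration, and the calculation assumes the flat plan would actually cover the volume, which the "limited lookups" caveat may not allow.

```python
PER_DETECTION_CENTS = 1      # $0.01 per detection, as quoted in this post
FLAT_PLAN_CENTS = 29_500     # $295/month flat plan, as quoted in this post

def per_detection_cost_usd(domains: int, cents_each: int = PER_DETECTION_CENTS) -> float:
    """Monthly enrichment bill in dollars under pay-per-detection pricing."""
    return domains * cents_each / 100

def break_even_domains(flat_cents: int = FLAT_PLAN_CENTS,
                       cents_each: int = PER_DETECTION_CENTS) -> int:
    """Monthly volume at which pay-per-detection costs as much as the flat plan."""
    return flat_cents // cents_each

print(per_detection_cost_usd(5_000))    # 50.0
print(per_detection_cost_usd(20_000))   # 200.0
print(break_even_domains())             # 29500
```

Under these assumptions, both ends of the 5,000–20,000 range sit below the break-even point of 29,500 domains per month.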

The reference implementation we use throughout this post is the wappalyzer-replacement actor on Apify. It bundles the OSS Wappalyzer ruleset, runs Puppeteer with anti-bot defenses, and returns a JSON list of detected technologies per URL at $0.01 per detection.

The pipeline

Prospect list (CSV of domains)
        |
        v
Tech stack enrichment (per-domain JSON of detected technologies)
        |
        v
Stack-based segmentation (split into N sub-lists by stack)
        |
        v
Per-segment sequence (cold open mentions specific tool)
        |
        v
CRM (Outreach / Salesloft / Apollo / HubSpot)

The whole thing is 80–150 lines of Python. Here is the minimum viable version.

Step 1: Build the prospect list

Get a list of domains. The source does not matter — Apollo export, Sales Navigator scrape, ZoomInfo CSV, Crunchbase pull, hand-built list from a conference attendee directory. The schema you need is just company_name, domain to start.

import csv

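def load_prospects(path: str) -> list[dict]:
    # newline="" is what the csv module expects; utf-8-sig tolerates the BOM
    # that spreadsheet exports often prepend to the first header
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))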

prospects = load_prospects("apollo_export.csv")
print(f"{len(prospects)} prospects to enrich")
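Exports from different sources rarely agree on how a domain is written, so it may be worth normalizing and de-duplicating before paying for detections. This is a minimal sketch that assumes the `domain` column can hold a bare host, a `www.` host, or a full URL. `normalize_domain` and `dedupe_prospects` are hypothetical helper names, not part of any library.

```python
from urllib.parse import urlparse

def normalize_domain(raw: str) -> str:
    """Reduce 'https://www.Example.com/path' or 'Example.com' to 'example.com'."""
    value = raw.strip().lower()
    if "://" not in value:
        value = "//" + value  # lets urlparse treat the text as a network location
    host = urlparse(value).hostname or ""
    return host.removeprefix("www.")

def dedupe_prospects(rows: list[dict]) -> list[dict]:
    """Keep the first row per normalized domain and drop rows with no domain."""
    seen = set()
    unique = []
    for row in rows:
        domain = normalize_domain(row.get("domain", ""))
        if domain and domain not in seen:
            seen.add(domain)
            unique.append({**row, "domain": domain})
    return unique
```

Running `dedupe_prospects(load_prospects("apollo_export.csv"))` before enrichment means each company is detected, and billed, once.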

Step 2: Enrich with tech stack

Call the actor with the list of URLs. The actor accepts an array of URLs and returns one record per URL with the detected stack.

import os
from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_TOKEN"])

def enrich_with_stack(prospects: list[dict]) -> list[dict]:
    urls = [f"https://{p['domain']}" for p in prospects]
    # One actor run for the whole batch; .call() waits for the run to finish
    run = client.actor("nexgendata/wappalyzer-replacement").call(
        run_input={"urls": urls, "render_js": True}
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    # Keyed on the exact URL string; a redirected or normalized URL in the
    # output will miss this lookup and leave the prospect with an empty stack
    by_url = {item["url"]: item.get("technologies", []) for item in items}
    for p in prospects:
        p["tech_stack"] = by_url.get(f"https://{p['domain']}", [])
        p["tech_names"] = [t["name"] for t in p["tech_stack"]]
    return prospects

enriched = enrich_with_stack(prospects)
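The exact-URL lookup in `enrich_with_stack` is fragile if the output `url` differs from the input, for example after a redirect to `www.` or the addition of a trailing slash. Whether the actor does this is an assumption on my part; if it does, matching on the host is more forgiving. A minimal sketch, with `host_key` and `attach_stacks` as hypothetical helper names:

```python
from urllib.parse import urlparse

def host_key(url_or_domain: str) -> str:
    """Lowercased host without scheme, path, or a leading 'www.'."""
    value = url_or_domain.strip().lower()
    if "://" not in value:
        value = "//" + value
    return (urlparse(value).hostname or "").removeprefix("www.")

def attach_stacks(prospects: list[dict], items: list[dict]) -> list[dict]:
    """Join actor output to prospects by host instead of by exact URL string."""
    by_host = {host_key(item["url"]): item.get("technologies", []) for item in items}
    out = []
    for p in prospects:
        stack = by_host.get(host_key(p["domain"]), [])
        out.append({**p, "tech_stack": stack, "tech_names": [t["name"] for t in stack]})
    return out
```

This returns new dicts rather than mutating the input list, which makes it easier to re-run enrichment later without carrying stale fields along.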

A typical record after enrichment:

{
  "company_name": "Acme Coffee",
  "domain": "acmecoffee.com",
  "tech_stack": [
    {"name": "Shopify", "version": null, "categories": ["Ecommerce"], "confidence": 100},
    {"name": "Klaviyo", "version": null, "categories": ["Marketing automation"], "confidence": 100},
    {"name": "Gorgias", "version": null, "categories": ["Live chat", "Help desk"], "confidence": 100},
    {"name": "Google Tag Manager", "version": null, "categories": ["Tag managers"], "confidence": 100},
    {"name": "Stripe", "version": null, "categories": ["Payment processors"], "confidence": 100}
  ],
  "tech_names": ["Shopify", "Klaviyo", "Gorgias", "Google Tag Manager", "Stripe"]
}

That is the data your SDR has been missing. From here, the entire outbound motion changes.
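Each detection in the record carries a `confidence` value, and it can help to decide up front which detections you are willing to name in an email. The sketch below assumes the 0–100 scale shown in the sample record. `confident_names` is a hypothetical helper, and the default threshold of 80 is a parameter to tune rather than a fixed rule.

```python
def confident_names(tech_stack: list[dict], min_confidence: int = 80) -> list[str]:
    """Names of detections at or above the confidence threshold."""
    return [t["name"] for t in tech_stack if t.get("confidence", 0) >= min_confidence]

stack = [
    {"name": "Shopify", "confidence": 100},
    {"name": "Klaviyo", "confidence": 50},
]
print(confident_names(stack))                       # ['Shopify']
print(confident_names(stack, min_confidence=50))    # ['Shopify', 'Klaviyo']
```

One design choice to consider: use the filtered names for "they run X" claims, but keep the unfiltered names for "they do not run Y" checks, since a weak Klaviyo signal is still a reason not to tell a prospect they are missing Klaviyo.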

Step 3: Segment by stack

The segmentation rule depends on what you sell. A few common shapes:

  • "They run X, pitch integration": Filter where "Stripe" in p["tech_names"], sequence with the Stripe-integration cold open.
  • "They run Y, pitch replacement": Filter where "Segment" in p["tech_names"], sequence with the Segment-replacement cold open.
  • "They run X but not Y, pitch addition": Filter where "Shopify" in p["tech_names"] and "Klaviyo" not in p["tech_names"], sequence with "you're on Shopify but not on Klaviyo, here's the playbook."
  • "They run nothing in our category, pitch first-time-buyer": Filter where not any(t in p["tech_names"] for t in COMPETITORS).

SEGMENTS = {
    "stripe_integration_play": lambda p: "Stripe" in p["tech_names"],
    "segment_replacement_play": lambda p: "Segment" in p["tech_names"],
    "shopify_no_klaviyo_play": lambda p: "Shopify" in p["tech_names"] and "Klaviyo" not in p["tech_names"],
    "no_competitor_play": lambda p: not any(t in p["tech_names"] for t in {"Mixpanel", "Amplitude", "Heap"}),
}

segmented = {name: [p for p in enriched if pred(p)] for name, pred in SEGMENTS.items()}
for name, lst in segmented.items():
    print(f"{name}: {len(lst)} prospects")
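The dict comprehension above places a prospect in every segment whose predicate matches, so a site running both Stripe and Segment lands in two lists. If each prospect should enter exactly one sequence, an ordered rule list with first-match-wins is one way to do it. This is a sketch: `PRIORITY_RULES` and `assign_segment` are hypothetical names, and the ordering shown (replacement before integration before gap) is an assumption you would set from your own conversion data.

```python
from typing import Callable, Optional

# Order matters: the first matching rule wins. This ordering is an assumption.
PRIORITY_RULES: list[tuple[str, Callable[[dict], bool]]] = [
    ("segment_replacement_play", lambda p: "Segment" in p["tech_names"]),
    ("stripe_integration_play", lambda p: "Stripe" in p["tech_names"]),
    ("shopify_no_klaviyo_play",
     lambda p: "Shopify" in p["tech_names"] and "Klaviyo" not in p["tech_names"]),
]

def assign_segment(prospect: dict) -> Optional[str]:
    """Return the first matching segment name, or None if nothing matches."""
    for name, predicate in PRIORITY_RULES:
        if predicate(prospect):
            return name
    return None

def segment_exclusively(prospects: list[dict]) -> dict[str, list[dict]]:
    """Group prospects so that each appears in at most one segment."""
    grouped: dict[str, list[dict]] = {name: [] for name, _ in PRIORITY_RULES}
    for p in prospects:
        name = assign_segment(p)
        if name is not None:
            grouped[name].append(p)
    return grouped
```

Prospects that match no rule are simply left out, which keeps them available for a generic or first-time-buyer sequence.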

Step 4: Per-segment sequences

Each segment gets its own cold-email sequence in your outbound tool. The first email in each sequence references the specific technology. Subsequent emails can be the same boilerplate, but the open carries the personalization weight.

Apollo, Outreach, and Salesloft all expose CRUD APIs for adding contacts to sequences. The Apollo flow:

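import os
import httpx

APOLLO_API_KEY = os.environ["APOLLO_API_KEY"]
# Placeholder: map each segment name to the ID of its sequence in your Apollo account
SEQUENCE_IDS: dict[str, str] = {}

def push_to_sequence(prospects: list[dict], sequence_id: str):
    # Rows need first_name and last_name in addition to company_name and domain.
    # Check endpoint paths and payload field names against the current Apollo API docs.
    with httpx.Client(headers={"X-API-Key": APOLLO_API_KEY}) as http:
        for p in prospects:
            r = http.post(
                "https://api.apollo.io/v1/contacts",
                json={
                    "first_name": p["first_name"],
                    "last_name": p["last_name"],
                    "organization_name": p["company_name"],
                    "website_url": p["domain"],
                    "custom_fields": {"detected_stack": ", ".join(p["tech_names"])},
                },
            )
            r.raise_for_status()  # stop on auth or validation errors
            contact_id = r.json()["contact"]["id"]
            r = http.post(
                f"https://api.apollo.io/v1/emailer_campaigns/{sequence_id}/add_contact_ids",
                json={"contact_ids": [contact_id]},
            )
            r.raise_for_status()

for segment_name, segment_prospects in segmented.items():
    # Skip segments that have no sequence configured yet
    if segment_name in SEQUENCE_IDS:
        push_to_sequence(segment_prospects, SEQUENCE_IDS[segment_name])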

Step 5: Make the email reference the actual stack

The cold open is where the work pays off. A few templates that consistently outperform generic copy in our customers' data:

Integration play (target runs a tool you integrate with):

Subject: {their_tool} + {your_tool}

Hi {first_name} — saw you're running {their_tool} on {company_domain}. Most {their_tool} teams we work with at the {employee_count} stage end up wiring it into {your_tool} for {specific_value_prop}. Last customer who did this saved {dollar_figure} in the first quarter.

Worth a 15-minute look at how the integration runs?

Replacement play (target runs a competitor):

Subject: {competitor} renewal coming up?

Hi {first_name} — noticed {competitor} is on {company_domain}. Three of our last five {industry} customers migrated from {competitor} when their renewal hit {price_point}. Migration is usually 2-3 weeks; we eat the dual-running cost.

Want the customer-references doc?

Stack-gap play (target runs the foundation but not the next layer):

Subject: You're on {foundation}, missing {layer}

Hi {first_name} — saw you're on {foundation} but not yet on a {category} layer. The {category} ROI on {foundation} is usually 4-8x within 90 days because of {specific_reason}. Happy to send the playbook our customers use.

Each of these is dramatically less generic than "Hi {first_name}, hope you're well," and the personalization is automated from the technographic enrichment, so your SDR does not have to do any per-prospect research.
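If you render the opener yourself rather than relying on merge fields in the outbound tool, a guard against naming a tool that was never detected is cheap insurance. The sketch below uses a shortened, hypothetical version of the integration template, and `render_open` is an illustrative helper; the prospect fields follow the enriched record shape from Step 2 plus a `first_name` column.

```python
# Shortened, hypothetical variant of the integration-play template above
INTEGRATION_OPEN = (
    "Hi {first_name}, saw you're running {their_tool} on {company_domain}. "
    "Worth a 15-minute look at how it connects to {your_tool}?"
)

def render_open(template: str, prospect: dict, their_tool: str, your_tool: str) -> str:
    """Fill the template, refusing to mention a tool that was not detected."""
    if their_tool not in prospect.get("tech_names", []):
        raise ValueError(f"{their_tool} was not detected on {prospect['domain']}")
    return template.format(
        first_name=prospect["first_name"],
        their_tool=their_tool,
        company_domain=prospect["domain"],
        your_tool=your_tool,
    )

example = {"first_name": "Dana", "domain": "acmecoffee.com", "tech_names": ["Shopify", "Stripe"]}
print(render_open(INTEGRATION_OPEN, example, "Stripe", "AcmeTax"))
```

Here "AcmeTax" and the prospect values are made-up sample data. Raising on a missing tool means a segmentation bug fails loudly at render time instead of reaching a prospect's inbox.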

What this costs

Three real-world cost scenarios.

Scenario A: Mid-market SaaS, 5,000 prospects/month.

  • Enrichment: 5,000 × $0.01 = $50/month
  • Outbound tool: Apollo Pro, $99/seat/month × 3 SDRs = $297
  • Total per-prospect cost: ~$0.07
  • Equivalent BuiltWith Pro plan: $495/month (10k lookups), 10x the enrichment cost

Scenario B: Enterprise SaaS, 20,000 prospects/month.

  • Enrichment: 20,000 × $0.01 = $200/month
  • Outbound tool: Outreach Standard, $130/seat/month × 8 SDRs = $1,040
  • Total per-prospect cost: ~$0.06
  • Equivalent BuiltWith Pro Enterprise plan: $995+/month for comparable volume

Scenario C: Lean B2B startup, 1,000 prospects/month.

  • Enrichment: 1,000 × $0.01 = $10/month
  • Outbound tool: Apollo Basic, $49/seat/month × 1 SDR = $49
  • Total per-prospect cost: ~$0.06
  • Equivalent BuiltWith plan: $295/month minimum

In every scenario, technographic enrichment is the cheapest line item in the outbound stack. There is no excuse not to do it.
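The per-prospect figures in the three scenarios can be reproduced with a few lines of arithmetic. This sketch uses only the seat prices, seat counts, and volumes listed above; `cost_per_prospect` is a hypothetical helper name.

```python
def cost_per_prospect(prospects: int, seats: int, seat_price_usd: float,
                      per_detection_usd: float = 0.01) -> float:
    """(enrichment + outbound tool seats) / prospects, rounded to cents."""
    enrichment = prospects * per_detection_usd
    tooling = seats * seat_price_usd
    return round((enrichment + tooling) / prospects, 2)

print(cost_per_prospect(5_000, seats=3, seat_price_usd=99))     # Scenario A: 0.07
print(cost_per_prospect(20_000, seats=8, seat_price_usd=130))   # Scenario B: 0.06
print(cost_per_prospect(1_000, seats=1, seat_price_usd=49))     # Scenario C: 0.06
```

In each case the seat licences dominate the total, which is the point the scenarios are making.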

Re-enrichment cadence

Tech stacks change. A prospect that was on Marketo six months ago might be on HubSpot today. A prospect on Shopify might have migrated to Shopify Plus. The actionable rule:

  • Re-enrich active prospects every 60 days. Stack churn is roughly 8–12% per quarter for the relevant tools.
  • Re-enrich before a renewal-window outreach. If you have intel that the prospect's renewal is in Q3, re-enrich in late Q2.
  • Re-enrich after a funding event. Companies that raised a Series A in the last 30 days are 4-7x more likely to change tools in the next 90 days. (See the funding-driven outbound playbook for the full pipeline.)

A nightly cron that re-enriches the top 200 accounts per SDR costs roughly $60/month per SDR in detection fees (200 × $0.01 × 30 nights) and pays for itself the first time your SDR catches a Marketo-to-HubSpot migration before your competitor does.
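The cadence rules can be turned into a small selection function that a scheduled job calls before each run. This is a sketch under stated assumptions: `last_enriched` and `last_funding_date` are hypothetical fields you would have to store on each prospect yourself, and the 60-day and 30-day windows come from the rules above.

```python
from datetime import date, timedelta

def needs_reenrichment(prospect: dict, today: date,
                       max_age_days: int = 60, funding_window_days: int = 30) -> bool:
    """True if the stack data is old or a recent funding event postdates it."""
    last = prospect.get("last_enriched")        # hypothetical field: date of last enrichment
    funded = prospect.get("last_funding_date")  # hypothetical field: date of latest round
    if last is None or (today - last) >= timedelta(days=max_age_days):
        return True
    # A funding event after the last enrichment, and still inside the window
    if (funded is not None and last < funded
            and (today - funded) <= timedelta(days=funding_window_days)):
        return True
    return False

def select_due(prospects: list[dict], today: date) -> list[dict]:
    """Prospects that should go into the next enrichment run."""
    return [p for p in prospects if needs_reenrichment(p, today)]
```

The renewal-window rule is left out because it depends on intel that usually lives in CRM notes; it could be added as a third condition once that date is captured in a field.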

What about firmographics?

Tech stack is the personalization signal. Firmographics (industry, employee count, geography, revenue band) are still the targeting signal. The right pipeline does both:

Apollo / ZoomInfo / Sales Nav  →  firmographic filter (industry, headcount)
                                        |
                                        v
                  wappalyzer-replacement →  technographic filter (stack)
                                        |
                                        v
                           Per-segment outbound sequence

A typical funnel: 50,000 firmographic candidates → 5,000 ICP-matched (10% pass rate on firmographics) → 1,200 technographic-matched into one of N segments (24% pass rate on stack filters) → 8% reply rate on personalized emails → 96 replies → 17 booked meetings.

Compare to the unsegmented version: 5,000 firmographic-matched contacts → 1.2% reply rate on generic emails → 60 replies → 4 booked meetings.

Same firmographic input, roughly 4x the booked meetings from about a quarter of the emails, for $50 of enrichment (5,000 × $0.01).
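The two funnels can be checked with a short calculation. The reply rates come from the funnel description above, and the reply-to-meeting rates (18% for stack-specific replies, 6% for generic ones) are the figures quoted near the top of this post. `funnel` is a hypothetical helper name.

```python
def funnel(emails: int, reply_rate: float, meeting_rate: float) -> tuple[int, int]:
    """Return (replies, booked meetings), each rounded to whole numbers."""
    replies = round(emails * reply_rate)
    meetings = round(replies * meeting_rate)
    return replies, meetings

stack_path = funnel(1_200, reply_rate=0.08, meeting_rate=0.18)
generic_path = funnel(5_000, reply_rate=0.012, meeting_rate=0.06)
print(stack_path)     # (96, 17)
print(generic_path)   # (60, 4)
```

The stack-segmented path sends 1,200 emails rather than 5,000, so the gap per email sent is wider than the gap in total meetings.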

Common mistakes

Sequencing every prospect through every segment. If a prospect runs both Stripe and Segment, do not send them both the Stripe-integration email and the Segment-replacement email. Build a priority order in your segmentation rules and assign each prospect to exactly one sequence.

Treating low-confidence detections as fact. The actor returns a confidence score (0–100). Filter to confidence >= 80 before sequencing on a detection. A Klaviyo detection at confidence 50 means "we found a Klaviyo-shaped pattern but cannot confirm," so don't write your cold open as if it is gospel.

Ignoring the "what they don't run" signal. Half the value of stack data is the gap, not the presence. "They run Shopify but not Klaviyo" is just as actionable as "they run Klaviyo." Build your segmentation rules to use both.

Stale stack data driving stale pitches. The sales team that pitches "we replace Marketo" to a prospect who migrated off Marketo six months ago looks worse than the team that pitches generically. Re-enrich.

Pitching integrations the prospect's stack cannot support. "We integrate with HubSpot" is not useful to a prospect on Salesforce. Make sure your segmentation actually filters on the integration target, not the broader category.

A note on data freshness

The actor fingerprints in real time, on demand. Each run pulls the live page and runs the fingerprint script in the page context. There is no vendor cache, which means there is no staleness — but it also means each run costs a detection.

For sales workflows, this is the right trade-off. You want the stack the prospect runs today, not the stack they ran the last time BuiltWith crawled them (which may have been months ago for long-tail domains). The price of fresh data is one detection per query, and at $0.01, that price is correct.

Putting it together

The cold-outbound performance gap between teams that segment by stack and teams that don't is not a 10% gain. It is a structural advantage. The teams who built this pipeline three years ago using BuiltWith Enterprise contracts are now competing against teams who built the same thing for $50/month using OSS-fingerprint actors. The tooling has democratized; the question is whether your SDRs are using it.

If you want to skip the build, the wappalyzer-replacement actor handles the enrichment step end to end. Feed in domains, get back a Wappalyzer-format JSON list of detected technologies per URL. Wire it into your CRM and your outbound tool with the 80 lines of Python above.


NexGenData publishes 195+ actors for B2B sales workflows: tech-stack detection, contact discovery, firmographic enrichment, and CRM data hygiene. All pay-per-result, no contracts.

Related actors for the SDR pipeline:

  • contact-info-scraper — pulls email, phone, and team-member contact info from any company website. Pairs with stack enrichment to turn a domain into a contactable lead.
  • lead-list-enricher — bulk firmographic + technographic enrichment for existing prospect lists. Feed in company names, get back domain + stack + employee estimates.
  • company-data-aggregator — eight-source OSINT aggregator (WHOIS, DNS, CT logs, GitHub, npm, tech headers) for full company profiles when you need more than just stack.
