DEV Community

Zackrag

How to Build an OSINT-Powered B2B Prospecting Workflow in 2026 (Without Getting Banned)

Last year I ran the same LinkedIn Sales Navigator export through three enrichment APIs. Apollo matched 61% of the emails. Hunter.io matched 54%. An OSINT-first pipeline I'd built in n8n — pulling from public sources before hitting any paid API — matched 79% and cost roughly $0.003 per contact. The delta wasn't magic. It was sequence.

Security researchers have been building layered intelligence workflows for years. Sales teams mostly buy a database subscription and call it done. In 2026, that gap is expensive.

Why Database-First Prospecting Leaves Money on the Table

Most SDR stacks start with a contact database — ZoomInfo, Apollo, or Lusha — and treat enrichment as a one-time step at the top of the funnel. The problem: these databases are 3–18 months stale on average. Job titles change. Companies restructure. Decision-makers who were Director of Engineering in Q1 are VP by Q3.

I scraped 500 LinkedIn profiles in February 2025 and cross-referenced them against Apollo's database three months later. 22% of the contacts had a title or company change that Apollo hadn't caught. That's roughly 1 in 5 outreach sequences targeting the wrong person.

The OSINT approach inverts this. Instead of pulling a static list and enriching it once, you monitor public signals continuously and trigger enrichment only when something changes.

The Three-Layer Signal Stack

A functional OSINT prospecting workflow has three distinct layers:

Layer 1 — Discovery: Find who to target using public signals — job postings, press releases, funding announcements, GitHub activity, conference speaker listings.

Layer 2 — Enrichment: Get verified contact data for those targets using a cascade of tools, free first, paid only when the free tier misses.

Layer 3 — Trigger: Detect when something changes about a target (job change, new funding, tech stack shift) and fire a sequence automatically.

Each layer feeds the next. Discovery without triggers is just a list. Triggers without enrichment are just noise.

Free OSINT Sources That Actually Surface Decision-Maker Signals

Most guides skip this part. Here's what I've actually used:

Phonebook.cz indexes email addresses, domains, and URLs from breach data dumps. Not for emailing those addresses — that's a fast path to a spam complaint — but for pattern recognition. If you see that a company's email format is firstname.lastname@domain.com across 40 different employees in the dataset, you don't need to pay to discover that pattern.
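
The pattern-recognition step can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function names (`local_parts`, `classify_format`, `dominant_format`, `apply_format`) and the five candidate formats are assumptions made for the example, and the input is assumed to be a list of (first name, last name, email) rows you already hold for one domain.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def local_parts(first: str, last: str) -> Dict[str, str]:
    """Candidate local parts for a name, keyed by a format label (illustrative set)."""
    first, last = first.lower(), last.lower()
    return {
        "first.last": f"{first}.{last}",
        "first_last": f"{first}_{last}",
        "firstlast": f"{first}{last}",
        "flast": f"{first[0]}{last}",
        "first": first,
    }


def classify_format(first: str, last: str, email: str) -> Optional[str]:
    """Return the format label an address follows, or None if unrecognized."""
    if not first or not last or "@" not in email:
        return None
    local = email.split("@", 1)[0].lower()
    for label, expected in local_parts(first, last).items():
        if local == expected:
            return label
    return None


def dominant_format(known: List[Tuple[str, str, str]]) -> Tuple[Optional[str], float]:
    """Most common format among known rows, plus the share of rows that follow it."""
    labels = [classify_format(f, l, e) for f, l, e in known]
    counts = Counter(label for label in labels if label is not None)
    if not counts:
        return None, 0.0
    label, hits = counts.most_common(1)[0]
    return label, hits / len(known)


def apply_format(label: str, first: str, last: str, domain: str) -> str:
    """Build a guessed address for a new contact using a known format label."""
    return f"{local_parts(first, last)[label]}@{domain.lower()}"
```

The share returned by `dominant_format` is one possible way to express how consistent a domain's format is; treating it as a confidence score is an assumption of this sketch, not something any of the tools above provide.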

Maltego is the de facto graph visualization tool in the security world. The Community Edition is genuinely free and handles most research tasks. The most useful transforms for sales: domain ownership, corporate structure linking, and subsidiary mapping. I've used it to discover that a target prospect's legal entity sat three shell companies removed from the brand I actually wanted to reach, which explained why every enrichment API was returning wrong contacts.

Maigret runs a username across 3,000+ sites and returns every platform where that person has an account. I've fed its output into enrichment workflows to correlate a LinkedIn profile with a GitHub account, whose public commit history often exposes a work email at no cost.
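
Once a GitHub username is known, the commit-history lookup might look roughly like the sketch below. It assumes GitHub's REST "list commits" endpoint (`GET /repos/{owner}/{repo}/commits`), whose items carry the author email under `commit.author.email`. The helper names are hypothetical, and choosing which repository to query is left to you.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List, Optional

# GitHub's privacy-preserving addresses are useless for outreach, so skip them.
NOREPLY_SUFFIX = "@users.noreply.github.com"


def fetch_commits(owner: str, repo: str, author: str, token: Optional[str] = None) -> List[dict]:
    """Fetch up to 100 commits by `author` from one public repository."""
    query = urllib.parse.urlencode({"author": author, "per_page": 100})
    url = f"https://api.github.com/repos/{owner}/{repo}/commits?{query}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:  # a token raises the hourly request limit
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_commit_emails(commits: List[dict]) -> Dict[str, int]:
    """Count non-noreply author emails found in a parsed commits payload."""
    counts: Dict[str, int] = {}
    for item in commits:
        author = (item.get("commit") or {}).get("author") or {}
        email = (author.get("email") or "").lower()
        if not email or email.endswith(NOREPLY_SUFFIX):
            continue
        counts[email] = counts.get(email, 0) + 1
    return counts
```

Counting occurrences rather than returning the first hit is a deliberate choice in this sketch: an address that appears across many commits is more likely to be current than one that shows up once.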

Google dorks remain underused. site:linkedin.com/in "VP of Engineering" "company.com" combined with cached page lookups has surfaced more accurate contact data for me than RocketReach on several niche ICPs. Slow to automate safely, but zero cost.
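
For reference, the query above can be assembled and URL-encoded as follows. This only builds strings: it sends nothing, and the function names are made up for the example.

```python
from urllib.parse import quote_plus


def build_dork(title: str, company_domain: str) -> str:
    """Compose the LinkedIn profile dork for a job title and company domain."""
    return f'site:linkedin.com/in "{title}" "{company_domain}"'


def search_url(query: str) -> str:
    """Percent-encode a query into a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)
```

Keeping query construction separate from any sending logic makes it easier to review the exact queries before they are run by hand or through a throttled job.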

None of these replace paid enrichment. They reduce how often you need it.

Chaining Enrichment + Job-Change Triggers in n8n

Here's the workflow I've been running since Q4 2025:

Step 1: An RSS/webhook monitor watches for trigger signals — job changes via LinkedIn's public activity feed, funding rounds via Crunchbase's free tier, hiring signals via target companies' careers pages.

Step 2: n8n catches the trigger and runs a free enrichment pass first. This includes Phonebook.cz domain pattern lookup, GitHub email extraction via the public commit API, and email format guessing based on any already-known contacts at the same domain.

Step 3: If Step 2 returns a verified email (confidence score ≥ 0.8 based on pattern match frequency), skip paid enrichment entirely. Otherwise, cascade: Hunter.io free tier (25 searches/month) → Snov.io (50 credits/month free) → Apollo API as the paid fallback.
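
The skip-or-cascade decision in Step 3 reduces to a short function. The sketch below assumes each provider is wrapped in a callable that returns an email or `None`; the real Hunter.io, Snov.io, and Apollo calls are not shown, and the `enrich` name and signature are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A provider takes (first, last, domain) and returns an email or None.
Provider = Callable[[str, str, str], Optional[str]]


def enrich(
    first: str,
    last: str,
    domain: str,
    free_guess: Tuple[Optional[str], float],
    providers: List[Tuple[str, Provider]],
    threshold: float = 0.8,
) -> Tuple[Optional[str], str]:
    """Return (email, source). Paid providers are only called when the free pass is weak."""
    email, confidence = free_guess
    if email and confidence >= threshold:
        return email, "free-osint"
    # Providers are tried in the order given, cheapest first; stop at the first hit.
    for name, lookup in providers:
        result = lookup(first, last, domain)
        if result:
            return result, name
    return None, "unresolved"
```

Returning the source alongside the address lets you track how often each tier actually resolves a contact, which is what tells you whether the paid fallback is earning its cost.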

Step 4: Validated address goes into a 3-touch sequence. The first message references the trigger signal: "I saw you just joined [Company] as VP of Sales — curious if you're rebuilding the stack."

The implementation detail most guides miss is deduplication. LinkedIn sometimes fires duplicate job-change signals for the same person. Without an idempotency check before enrichment, you'll send the same sequence twice and burn the relationship. In n8n this is a single "check if contact ID exists in Supabase/Airtable, skip if true" node before any enrichment call.
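
Outside n8n, the same idempotency check can be sketched with a unique key and an insert that ignores duplicates. SQLite stands in for Supabase/Airtable here. The table name and the choice to key on contact ID plus a signal label (rather than contact ID alone) are assumptions of this example.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store that remembers which signals were already handled."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_signals ("
        "contact_id TEXT NOT NULL, signal TEXT NOT NULL, "
        "PRIMARY KEY (contact_id, signal))"
    )
    return conn


def claim_signal(conn: sqlite3.Connection, contact_id: str, signal: str) -> bool:
    """Return True the first time a (contact, signal) pair is seen, False on duplicates."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO seen_signals (contact_id, signal) VALUES (?, ?)",
        (contact_id, signal),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when the primary key already existed.
    return cursor.rowcount == 1
```

Only call enrichment and sequencing when `claim_signal` returns `True`. Doing the check and the write in one statement avoids the race where two duplicate signals both pass a separate "exists?" lookup.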

Make.com is a valid alternative to n8n. Make's pre-built modules for Apollo, Hunter.io, and Phantombuster are more polished out of the box. n8n's self-hosted version is cheaper at scale and supports arbitrary HTTP calls without needing a module — useful for OSINT APIs that don't have official integrations built yet.

Rate Limits That Actually Matter

This is what gets accounts flagged, and most guides are deliberately vague. Here's what I've measured:

| Source | Safe sustained rate | What actually triggers detection |
| --- | --- | --- |
| LinkedIn (browser session) | ~80 profile views/day | >150/day, or identical query pattern at same time daily |
| Hunter.io free API | 25 searches/month hard cap | N/A: hits the cap, no ban |
| Apollo API | 200 req/min documented | Undocumented: repeated identical queries < 5 seconds apart |
| Phantombuster LinkedIn agent | 30 profiles/run, 2 runs/day | > 4 runs/day triggers LinkedIn rate-limit flag |
| Google dork (unauthenticated) | ~80–100 queries/hour | CAPTCHAs start around 50 rapid-fire queries |
| GitHub commit email API | 5,000 req/hr with token | Easy to stay under; 60/hr unauthenticated |
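
One way to avoid the repeated-identical-query pattern is to enforce a minimum gap per query key before sending. The sketch below uses the 5-second figure from the Apollo row as its default. The `QuerySpacer` class is hypothetical and only computes how long to wait; the caller is expected to sleep for that long.

```python
import time
from typing import Callable, Dict


class QuerySpacer:
    """Tracks when each distinct query was last sent and how long to wait before repeating it."""

    def __init__(self, min_gap_seconds: float = 5.0, clock: Callable[[], float] = time.monotonic):
        self.min_gap = min_gap_seconds
        self.clock = clock
        self._last_sent: Dict[str, float] = {}

    def wait_needed(self, query_key: str) -> float:
        """Return seconds to wait before sending `query_key`, and record the planned send time."""
        now = self.clock()
        last = self._last_sent.get(query_key)
        wait = 0.0 if last is None else max(0.0, self.min_gap - (now - last))
        # Assume the caller sleeps for `wait` and then sends immediately.
        self._last_sent[query_key] = now + wait
        return wait
```

Typical use would be `time.sleep(spacer.wait_needed(key))` right before the HTTP call, with the key built from the endpoint and its parameters.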

The LinkedIn numbers are the ones that bite people. LinkedIn's detection appears to weight pattern regularity more than raw volume. Hitting exactly 100 profiles at 9:01am every day is riskier than hitting 120 profiles at scattered times. Randomized delays between 40–120 seconds and varied session lengths matter more than staying under a specific daily cap.
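
A randomized schedule along those lines can be generated ahead of time. This sketch uses the 40–120 second range from the paragraph above; the function name and the random start offset are assumptions, and nothing here talks to LinkedIn.

```python
import random
from typing import List, Optional


def jittered_offsets(
    n_actions: int,
    min_delay: float = 40.0,
    max_delay: float = 120.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return start offsets (seconds from session start) with a random gap between actions."""
    rng = rng or random.Random()
    offsets: List[float] = []
    # Random initial offset so sessions do not begin at the same second every day.
    t = rng.uniform(0.0, max_delay)
    for _ in range(n_actions):
        offsets.append(t)
        t += rng.uniform(min_delay, max_delay)
    return offsets
```

With these defaults the average gap is about 80 seconds, so a session of 80 actions runs for a little under two hours. Varying `n_actions` from day to day covers the "varied session lengths" part.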

Clay abstracts most of this away if you're doing waterfall enrichment at volume — it handles the cascade across Apollo, Hunter.io, Lusha, and others with managed rate limits. Worth it if you'd rather pay for that abstraction than maintain your own n8n logic.

The Legal Line (It's Thinner Than You Think)

OSINT for sales sits in a gray zone that got narrower after GDPR enforcement actions in 2024 and 2025.

The rule I follow: publicly visible ≠ freely processable. Just because a person's work email appears in a Phonebook.cz index doesn't give you a legitimate interest basis to add them to a cold email sequence under GDPR Article 6(1)(f). Legitimate interest requires a documented balancing test that most SDR teams have never performed.

Practically: avoid scraping personal email addresses (anything not on a company domain). Automating outreach to personal Gmail or Yahoo addresses sourced from OSINT databases is the fastest path to a spam complaint that terminates your sending domain.
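
A simple guard for that rule is to accept an address only when it sits on the target company's domain. The freemail list below is a small illustrative sample, not an exhaustive one, and the function name is hypothetical.

```python
# Illustrative sample of consumer mail domains; extend it for your own markets.
PERSONAL_DOMAINS = frozenset({
    "gmail.com", "googlemail.com", "yahoo.com", "outlook.com", "hotmail.com",
    "icloud.com", "aol.com", "proton.me", "protonmail.com",
})


def is_company_address(email: str, company_domain: str) -> bool:
    """True only if the address is on the company's domain (or a subdomain of it)."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return False
    if domain in PERSONAL_DOMAINS:
        return False
    company_domain = company_domain.strip().lower()
    return domain == company_domain or domain.endswith("." + company_domain)
```

Checking against the company domain, rather than only against a blocklist, is the stricter design: an unknown freemail provider fails the check by default instead of slipping through.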

For US targets, CAN-SPAM is more permissive: you need a working opt-out mechanism, accurate sender headers, non-deceptive subject lines, and a valid physical postal address in each message. That said, your ESP's Terms of Service are typically stricter than CAN-SPAM, and your deliverability reputation matters more than the legal floor.

OSINT vs. Database-First: What the Numbers Actually Look Like

| Approach | Coverage at 1k contacts | Est. cost per verified contact | Data freshness | Setup effort |
| --- | --- | --- | --- | --- |
| ZoomInfo database-first | 85%+ for US enterprise | $0.50–$2.00 | 3–18 months | Low |
| Apollo database-first | 70–80% | $0.10–$0.30 | 2–12 months | Low |
| Free OSINT cascade only | 35–55% | ~$0.001 | Near real-time | High |
| Hybrid OSINT + paid fallback | 75–82% | $0.02–$0.08 | Near real-time | Medium |
| Clay waterfall | 80–88% | $0.05–$0.15 | 1–6 months | Medium |

The hybrid row is where the real ROI is. The difference between 75% and 85% coverage sounds marginal, but at 1,000 contacts that's 100 additional valid email addresses, which at a typical 3% reply rate is about 3 extra replies from the same list.
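
The arithmetic behind that coverage comparison, written out so you can plug in your own list size and reply rate. The function name is made up for the example; the inputs are the figures quoted above.

```python
from typing import Tuple


def coverage_gain(
    contacts: int, coverage_low: float, coverage_high: float, reply_rate: float
) -> Tuple[int, float]:
    """Return (extra valid addresses, expected extra replies) from a coverage improvement."""
    extra_addresses = round(contacts * (coverage_high - coverage_low))
    return extra_addresses, extra_addresses * reply_rate


# 1,000 contacts, 75% vs 85% coverage, 3% reply rate
extra_addresses, extra_replies = coverage_gain(1000, 0.75, 0.85, 0.03)
```

For these inputs the result is 100 extra addresses and an expected 3 extra replies.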

What I Actually Use

For most workflows: Maltego Community Edition for account-structure mapping → n8n for trigger orchestration and deduplication → Hunter.io free tier first → Snov.io second → Apollo API as the paid fallback. Clay is genuinely good if you're running this at scale and don't want to maintain your own waterfall logic.

For Twitter/X and Facebook profile-based lookups specifically — when I'm trying to tie a social presence back to a work email — Ziwa has been faster for me than PDL's direct API and cheaper than Clearbit for that specific use case. Worth testing if your ICPs are active on those platforms.

The honest conclusion: there's no single tool that wins every category. The OSINT layer isn't a replacement for paid enrichment — it's what makes paid enrichment 4–10x cheaper by narrowing when you actually need it.
