How to Turn Podcast Guest Databases into a Pre-Warmed ICP Prospecting List

#sales #productivity #webdev #automation

Three months ago I pulled 500 guests from B2B SaaS and DevOps podcasts and ran them through a full enrichment stack. The meeting-conversion rate was 4.1x that of the matched cold list my team was working in parallel. I'm not going to claim the methodology is magic: there are real reasons why it works, and real places where it breaks down.

Here's the complete workflow.

Why Podcast Guests Convert at All

A podcast guest has done something useful for you before you ever contacted them: they've gone on record about a problem they're solving, a decision they made, or a priority they're betting on. That's not a data point you can buy from any enrichment API.

When I cold-call someone off a scraped LinkedIn list, I'm guessing at their pain. When I reach out to someone who just spent 40 minutes on a podcast explaining why they rebuilt their data infrastructure from scratch, I have a conversation starter that's real, current, and specific to them.

The second reason: podcast guests are self-selected communicators. They agreed to talk publicly. They're not the same population as someone who last updated their LinkedIn profile in 2019 and ignores every InMail.

The conversion lift I saw—4.1x over a 90-day measurement window—held across both VP-level and Director-level personas, which surprised me. I expected it to compress at the Director level. It didn't.

Step 1: Build Your Raw Guest List with the Listen Notes API

Listen Notes is the starting point I use. It indexes over 3.7 million podcasts and 189 million episodes with a clean REST API.

The endpoint that matters here is /api/v2/search. The useful parameters:

q: your topic keyword ("data infrastructure" OR "B2B SaaS growth" OR "RevOps")
type: set to episode
published_after: a millisecond timestamp; set it to 90 days ago so only recent episodes come back (stale guests are less valuable)
genre_ids: 93 (Business), 127 (Technology); confirm the IDs against the /api/v2/genres endpoint before relying on them
only_in: title,description — you don't want unrelated matches from transcript body

Each result includes the podcast title, the episode title, the description, pub_date_ms, and listennotes_url. The description is where guests are usually named and their roles mentioned.
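
A minimal sketch of that search call, using only the standard library. The base URL, the X-ListenAPI-Key header, the published_after parameter name, and the "results" key are my reading of the public Listen Notes docs rather than something stated above, so check them against the current API reference. The function names are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint URL; verify against the Listen Notes API reference.
SEARCH_URL = "https://listen-api.listennotes.com/api/v2/search"
DAY_MS = 24 * 60 * 60 * 1000


def build_search_params(query, genre_ids, days_back=90, now_ms=None):
    """Return query parameters for an episode search limited to recent episodes."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return {
        "q": query,
        "type": "episode",
        # Assumed parameter name for "published no earlier than" (milliseconds).
        "published_after": now_ms - days_back * DAY_MS,
        "genre_ids": ",".join(str(g) for g in genre_ids),
        "only_in": "title,description",
    }


def search_episodes(api_key, params):
    """Send one search request and return the list of episode results."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers={"X-ListenAPI-Key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumed response shape: a top-level "results" list.
    return payload.get("results", [])
```

Keeping parameter construction separate from the network call makes the 90-day window easy to test without spending API quota. A real run would also need to page through results, which this sketch leaves out.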

From a search returning ~600 raw episodes, parsing the description field for titles like "VP", "Head of", "Director", "Founder", "CTO" filtered that down to 210 episodes. From those, I extracted 183 distinct guests with enough identifying information (full name + company) to attempt enrichment.
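
Here is roughly what that description parse can look like. This is a simplified sketch, not my production regex: it only catches the "Name, Title at Company" shape, the keyword list is the one from the paragraph above, and extract_guests is a hypothetical helper name.

```python
import re

# Seniority keywords from the text; extend for your own ICP.
TITLE_WORDS = r"(?:VP|Head of|Director|Founder|CTO)"

GUEST_PATTERN = re.compile(
    r"\b(?P<name>[A-Z][A-Za-z'-]+(?: [A-Z][A-Za-z'-]+){1,2})"  # 2-3 capitalized words
    r",\s+"
    r"(?P<title>" + TITLE_WORDS + r"\b[^,.;]*?)"                # title starting with a keyword
    r"\s+at\s+"
    r"(?P<company>[A-Z][\w&-]*(?: [A-Z][\w&-]*)*)"              # capitalized company words
)


def extract_guests(description):
    """Return distinct {name, title, company} dicts found in one description."""
    guests, seen = [], set()
    for match in GUEST_PATTERN.finditer(description):
        guest = {key: value.strip() for key, value in match.groupdict().items()}
        dedup = (guest["name"].lower(), guest["company"].lower())
        if dedup not in seen:
            seen.add(dedup)
            guests.append(guest)
    return guests
```

It misses guests introduced as "Founder of Acme" or by first name only, and it can swallow a capitalized word that happens to precede the name. That is the kind of fragility that makes the extraction step the weakest part of the pipeline.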

Alternative source: Podchaser has better guest metadata for its indexed shows—it explicitly separates host from guest records—but the API is more expensive and the coverage is narrower than Listen Notes.

Step 2: Filter Against Your ICP Before Spending on Enrichment

Enrichment credits are not free. Every record PDL resolves costs a credit, whether or not that person fits your ICP. Running 183 guests through PDL without qualification is how you burn budget on irrelevant records.

I filter before enriching using the data already in the episode metadata:

Company name against an exclusion list: agencies, podcast production companies, consulting firms. If the company name ends in "Media", "Studio", or "Agency" I skip them.
Title seniority signal: I want VP+ or Founder-level. The description text usually has enough for a regex pass.
Episode topic match to ICP pain: I read a two-sentence summary for each remaining candidate—yes, manually. This takes 20 minutes for 80 records. It's worth it.
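
The first two checks are mechanical, so they can be sketched in a few lines. The excluded suffixes come from the list above; the exact seniority pattern and the passes_prefilter name are assumptions for illustration, and the topic-match review stays manual.

```python
import re

# Company names whose last word is one of these get skipped.
EXCLUDED_LAST_WORDS = {"media", "studio", "agency"}

# Assumed reading of "VP+ or Founder-level"; adjust to your own ICP.
SENIOR_TITLE = re.compile(
    r"\b(?:S?VP|EVP|C[A-Z]O|Vice President|Founder|Co-?[Ff]ounder|Chief)\b"
)


def passes_prefilter(guest):
    """Return True if the guest survives the company and seniority checks."""
    words = guest["company"].lower().split()
    if not words or words[-1] in EXCLUDED_LAST_WORDS:
        return False
    return bool(SENIOR_TITLE.search(guest["title"]))
```

Matching on the last word rather than a raw string suffix avoids rejecting a company like "Intermedia" just because its name happens to end in "media".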

After filtering, my 183 guests compressed to 61 for enrichment. The filtering saved me ~$35 in API costs and kept the final list clean enough that my SDR could personalize outreach without wading through irrelevant contacts.

The Enrichment Waterfall (Where Most Guides Stop Too Early)

Most articles about contact enrichment recommend one tool and call it done. In practice, no single provider has complete coverage, especially for niche B2B personas who don't have heavy LinkedIn footprints.

My waterfall runs in this order:

Layer 1 — Identity resolution via PDL

PDL's /person/enrich endpoint takes name + company and returns a full profile: work email, LinkedIn URL, job title, company size, industry, and sometimes mobile. Coverage on my test set was 73% (45 of 61 records returned a match). For people with less common names or at smaller companies, PDL often comes up empty.
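
A hedged sketch of the Layer 1 call. The v5 base URL, the X-Api-Key header, the 404-on-no-match behavior, and the response field names (work_email, linkedin_url, job_title, job_company_size, industry, mobile_phone) are my assumptions about PDL's API, not details given above, so confirm them in the PDL reference before using this.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumed URL for the person enrichment endpoint.
PDL_ENRICH_URL = "https://api.peopledatalabs.com/v5/person/enrich"

# Assumed response field names for the profile attributes mentioned in the text.
CONTACT_FIELDS = (
    "work_email", "linkedin_url", "job_title",
    "job_company_size", "industry", "mobile_phone",
)


def build_pdl_params(name, company):
    """Name + company is the only input available from episode metadata."""
    return {"name": name, "company": company}


def extract_contact(profile):
    """Keep only the fields the workflow uses; missing ones become None."""
    return {field: profile.get(field) for field in CONTACT_FIELDS}


def enrich_person(api_key, name, company):
    """Return a trimmed profile dict, or None when PDL has no match."""
    url = PDL_ENRICH_URL + "?" + urllib.parse.urlencode(build_pdl_params(name, company))
    request = urllib.request.Request(url, headers={"X-Api-Key": api_key})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            data = json.load(response).get("data") or {}
    except urllib.error.HTTPError as error:
        if error.code == 404:  # assumed: 404 signals "no matching profile"
            return None
        raise
    return extract_contact(data)
```

Returning None for a miss makes it easy to route the record to the next layer.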

Layer 2 — Domain-based email via Hunter.io

For the 16 records PDL couldn't resolve, I ran Hunter.io's Domain Search against the company domain. Hunter reports the domain's detected address pattern, a type for each address (generic role inbox vs. personal), and a confidence score, which lets me skip probable misses. Got 9 additional emails this way, 7 with confidence ≥ 85%.
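
To make the confidence cutoff concrete, here is a sketch that picks one address for a named guest out of a Domain Search payload. The payload shape (an "emails" list whose items carry value, type, confidence, first_name, and last_name) is an assumption about Hunter's response format, and pick_hunter_email is a hypothetical helper.

```python
def pick_hunter_email(domain_data, first_name, last_name, min_confidence=85):
    """Return the highest-confidence personal address for this person, or None."""
    candidates = [
        email
        for email in domain_data.get("emails", [])
        if email.get("type") == "personal"  # skip role inboxes like info@
        and (email.get("first_name") or "").lower() == first_name.lower()
        and (email.get("last_name") or "").lower() == last_name.lower()
        and (email.get("confidence") or 0) >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda email: email["confidence"])["value"]
```

Lowering min_confidence recovers more addresses at the cost of more bounces, which is why verification still runs on everything afterwards.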

Layer 3 — Phone enrichment via RocketReach

PDL doesn't reliably return direct dials. For anyone I really want to call, I run RocketReach's lookup endpoint. It found direct dials for 31 of my 54 enriched contacts (57%). In my experience podcast guests pick up at a higher rate than cold-list contacts, so direct-dial coverage matters here.

Layer 4 — Email verification via ZeroBounce

Before anything goes to sequence, every email runs through ZeroBounce. I exclude invalid, catch-all, and spamtrap statuses. This knocked out 6 addresses from my final list but protected my domain reputation.
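
The exclusion rule itself is a small filter once the verification results are back. This sketch assumes you have already collected a status string per email address (keyed by lowercase address) and that the strings match the three statuses named above. Dropping addresses with no result at all is my own conservative choice.

```python
# Statuses excluded in the text; string values assumed to match the verifier's output.
BLOCKED_STATUSES = {"invalid", "catch-all", "spamtrap"}


def sendable_contacts(contacts, status_by_email):
    """Keep contacts whose email was verified and is not in a blocked status."""
    kept = []
    for contact in contacts:
        status = status_by_email.get(contact["email"].lower())
        # No status means the address was never verified, so it is dropped too.
        if status is not None and status not in BLOCKED_STATUSES:
            kept.append(contact)
    return kept
```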

Final result: 48 verified, enriched contacts from the 61 ICP-filtered guests.
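
The email part of the waterfall reduces to "try each source in order and stop at the first hit". A minimal sketch, with each provider passed in as a plain function so the ordering logic can be tested without API keys. run_waterfall and the stub lookups in the usage note are hypothetical names, not part of any provider's SDK.

```python
def run_waterfall(guests, lookups):
    """Try each (source_name, lookup) in order per guest; stop at the first email.

    Returns (resolved, unresolved), where resolved entries record which
    source produced the address.
    """
    resolved, unresolved = [], []
    for guest in guests:
        for source, lookup in lookups:
            email = lookup(guest)
            if email:
                resolved.append({**guest, "email": email, "source": source})
                break
        else:
            # No lookup returned an address for this guest.
            unresolved.append(guest)
    return resolved, unresolved
```

In use, the lookups list would be something like [("pdl", pdl_lookup), ("hunter", hunter_lookup)], where each function wraps one provider call. Tracking the source per record is what lets you report coverage per layer, as in the numbers above.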

Enrichment Tool Comparison for Podcast Guest Lists

Not all enrichment tools handle this workflow equally. The challenge is name + company resolution (rather than email-domain lookup), which some tools handle badly.

Tool	Best role in this workflow	Email coverage	Direct dial	Pricing
PDL	Layer 1: identity resolution	73%	Partial	Per-API-call
Hunter.io	Layer 2: domain fallback	55% on gaps	No	$49+/mo
Clay	Waterfall orchestration	80–90% (combined)	Via connectors	Credit-based
Apollo.io	Quick self-serve lookup	65–70%	Partial	$49+/mo
RocketReach	Phone enrichment	57% (direct dial)	Yes	$80+/mo
Lusha	European contacts	50–60% (EU)	Yes	$29+/mo
ZeroBounce	Email verification	—	No	Pay-per-verification

Clay can replace layers 1–3 if you want a single interface—it runs PDL, Hunter.io, Apollo.io, and others in sequence automatically. The tradeoff: the credit system is genuinely unpredictable when testing, and failed enrichments still cost credits. I use Clay for ongoing automated workflows, and direct API calls when I'm prototyping or validating a new persona type.

The n8n Automation That Ties It Together

Once I validated the workflow manually, I rebuilt it in n8n so it runs on a weekly schedule.

The flow:

Schedule trigger → HTTP Request node → Listen Notes API (/api/v2/search with a 7-day pub_date window)
Code node → parse description for guest names, titles, and company mentions; filter by title seniority regex
Google Sheets node → append to a "raw_guests" sheet if not already present (dedup by name + company)
Wait node (set to resume on a webhook call) → pause for manual ICP review (I use a simple Airtable form to approve/reject)
HTTP Request node → PDL enrichment for approved records
IF node → if PDL returns no email, fork to Hunter.io domain search
HTTP Request node → ZeroBounce verification on all returned emails
HTTP Request node → RocketReach for contacts flagged as "high priority"
HubSpot / Salesforce / Outreach node → push verified records to the outbound sequence

Total runtime for a weekly batch of 40–50 guests: about 8 minutes of compute. Manual time: 20 minutes for the ICP review step—which I keep manual on purpose, because that's where bad data compounds into wasted SDR cycles.
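
The "dedup by name + company" step in the flow above is worth spelling out, because raw strings from descriptions rarely match exactly. A sketch of the normalization, shown in Python for consistency with the other examples. Inside n8n you would more likely write the equivalent in the Code node's JavaScript, and both function names are hypothetical.

```python
import re


def dedup_key(name, company):
    """Build a case- and punctuation-insensitive key from name + company."""
    def normalize(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return f"{normalize(name)}|{normalize(company)}"


def new_guests(candidates, existing_keys):
    """Return candidates whose key is not already in the sheet or in this batch."""
    seen = set(existing_keys)
    fresh = []
    for guest in candidates:
        key = dedup_key(guest["name"], guest["company"])
        if key not in seen:
            seen.add(key)
            fresh.append(guest)
    return fresh
```

This treats "ACME, Inc." and "Acme Inc" as the same company but will not merge "Acme" with "Acme Inc", so some duplicates still reach the manual review step.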

n8n has no native Listen Notes node; you call the API from a generic HTTP Request node. That's fine, since the API is straightforward. What takes time is getting the Code node's regex to reliably extract guest names. I've gone through four iterations. It's still not perfect on shows that format their descriptions as run-on paragraphs.

What I Actually Use

For this specific workflow—podcast guest identification through to enriched, sequenced contact—my current stack is:

Listen Notes for source data ($99/mo Starter API tier)
PDL as the primary enrichment layer (~$0.10–$0.15 per resolved record at my volume)
Hunter.io for fallback email discovery
ZeroBounce for verification ($16 for 2,000 credits—that's 2–3 months of supply at my volume)
n8n (self-hosted) to automate the pipeline

For Twitter and Facebook guest cross-referencing—when the podcast description mentions a guest's social handle and I want to pull richer profile data—Ziwa has been faster for me than PDL's direct API, specifically for social handle-based lookups rather than professional identity resolution.

Total monthly cost runs $180–220 for 150–200 enriched contacts. The cost per meeting booked from this channel is lower than any other outbound approach I've run, primarily because the podcast context does most of the personalization work for each conversation.
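
For a rough per-contact figure, the two ranges above can be combined into best and worst cases. This is only arithmetic on the numbers already quoted; the function name is hypothetical.

```python
def cost_per_contact_range(monthly_cost_range, contacts_range):
    """Return (best_case, worst_case) cost per enriched contact."""
    low_cost, high_cost = monthly_cost_range
    low_contacts, high_contacts = contacts_range
    # Best case: lowest spend spread over the most contacts, and vice versa.
    return low_cost / high_contacts, high_cost / low_contacts


best, worst = cost_per_contact_range((180, 220), (150, 200))
print(f"${best:.2f} to ${worst:.2f} per enriched contact")
```

That works out to somewhere between about $0.90 and $1.47 per enriched contact, before counting the manual review time.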

The one gap I haven't fully solved: guest name extraction from raw episode descriptions is still fragile. Podcasts are inconsistent: some list guests as "featuring John Smith, VP of Engineering at Acme", some bury it in paragraph three, some only mention the guest's first name. In my testing so far, an LLM extraction step handles that variance better than regex, but I haven't moved it to production yet.