Paul Reaney

Posted on Jun 8

The 'John Smith' problem: detecting podcast guest appearances without false positives

#webdev #showdev #indiehackers #saas

I listen to podcasts because of people, not shows. When a researcher or founder I like goes on someone's podcast, I want that one episode — but I don't want to subscribe to all 400 episodes of every show they might ever appear on.

There's no button for that anywhere. So I built one: GuestVine. You follow people; whenever one of them shows up as a guest on any podcast, that single episode lands in a custom RSS feed you subscribe to once, in whatever player you already use.

The fun part wasn't the web app — it was the detection. "Did this person appear as a guest on this episode?" sounds trivial and absolutely is not. Here's how I built it.

The shape of the system

No new player, no re-hosting audio. The whole thing is RSS in, RSS out:

[Podcast Index] --> [Detection Pipeline] --> [Postgres] --> [Feed Service] --> your RSS URL
                                                  ^
                                         [Control Panel] <--> [you]

The feed items we emit point at the original publisher's audio file. You can play episodes inline on the site or in whatever podcast app you subscribe the feed into, but we never re-host the audio: every enclosure is the publisher's own file, served from their CDN. We just decide what goes in the feed, which means everything hinges on one question being answered correctly, at scale, with no human in the loop.
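To make "RSS out" concrete, here is a minimal sketch of rendering one feed item whose enclosure is passed through untouched. The `Episode` shape and the `renderItem` and `escapeXml` helpers are hypothetical names for illustration, not GuestVine's actual code.

```typescript
// Hypothetical episode shape; field names are assumptions for illustration.
interface Episode {
  guid: string;
  title: string;
  audioUrl: string;   // the publisher's own enclosure URL, never rewritten
  audioBytes: number; // enclosure length in bytes
  audioType: string;  // e.g. "audio/mpeg"
  publishedAt: Date;
}

// Escape the five XML special characters so text is safe in elements and attributes.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render a single RSS <item>. The enclosure URL is the publisher's URL as-is.
function renderItem(ep: Episode): string {
  return [
    "<item>",
    `  <title>${escapeXml(ep.title)}</title>`,
    `  <guid isPermaLink="false">${escapeXml(ep.guid)}</guid>`,
    `  <pubDate>${ep.publishedAt.toUTCString()}</pubDate>`,
    `  <enclosure url="${escapeXml(ep.audioUrl)}" length="${ep.audioBytes}" type="${escapeXml(ep.audioType)}"/>`,
    "</item>",
  ].join("\n");
}
```

The only design point this sketch is meant to show is that the feed service owns the selection of items, while the audio bytes stay on the publisher's side.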

The actual hard problem: "did they appear, or were they just mentioned?"

Say you follow John Smith. I pull candidate episodes from Podcast Index and now have to classify each one. The failure modes are everywhere:

His name is in the title because he's the guest. ✅
His name is in the description because the host mentions him in passing. ❌
His name is in the title of an episode about a different John Smith. ❌
The episode has a structured <podcast:person> tag naming him as guest. ✅

A naive substring match delivers garbage. So detection is three layers: match → score → verify.

Layer 1 — ranked match signals

Not all evidence is equal. I match in priority order and record which signal fired:

export type MatchSignal =
  | "person_tag"          // <podcast:person role="guest"> — structured, strongest
  | "title_guest"         // full name in TITLE + a guest cue ("with", "feat.")
  | "title_plain"         // full name in TITLE, no cue
  | "description_guest"   // full name in DESCRIPTION + guest cue
  | "description_plain";  // full name in DESCRIPTION, no cue (weakest)

The gold standard is the <podcast:person> tag from the podcast namespace — structured metadata where a publisher explicitly says "this person was a guest." When it's present, the guesswork disappears. It usually isn't present, so I fall back to text, and lean on "guest cue" words — with, featuring, ft, joins, sits down with, in conversation with — to separate a guest from a name-drop.
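A rough sketch of how that priority order could be implemented. The `EpisodeText` shape, the `detectSignal` and `containsName` helpers, and the exact cue regex are assumptions for illustration; the post does not show the real matcher.

```typescript
type MatchSignal =
  | "person_tag"
  | "title_guest"
  | "title_plain"
  | "description_guest"
  | "description_plain";

// Hypothetical normalized episode; guestTags holds names from
// <podcast:person> entries whose role is "guest".
interface EpisodeText {
  title: string;
  description: string;
  guestTags: string[];
}

// Guest cue words taken from the list above.
const GUEST_CUE =
  /\b(with|featuring|feat\.?|ft\.?|joins|sits down with|in conversation with)\b/i;

function normalize(s: string): string {
  return s.toLowerCase().replace(/\s+/g, " ").trim();
}

// Whole-name match, so "John Smith" does not fire on "John Smithson".
function containsName(text: string, name: string): boolean {
  const escaped = normalize(name).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`(^|[^a-z0-9])${escaped}($|[^a-z0-9])`).test(normalize(text));
}

// Check signals strongest-first and return the first one that fires.
function detectSignal(ep: EpisodeText, name: string): MatchSignal | null {
  const target = normalize(name);
  if (ep.guestTags.some((g) => normalize(g) === target)) return "person_tag";
  if (containsName(ep.title, name)) {
    return GUEST_CUE.test(ep.title) ? "title_guest" : "title_plain";
  }
  if (containsName(ep.description, name)) {
    return GUEST_CUE.test(ep.description) ? "description_guest" : "description_plain";
  }
  return null;
}
```

Note that this sketch accepts a cue anywhere in the field. A word like "with" is common in long descriptions, so a stricter version would probably require the cue to sit near the name.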

Layer 2 — scoring, and the "John Smith" tax

Each signal has a base confidence:

const SIGNAL_SCORE: Record<MatchSignal, number> = {
  person_tag: 0.98,
  title_guest: 0.9,
  title_plain: 0.6,
  description_guest: 0.55,
  description_plain: 0.3,
};

Then the part I'm fondest of. A name made of two extremely common tokens — "John Smith," "Mike Jones" — is far more likely to be a coincidental match than "Lex Fridman" is. So common names pay a tax:

function commonNamePenalty(name: string): number {
  const tokens = name.toLowerCase().split(/\s+/).filter(Boolean);
  if (tokens.length < 2) return 0;
  const commonCount = tokens.filter((t) => COMMON_TOKENS.has(t)).length;
  if (commonCount >= 2) return 0.2;   // "john smith" — heavy damp
  if (commonCount === 1) return 0.08; // "john fridman" — light damp
  return 0;
}

Crucially, person_tag matches are exempt from the penalty: if a publisher structurally tagged the guest, I trust it regardless of how common the name is. The penalty only applies to the fuzzy text signals, where coincidence is actually possible.
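Putting the base score, the penalty, and the exemption together might look like the sketch below. Two things here are assumptions: the penalty is subtracted from the base score (the post does not say whether it is subtracted or multiplied), and `COMMON_TOKENS` is a tiny stand-in set, since the real list is not shown. `scoreMatch` is a hypothetical name.

```typescript
type MatchSignal =
  | "person_tag"
  | "title_guest"
  | "title_plain"
  | "description_guest"
  | "description_plain";

const SIGNAL_SCORE: Record<MatchSignal, number> = {
  person_tag: 0.98,
  title_guest: 0.9,
  title_plain: 0.6,
  description_guest: 0.55,
  description_plain: 0.3,
};

// Illustrative stand-in only; the real common-token list is not in the post.
const COMMON_TOKENS = new Set(["john", "mike", "smith", "jones"]);

function commonNamePenalty(name: string): number {
  const tokens = name.toLowerCase().split(/\s+/).filter(Boolean);
  if (tokens.length < 2) return 0;
  const commonCount = tokens.filter((t) => COMMON_TOKENS.has(t)).length;
  if (commonCount >= 2) return 0.2;
  if (commonCount === 1) return 0.08;
  return 0;
}

// Assumption: the penalty is subtracted from the base score.
function scoreMatch(signal: MatchSignal, name: string): number {
  const base = SIGNAL_SCORE[signal];
  // Structured tags are exempt from the common-name penalty.
  if (signal === "person_tag") return base;
  return Math.max(0, base - commonNamePenalty(name));
}
```

Under these assumptions, a title_guest match for "John Smith" lands at roughly 0.7, while the same signal for "Lex Fridman" keeps its full 0.9. A person_tag match for "John Smith" stays at 0.98.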

Layer 3 — verify, and "start strict"

Score collapses to three tiers, and the tier decides the action:

type Tier = "A" | "B" | "C";

let tier: Tier;
if (score >= 0.8) tier = "A";       // auto-deliver
else if (score >= 0.4) tier = "B";  // hold for verification
else tier = "C";                    // drop, silently

Tier	Meaning	Action
A	structured tag, or titular guest context	auto-deliver
B	name present but ambiguous	hold; verify before delivering
C	passing mention / low signal	drop

The product decision baked in here: start strict. Only Tier A auto-delivers. A missed appearance is invisible — you just never knew it existed. A wrong appearance is loud and corrosive: it teaches you the feed is junk, and you unsubscribe. For a trust product, precision beats recall every time. I'd rather under-deliver and stay credible.

The Tier-B escape hatch: an LLM as a tie-breaker

Tier B is the interesting middle — real signal, real ambiguity. Rather than drop it, I optionally hand it to an LLM (Claude) with the episode metadata and the person's disambiguating context, and ask one narrow question: is this plausibly this specific person, as a guest? If it promotes the match, it ships; otherwise it stays held.

The key restraint: the LLM is a tie-breaker, not the pipeline. It never sees Tier A (no need) or Tier C (not worth the tokens). It only adjudicates the genuinely ambiguous middle band. That keeps cost bounded and keeps the deterministic scoring in charge of the easy 90%.
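The gating described above can be sketched as a small decision function. The `Candidate` shape, the `Verifier` type, and `decide` are hypothetical names; the verifier is injected as a plain async function so the sketch makes no claims about any particular LLM client API.

```typescript
type Tier = "A" | "B" | "C";
type Decision = "deliver" | "hold" | "drop";

// Hypothetical payload: episode metadata plus disambiguating context.
interface Candidate {
  episodeTitle: string;
  episodeDescription: string;
  personName: string;
  personContext: string; // e.g. "AI researcher, hosts a podcast"
}

// Hypothetical verifier: wraps whatever LLM call is used and resolves
// to true only when it promotes the match.
type Verifier = (c: Candidate) => Promise<boolean>;

async function decide(tier: Tier, c: Candidate, verify?: Verifier): Promise<Decision> {
  if (tier === "A") return "deliver"; // never sent to the verifier
  if (tier === "C") return "drop";    // not worth the tokens
  if (!verify) return "hold";         // verification is optional
  try {
    return (await verify(c)) ? "deliver" : "hold";
  } catch {
    return "hold"; // verifier failure: stay strict and keep holding
  }
}
```

Treating a verifier error the same as "not promoted" is a design choice in this sketch that follows the start-strict rule: a failed call can delay a delivery but cannot cause a wrong one.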

Things that bit me

Unspecified <podcast:person> role defaults to "host," not "guest." Per the spec, a missing role means host. Get this backwards and you deliver every host as if they were a guest — a flood of false positives from the highest-trust signal. Brutal.
Players cache RSS aggressively. "Why isn't my new episode showing up" was almost always the player, not me. Worth knowing before you debug your own feed generator for an hour.
The whole thing is testable without the network. Match and score are pure functions over normalized episode structs, so the test suite runs against recorded fixtures — no API key, no flakiness. The detection logic above is all covered by plain Vitest unit tests, which made tuning the penalties safe.
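The role default from the first point above is easy to pin down in one pure function. A minimal sketch, assuming the tag has already been parsed into a `RawPerson` object; that shape and the `effectiveRole` and `guestNames` helpers are hypothetical names.

```typescript
// Hypothetical parsed form of one <podcast:person> element.
interface RawPerson {
  name: string;
  role?: string; // raw role attribute, if the publisher set one
}

// A missing or empty role means "host"; role values are compared case-insensitively.
function effectiveRole(p: RawPerson): string {
  const role = (p.role ?? "").trim().toLowerCase();
  return role === "" ? "host" : role;
}

// Only people explicitly tagged as guests count for the person_tag signal.
function guestNames(people: RawPerson[]): string[] {
  return people
    .filter((p) => effectiveRole(p) === "guest")
    .map((p) => p.name.trim());
}
```

Because both helpers are pure functions over plain objects, they fit the fixture-based testing approach described in the last point: no network or API key is needed to check them.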

The stack, briefly

Next.js (App Router) for the control panel, API, and RSS serving · Postgres + Prisma for people/feeds/episodes/appearances and the fan-out · passwordless auth (magic link + OTP in one email) · the detection worker above on a cron · Claude for the Tier-B verifier · Vitest for the matching/scoring/feed logic.

Try it

That precision-first detection is the core of GuestVine: follow people, not shows, and their guest appearances land in whatever podcast app you already use. Follow some people, grab your feed URL, and paste it into any podcast app once; guest appearances then arrive on their own. Free for a few follows.

If you try it, the one piece of feedback I'd love: is getting your feed into your podcast app smooth enough? That's the step I'm least sure about.

I'm happy to go deeper on any layer — the namespace parsing, the scoring tuning, or how the RSS fan-out works across multiple feeds per user. Ask in the comments.
