AI + TMDB: 3 Passes to Match Torrent Posters — Prompt Iteration With Real Numbers

#ai #promptengineering #claudecode #tmdb

ShareBox displays shared folders as a Netflix-style grid with TMDB posters. The problem: folder names come from torrents. Naruto.INTEGRALE.MULTI.VFF.1080p.BluRay.x264-AMB3R needs to match "Naruto" on TMDB — not "Naruto Shippuden", not "Naruto the Movie". And Vol 1 must definitely not match "Kill Bill: Volume 1".

Basic regex + TMDB search works for 80% of cases. For the remaining 20%, I built a 3-pass AI pipeline (Claude Haiku via CLI) with a cron every 30 minutes. Here's each pass in detail, the exact prompts, and iterations measured on 290 real entries.

The pipeline: regex first, AI as safety net

The architecture is layered, cheapest to most expensive:

Regex + TMDB (inline, every browse): extract_title_year() cleans the name, searches TMDB, takes the first result with a poster. Free, instant, correct ~80% of the time.
Pass 1: AI extraction (cron --pending): for names where regex failed, send the raw name to Claude Haiku to extract a clean title, then re-search TMDB.
Pass 2: AI verification (cron --verify): send {name, matched TMDB title} pairs to the AI to detect false positives. When a pair is wrong, the AI also suggests a better title.
Pass 3: candidate selection (when pass 2 detects a false positive): search TMDB with the suggested title, get 15 candidates, send the list to AI to pick the right one.
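To make the free first layer concrete, here is a minimal Python sketch of what a cleaner like extract_title_year() might do. The real implementation isn't shown in this post (the json_decode() mention later suggests it's PHP), so the tag list, the regexes and the return shape below are all assumptions for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical list of release tags; a real list would be much longer.
_TAGS = re.compile(
    r"\b(INTEGRALE|MULTI|VFF|VOSTFR|FRENCH|TRUEFRENCH|COMPLETE|"
    r"2160p|1080p|720p|480p|BluRay|WEB[- ]?DL|WEBRip|HDTV|DVDRip|"
    r"x264|x265|HEVC|AAC|AC3|DTS)\b",
    re.IGNORECASE,
)
_YEAR = re.compile(r"\b(19\d{2}|20\d{2})\b")


def extract_title_year(name: str) -> Tuple[str, Optional[int]]:
    """Return (clean title, year or None) from a torrent-style folder name."""
    # Dots and underscores act as word separators in release names.
    name = re.sub(r"[._]+", " ", name)

    year_match = _YEAR.search(name)
    year = int(year_match.group(1)) if year_match else None

    # The title is whatever precedes the first release tag or the year.
    cut = len(name)
    tag_match = _TAGS.search(name)
    if tag_match:
        cut = min(cut, tag_match.start())
    if year_match:
        cut = min(cut, year_match.start())

    return name[:cut].strip(" -()[]"), year
```

On the Naruto folder name from the intro this yields the title "Naruto" with no year. It also shows why a safety net is needed: "Vol 1" passes through untouched, and a title that contains a year-like number would be cut short.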

Pass 1: title extraction — the prompt that skips too much

The first prompt was simple: "extract the proper movie title for a TMDB search." Tested on 290 real names, it produced 72 false skips — the AI considered "Naruto.INTEGRALE", "Pokemon La Series", "Despicable Me COLLECTION" as non-titles and marked them skip=true.

The fix: explicit rules about what to keep vs. skip, a "when in doubt, skip=false" rule, and instructions to translate known English titles to French. Result: 72 → 41 skips. 31 improvements, zero regressions.
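Counting "improvements" and "regressions" between two prompt versions is a simple diff over the benchmark. The sketch below assumes each run is stored as a mapping from folder name to the skip flag the AI returned; that storage format and the function name are hypothetical, not the tooling actually used here.

```python
from typing import Dict, List, Union


def compare_skips(
    before: Dict[str, bool], after: Dict[str, bool]
) -> Dict[str, Union[int, List[str]]]:
    """Diff the skip flags of two prompt versions on the same names."""
    # Only compare names present in both runs.
    names = before.keys() & after.keys()

    # Improvement: skipped by the old prompt, kept by the new one.
    improved = sorted(n for n in names if before[n] and not after[n])
    # Regression: kept by the old prompt, skipped by the new one.
    regressed = sorted(n for n in names if not before[n] and after[n])

    return {
        "skips_before": sum(before[n] for n in names),
        "skips_after": sum(after[n] for n in names),
        "improved": improved,
        "regressed": regressed,
    }
```

Running this after every prompt change gives the same kind of numbers as above: total skips before and after, plus the exact names that flipped in either direction.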

Pass 2: verification — 46 false negatives on seasons

The verification prompt sent {name, TMDB title} pairs and asked for correct: true/false. On 247 entries, it flagged 55 as incorrect. But 46 of those flags were wrong: the matches were actually correct. These are what I call false negatives below.

The AI didn't know that S01 → "Season 1" is a correct match — it's a TMDB season poster, not a generic match. Same for all 34 Simpsons seasons, 11 Walking Dead seasons, 4 Batman seasons.

The fix: a "Special cases — do NOT mark as incorrect" section explaining that season folders matched to season titles are correct, and translations/saga names are fine. Result: 55 → 9 incorrects. All 9 are real problems. Zero false negatives.
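As an illustration of where such a section sits, here is a hedged Python sketch of a verification prompt builder. Only the section title and the two rules come from the description above; the surrounding wording, the JSON field names and the function name are assumptions, not the exact prompt.

```python
import json
from typing import List, Tuple

# Rules paraphrased from the fix described above; wording is illustrative.
SPECIAL_CASES = """\
Special cases - do NOT mark as incorrect:
- Season folders (S01, Season 2, ...) matched to "Season N" are CORRECT.
- Translated titles and saga/collection names are fine.
"""


def build_verify_prompt(pairs: List[Tuple[str, str]]) -> str:
    """Build one prompt for a batch of (folder name, matched TMDB title)."""
    header = (
        "For each pair, decide whether the TMDB title is the right match "
        "for the folder name.\n"
        'Answer with a JSON array of {"i": <index>, "correct": true|false, '
        '"suggestion": <better title or null>}.\n\n'
    )
    # One JSON object per line keeps the batch easy for the model to scan.
    lines = [
        json.dumps({"i": i, "name": name, "tmdb": title}, ensure_ascii=False)
        for i, (name, title) in enumerate(pairs)
    ]
    return header + SPECIAL_CASES + "\n" + "\n".join(lines)
```

Keeping the special cases in their own constant makes each prompt iteration a small diff, which is handy when re-running the benchmark after each change.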

Pass 3: the pick that solves the TMDB problem

When pass 2 detects a false positive and suggests "Naruto" as a better title, we search TMDB. Problem: TMDB returns results by popularity. "Naruto" → Naruto Shippuden (more popular). Taking the first result reproduces the error.

The solution: get 15 TMDB candidates (via multi + tv + movie endpoints), send the full list to AI with the filename for context. The AI picks {"idx": 1} — Naruto (2002), the original series. The word "INTEGRALE" in the filename helps it understand this is the complete series, not a spin-off.
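The merge-and-number step can be sketched in Python as below. It works on already-fetched result lists, so no network call is involved. The field names (id, media_type, name/title, first_air_date/release_date) follow TMDB's search responses; the function names, the assumption that the caller tags tv and movie results with a media_type, and the prompt wording are mine.

```python
from typing import Any, Dict, List

Candidate = Dict[str, Any]


def build_candidates(
    result_sets: List[List[Candidate]], limit: int = 15
) -> List[Candidate]:
    """Merge several TMDB result lists, drop duplicates, cap at `limit`."""
    seen = set()
    merged: List[Candidate] = []
    for results in result_sets:
        for item in results:
            # The same show can come back from multi and from tv.
            key = (item.get("media_type"), item["id"])
            if key in seen:
                continue
            seen.add(key)
            merged.append(item)
    return merged[:limit]


def format_candidates(filename: str, candidates: List[Candidate]) -> str:
    """Render the numbered list that is sent to the AI with the filename."""
    lines = [f"Folder name: {filename}", "Candidates:"]
    for idx, c in enumerate(candidates):
        title = c.get("name") or c.get("title") or "?"
        date = c.get("first_air_date") or c.get("release_date") or ""
        kind = c.get("media_type", "?")
        lines.append(f"{idx}. {title} ({date[:4] or 'n/a'}) [{kind}]")
    lines.append('Reply with {"idx": N} for the best match.')
    return "\n".join(lines)
```

Including the year and media type on each line is what lets the model tell "Naruto (2002)" apart from "Naruto Shippuden (2007)" without any extra lookup.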

A gotcha: Claude sometimes adds explanations after the JSON, breaking parsing. Fix: extract {"idx": N} via regex instead of full JSON parsing.
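A minimal Python sketch of that regex extraction follows. The pattern and the bounds check are my own guess at a reasonable implementation, not the exact code from the project.

```python
import re
from typing import Optional

# Matches {"idx": 3} anywhere in the reply, tolerating extra whitespace.
_IDX = re.compile(r'\{\s*"idx"\s*:\s*(-?\d+)\s*\}')


def parse_idx(reply: str, n_candidates: int) -> Optional[int]:
    """Pull the chosen index out of a reply that may contain extra text."""
    match = _IDX.search(reply)
    if match is None:
        return None
    idx = int(match.group(1))
    # Reject indexes that don't point at a real candidate.
    return idx if 0 <= idx < n_candidates else None
```

This survives code fences and trailing explanations around the JSON, and returning None on anything unexpected lets the caller leave the entry for the next cron run instead of storing a wrong poster.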

Final numbers

| Prompt | Before | After | Improvement |
|---|---|---|---|
| Pass 1 (extraction) | 72 false skips | 41 skips | -43% |
| Pass 2 (verification) | 55 flagged incorrect (46 false negatives) | 9 (all real) | -84% |
| Pass 3 (candidate pick) | 4 parse failures | 0 | -100% |

What I learned

Measure before iterating. Without 290 real entries as a benchmark, I would have iterated blindly. The numbers showed that 84% of the flags raised by pass 2 v1 (46 of 55) were false negatives, which is impossible to see without real data.

Edge cases dominate. 46 out of 55 false negatives came from one pattern: season folders. One line in the prompt ("seasons matched to Season N are CORRECT") eliminated 84% of errors. The 80/20 rule applies to prompts too.

Parsing matters as much as the prompt. A perfect prompt is useless if parsing breaks. The AI adds text, code fences, explanations. Extracting the one field you need with a regex is more robust than running json_decode() on the whole reply.

Layered architecture reduces costs. Free regex handles 80%. AI only runs on the remaining 20%. Pass 3 (the most expensive) only fires when pass 2 detects a problem — 9 times out of 290 entries.

The best prompt isn't the one with the most instructions — it's the one that precisely describes edge cases. "When in doubt, skip=false" and "seasons are CORRECT" are worth more than 20 lines of generic rules.