Jmcraft

Posted on Mar 8

How to Convert Apple Podcasts Episodes to Text — Free, No Setup

#ai #podcast #productivity #tutorial

Audio Is Great for Listening. Terrible for Everything Else.

You just heard an incredible insight on an Apple Podcasts episode. Now you need to quote it in a blog post, pull it into show notes, or reference it in a report. Your options: re-listen and type it out manually, or… don't.

Manual transcription runs about 4x real-time. That's 4 hours of typing for a 1-hour episode. For anyone who works with podcast content regularly — creators, marketers, journalists, researchers — this is an unacceptable bottleneck.

Vocova eliminates it. Paste an Apple Podcasts episode link, and the AI returns a complete transcript with speaker labels, timestamps, and export options in multiple formats. Free, browser-based, no Apple account needed.

What Vocova Does

Vocova is a browser-based AI transcription tool optimized for podcast audio. Here's the feature set:

99%+ accuracy on well-recorded conversational audio — interviews, panels, solo shows
Automatic speaker diarization — labels each voice separately, distinguishing hosts from guests
100+ language auto-detection — no manual language selection
Timestamped segments — every line maps to a precise point in the original audio
No duration limits — 5-minute clips to 3-hour marathons, all handled
In-browser editing — fix proper nouns, jargon, or formatting before export
Keyword search — find specific topics within the transcript instantly
Export: TXT, DOCX, PDF, SRT, VTT
Privacy-first — recordings are not stored or shared

3 Steps: Apple Podcasts Episode → Text

Step 1: Copy the Episode Link

From Apple Podcasts (Mac, iOS, or podcasts.apple.com):

App: Tap the ··· menu next to the episode → Copy Link
Web: Navigate to the episode on podcasts.apple.com and copy the URL from the address bar

Any public episode works. No Apple account required on your end.
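If you script around this step, it can help to sanity-check the copied link before pasting it. The sketch below assumes the commonly seen link shape `https://podcasts.apple.com/<country>/podcast/<slug>/id<show-id>?i=<episode-id>`. That shape and the helper name `parse_episode_link` are assumptions for illustration, not something Vocova requires.

```python
import re
from urllib.parse import urlparse, parse_qs


def parse_episode_link(url: str) -> dict:
    """Pull the show ID and episode ID out of an Apple Podcasts link.

    Assumes a link shaped like
    https://podcasts.apple.com/us/podcast/some-show/id123456789?i=1000123456789
    where `id<digits>` is the show and the `i` query parameter is the episode.
    """
    parts = urlparse(url)
    if parts.netloc != "podcasts.apple.com":
        raise ValueError("not a podcasts.apple.com link")

    # The show ID is expected as the final path segment: "id" followed by digits.
    match = re.search(r"/id(\d+)/?$", parts.path)
    if match is None:
        raise ValueError("no show ID found in the link path")

    # The episode ID sits in the `i` query parameter; show-level links lack it.
    episode_ids = parse_qs(parts.query).get("i", [])
    return {
        "show_id": match.group(1),
        "episode_id": episode_ids[0] if episode_ids else None,
    }
```

A result with `episode_id` set to `None` suggests you copied the show link rather than a specific episode link.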

Step 2: Paste and Transcribe

Go to vocova.app, paste the link, and start transcription. Vocova auto-detects the Apple Podcasts format, extracts the audio, and runs speech recognition with speaker detection.

A 45-minute episode typically processes in a few minutes.

Step 3: Search, Edit, Export

The transcript loads on screen with speaker labels and clickable timestamps. From here:

Search for keywords to jump to relevant sections
Edit any line inline — useful for technical terms, brand names, proper nouns
Export as:
- TXT — for drafts, show notes, LLM input
- DOCX — for collaborative editing
- PDF — for archives and deliverables
- SRT / VTT — for video podcast subtitles
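
To see what the subtitle formats actually contain, here is a minimal sketch that renders timestamped, speaker-labeled segments as SRT. The segment shape (`start`, `end`, `speaker`, `text`) is a hypothetical one chosen for illustration. It is not Vocova's export format or API.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT puts a comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues: index, time range, text, blank line."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        text = f"{seg['speaker']}: {seg['text']}"
        cues.append(f"{index}\n{time_range}\n{text}\n")
    # Joining with "\n" leaves exactly one blank line between cues.
    return "\n".join(cues)


# Hypothetical sample data, not taken from a real episode.
segments = [
    {"start": 0.0, "end": 2.5, "speaker": "Host", "text": "Welcome back."},
    {"start": 2.5, "end": 5.0, "speaker": "Guest", "text": "Thanks for having me."},
]
print(to_srt(segments))
```

VTT is close to the same layout. The main differences are a `WEBVTT` header line at the top of the file and a period instead of a comma before the milliseconds.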

Use Cases That Actually Matter

Show Notes That Write Themselves

Stop writing show notes from memory. The transcript gives you every discussion point, guest quote, resource mention, and topic transition, each with a timestamp. Extract what you need, format it, publish. The result is more accurate, more detailed, and far faster than reconstructing the episode from memory.

One Episode → A Week of Content

A 60-minute interview transcript is raw material for 4–5 blog posts, a newsletter, a tweet thread, and a set of pull quotes for social. Export as TXT, drop it into your editor or feed it to an LLM for summarization and reformatting. The content pipeline starts with text.
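
A long transcript may not fit into a single LLM prompt, so a common first step is to split the TXT export into chunks. The sketch below is one simple way to do that, breaking on line boundaries so speaker turns stay intact. The function name and the 8,000-character default are arbitrary assumptions. Pick a budget that suits whatever model you use.

```python
def chunk_transcript(text: str, max_chars: int = 8000) -> list[str]:
    """Split a transcript into chunks of at most max_chars, breaking between lines.

    A single line longer than max_chars is kept whole rather than cut mid-sentence.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # +1 accounts for the newline that joins this line to the chunk.
        added = len(line) + (1 if current else 0)
        if current and current_len + added > max_chars:
            chunks.append("\n".join(current))
            current, current_len = [], 0
            added = len(line)
        current.append(line)
        current_len += added
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Each chunk can then be summarized on its own, with the partial summaries combined in a final pass.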

SEO: Make Google Index Your Podcast

Search engines primarily index text, not audio. Unless you publish a transcript, most of what is said in your episode is invisible to search. Full-text transcripts on episode pages expose hundreds of long-tail keywords to organic search, which makes this one of the highest-ROI SEO tactics available to podcasters.

Accurate Quotes for Promotion

Promoting an episode on social? Search the transcript for the most compelling moments. Copy the exact quote and pair it with an audiogram clipped at the matching timestamp. No more paraphrasing from memory.

Searchable Episode Library

After 100+ episodes, your podcast is a knowledge base — but only if you can search it. Transcripts make every episode text-searchable by keyword. Find the exact episode and timestamp where a topic was discussed, without re-listening to anything.
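
Once transcripts exist as data, searching the whole back catalog takes only a few lines. This is a minimal sketch that assumes each episode's transcript has been loaded as a list of segments with `start` (seconds) and `text` keys. That layout and the `search_library` name are hypothetical, chosen for illustration.

```python
def search_library(
    library: dict[str, list[dict]], keyword: str
) -> list[tuple[str, float, str]]:
    """Return (episode, start_seconds, text) for every segment mentioning keyword."""
    needle = keyword.casefold()  # case-insensitive match
    hits = []
    for episode, segments in library.items():
        for seg in segments:
            if needle in seg["text"].casefold():
                hits.append((episode, seg["start"], seg["text"]))
    return hits


# Hypothetical sample data, not taken from real episodes.
library = {
    "ep-041": [{"start": 12.0, "text": "Let's talk about pricing strategy."}],
    "ep-042": [
        {"start": 5.0, "text": "Welcome to the show."},
        {"start": 93.5, "text": "Pricing came up again last week."},
    ],
}
for episode, start, text in search_library(library, "pricing"):
    print(f"{episode} @ {start:.0f}s: {text}")
```

For a large library, a proper full-text index would scale better than this linear scan, but the idea is the same: keyword in, episode and timestamp out.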

Accessibility

~430 million people globally have disabling hearing loss. Transcripts make your content accessible to this audience, plus non-native speakers and anyone consuming content in quiet environments. It's a best practice and increasingly a compliance requirement.

What Episodes Work?

Any publicly available episode on podcasts.apple.com:

Interview and conversation shows
Solo host monologues
Panel discussions and roundtables
News and analysis podcasts
Educational and lecture-format content
Narrative and storytelling shows
Episodes in any of 100+ supported languages

Private or subscriber-only episodes without public URLs are not supported.

Vocova vs. DIY Transcription Pipelines

If you're technically inclined, you might consider assembling your own pipeline with Whisper + pyannote + ffmpeg. Here's the trade-off:

Setup: Vocova = open a browser tab. DIY = install Python, ffmpeg, download models, configure GPU, write glue code for speaker diarization
Speaker labels: Vocova includes diarization natively. DIY requires integrating pyannote or WhisperX separately
Subtitle export: Vocova outputs SRT/VTT directly. DIY requires post-processing Whisper's raw output
Editing: Vocova has inline editing in the UI. DIY = edit a text file
Long episodes: Vocova processes server-side with no local resource constraints. DIY needs sufficient RAM/VRAM

DIY wins for offline use, bulk automation, and custom integration. Vocova wins for everything else — especially when you just want a transcript and don't want to maintain a pipeline.
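
To give a sense of the glue code the DIY route involves, here is a rough sketch. `transcribe` uses the open-source `openai-whisper` package (which needs ffmpeg installed), and `assign_speakers` labels each segment with the speaker whose turn overlaps it most. The speaker turns are assumed to arrive as `(start, end, speaker)` tuples from a diarization tool such as pyannote. That tuple shape and both function names are assumptions for illustration, and this is not how Vocova works internally.

```python
def transcribe(path: str, model_name: str = "base") -> list[dict]:
    """Run Whisper locally and return its timestamped segments."""
    import whisper  # imported lazily so the merge helper below works without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return [
        {"start": s["start"], "end": s["end"], "text": s["text"].strip()}
        for s in result["segments"]
    ]


def assign_speakers(
    segments: list[dict], turns: list[tuple[float, float, str]]
) -> list[dict]:
    """Label each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn_start, turn_end, speaker in turns:
            # Overlap of two intervals; negative or zero means they do not intersect.
            overlap = min(seg["end"], turn_end) - max(seg["start"], turn_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append({**seg, "speaker": best_speaker})
    return labeled
```

Even this small sketch leaves out model downloads, GPU configuration, and subtitle formatting, which is the maintenance cost the comparison above describes.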

Tips for Best Results

Audio quality is the biggest variable. Professionally produced Apple Podcasts shows (most of them) yield near-perfect results. Lo-fi recordings with background noise may need a few edits.
Check speaker labels on 3+ person episodes. Two-speaker detection is highly reliable. With more voices, a quick scan ensures correct attribution.
Search before you export. If you only need a specific segment, use the in-browser search to find it — faster than scanning an exported document.
Match export format to use case. TXT for content drafts. DOCX for team collaboration. SRT/VTT for video subtitles. PDF for archives.
Make it routine. Transcribe every episode as part of post-production. The SEO and content benefits compound as your transcript library grows.

Bottom Line

Apple Podcasts content is too valuable to leave locked in audio. Transcription turns episodes into searchable, quotable, repurposable text — and Vocova makes it a paste-and-click operation.

Free. Browser-based. 100+ languages. Speaker detection. No setup.

Give it a try: 👉 https://vocova.app/

FAQ

Is Vocova free for Apple Podcasts transcription?
Yes. Vocova offers free transcription for any public Apple Podcasts episode. No account, no credit card, no per-minute charges. Paste an episode link at vocova.app and get a full speaker-labeled, timestamped transcript.

How accurate is the transcription?
Vocova achieves 99%+ accuracy on well-recorded podcast episodes with clear audio. It handles multiple speakers, various accents, and conversational speech reliably. An inline editor is available for correcting specialized terms or proper nouns after processing.

Does it detect different speakers?
Yes. Vocova includes automatic speaker diarization that identifies and labels each voice in the episode. It separates host dialogue from guest dialogue, making transcripts easy to follow and accurate to quote — especially useful for interview-format shows.

What export formats are available?
Five: TXT (plain text), DOCX (Word document), PDF (print-ready), SRT (SubRip subtitles), and VTT (WebVTT). SRT and VTT include precise timestamps and can be uploaded directly as subtitles to YouTube, web video players, or video editing software.

Can it handle long episodes?
Yes. Vocova has no duration limits, so it processes episodes of any length, from short segments to multi-hour interviews. Processing time scales with episode length; hour-long episodes typically complete in a few minutes. The workflow is fully automatic once you paste the link.
