I want to tell you something that most "how to start a podcast" guides skip: the gear and software questions are mostly solved now. The hard part isn't technical. The hard part is making something worth listening to.
But the technical part has gotten genuinely easier, and AI has made it cheaper in ways that would have seemed impossible five years ago. So let's talk through the actual workflow -- gear, recording, editing, AI-generated intros, show notes, and distribution -- with both a budget path and a pro path.
I've been doing video content for years and have spent time with podcast workflows. This guide reflects what actually works, not what looks good in a feature matrix.
Step 1: Record Clean Audio
Every AI editing tool in existence works better when it starts with cleaner source material. Garbage in, less garbage out -- the AI tools can clean up a lot, but they can't fix everything, and the better your raw recording, the better your results will be at every stage after.
The gear basics:
You don't need to spend much. The difference between $50 gear and $500 gear is real but smaller than most gear guides make it sound. The difference between good technique and bad technique is bigger than any gear decision.
My recommendation for starting out: the Audio-Technica ATR2100x at around $100. USB and XLR connectivity. Cardioid pattern that rejects room noise from the sides and back. Built-in headphone monitoring. It's what I'd buy if I were starting over today.
One step up: the Rode PodMic around $100-120. Better rejection of room noise. Slightly warmer sound that most people prefer for voice. Worth the extra $20 for the mic itself, but the standard PodMic is XLR-only, so budget for an audio interface too.
Both of these are dramatically better than recording on a laptop microphone or phone. The ATR2100x needs no audio interface when you record over USB.
The room:
This matters more than the microphone. A $50 microphone in a treated room sounds better than a $300 microphone in an echoey concrete office.
You don't need professional acoustic panels. You need: carpet or rugs on hard floors, soft furniture, closed doors, and a position where you're not directly between two parallel walls. Recording in a closet full of clothes is genuinely one of the better low-budget acoustic solutions because the clothes absorb reflections.
Speak close to the microphone -- 6 to 8 inches -- and slightly off-axis (speak past the mic, not directly at the capsule). This reduces plosives (the hard P and B sounds that make some recordings boom).
Step 2: AI-Powered Editing with Descript
This is where AI changes everything for podcasters.
Descript is a podcast and video editor that works differently from every traditional DAW or editing app. Instead of editing audio waveforms -- cutting and moving colored blobs on a timeline -- you edit a transcript. You read through the auto-generated transcript of your recording and delete the words you don't want. The audio follows.
For podcasters, this is transformative. Most podcast editing is removing: removing "um"s and "uh"s, removing stumbles and re-takes, removing the pauses where the host lost their train of thought. Descript's AI auto-detects filler words and lets you remove them all with one click. It auto-detects pauses longer than a set threshold and tightens them. You can literally clean up an entire podcast episode by reading and deleting, which is faster than traditional audio editing by a significant margin.
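To make the "edit the transcript, the audio follows" idea concrete, here is a minimal sketch of the concept. It is not Descript's actual implementation. It assumes a transcript where every word carries start and end timestamps, and the `Word` class, `FILLERS` set, and `keep_ranges` function are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical filler list; a real editor lets you pick which words to target.
FILLERS = {"um", "uh"}


@dataclass
class Word:
    text: str
    start: float  # seconds into the recording
    end: float    # seconds into the recording


def keep_ranges(words, fillers=FILLERS):
    """Return merged (start, end) audio ranges for the words that survive."""
    ranges = []
    for w in words:
        if w.text.lower().strip(".,!?") in fillers:
            continue  # deleting the word drops its slice of audio
        if ranges and abs(ranges[-1][1] - w.start) < 1e-9:
            # Word starts exactly where the last kept range ends: extend it.
            ranges[-1] = (ranges[-1][0], w.end)
        else:
            ranges.append((w.start, w.end))
    return ranges


words = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.8),
    Word("welcome", 0.8, 1.4),
    Word("back", 1.4, 1.8),
]
print(keep_ranges(words))  # [(0.0, 0.3), (0.8, 1.8)]
```

The editor only has to play back the kept ranges in order, which is why deleting text is enough to cut audio.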
Getting started with Descript:
Import your audio recording. Descript accepts MP3, WAV, M4A, and most other common formats. Drag and drop into a new project.
Wait for transcription. Takes 2-5 minutes for a standard episode. The free tier includes 1 hour of transcription per month. Creator tier ($24/month) gives you 10 hours.
Read through the transcript. Correct any transcription errors (the accuracy is good but not perfect, especially on names and jargon).
Remove filler words. Action → Remove Filler Words. Select which filler words to remove ("um," "uh," "like," etc.) and how many to target. Preview before applying.
Tighten pauses. Action → Shorten Silences. Set a maximum silence length (I use 0.8 seconds as a starting point) and let it compress all the dead air.
Clean up transcript. Highlight and delete any section you want removed -- a stumble, a section that didn't land, an off-topic tangent. The audio disappears with the text.
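If you're curious what "shorten silences" amounts to, the sketch below shows the arithmetic under simple assumptions. It takes a list of speech segments (start and end times in seconds) and caps every gap between them at a maximum length. The function name and the segment format are my own illustration, not anything Descript exposes.

```python
def shorten_silences(segments, max_silence=0.8):
    """Re-time speech segments so no gap between them exceeds max_silence.

    segments: ordered list of (start, end) times in seconds where speech occurs.
    Returns the same segments placed on the tightened timeline.
    """
    tightened = []
    shift = 0.0      # total seconds of dead air removed so far
    prev_end = None  # end of the previous segment on the original timeline
    for start, end in segments:
        if prev_end is not None:
            gap = start - prev_end
            if gap > max_silence:
                shift += gap - max_silence  # trim the gap down to the cap
        tightened.append((round(start - shift, 3), round(end - shift, 3)))
        prev_end = end
    return tightened


speech = [(0.0, 4.0), (4.5, 9.0), (12.0, 15.0)]
print(shorten_silences(speech))  # [(0.0, 4.0), (4.5, 9.0), (9.8, 12.8)]
```

The half-second pause is left alone because it is under the 0.8-second cap, while the 3-second pause is cut to 0.8 seconds, which pulls everything after it 2.2 seconds earlier.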
Noise removal: Descript's built-in noise removal is solid for basic background noise (air conditioning, room hum). For more aggressive noise removal or audio quality issues, the next tool helps.
Adobe Podcast Enhance
Adobe offers a free web tool called Adobe Podcast Enhance (labeled Enhance Speech on the Adobe Podcast site; not to be confused with Mic Check, a separate Adobe Podcast tool that tests your microphone setup) that takes uploaded audio and runs it through AI processing to clean it up. The results are genuinely impressive for a free tool -- it removes background noise, reduces reverb, and makes voice recordings sound more like a professional studio environment.
The workflow: export your rough-edited audio from Descript, run it through Adobe Podcast Enhance, get a cleaned file back. Takes a few minutes. Free.
For the budget path, this is the best audio polish available without spending anything.
For the pro path, tools like iZotope RX do more detailed and controllable noise work, but cost significantly more and have a steeper learning curve. For most podcasters, Adobe Podcast Enhance is enough.
Step 3: AI Voice for Intros and Outros
If your podcast has a professional-sounding intro and outro -- the kind with a narrator reading something like "Welcome to The Creative Brief, with Ray Whitfield" -- you have a few options.
Record it yourself. Hire a voice actor. Or use AI voice.
ElevenLabs at $22/month gives you 100,000 characters of AI voice per month with commercial licensing. More importantly, it lets you clone your own voice -- so your intro and outro sound like you, even if they were generated, not recorded.
Clone-yourself workflow:
- Record 1-5 minutes of yourself speaking clearly and naturally
- Upload to ElevenLabs, create an Instant Voice Clone
- Type your intro script, generate with your cloned voice
- Drop the generated audio into Descript before exporting
Murf AI is the alternative if you want a more structured production workflow -- they have music and intro production built into their platform. See our full Murf AI review for details.
For the budget path: the ElevenLabs free tier (10,000 characters/month) is enough to generate intro and outro scripts without paying anything, as long as your scripts are short. Just note the free tier is personal use -- for commercial podcasting, you'd need a paid plan.
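A rough way to check whether your scripts fit that allowance is to count characters. The sketch below assumes the service counts characters roughly the way Python's `len()` does, which may not match its billing exactly, and the outro text and function name are made up for the example.

```python
FREE_TIER_CHARS = 10_000  # monthly free-tier allowance quoted above


def monthly_characters(intro, outro, episodes_per_month):
    """Characters used if the intro and outro are regenerated for every episode."""
    return (len(intro) + len(outro)) * episodes_per_month


intro = "Welcome to The Creative Brief, with Ray Whitfield."
outro = "Thanks for listening. See you next week."  # hypothetical outro script

used = monthly_characters(intro, outro, episodes_per_month=4)
print(f"{used} of {FREE_TIER_CHARS} characters, fits: {used <= FREE_TIER_CHARS}")
```

Short scripts like these use a small fraction of the allowance, and if you generate the intro once and reuse the file, later episodes cost nothing.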
Step 4: Show Notes and Transcripts with AI
Descript auto-generates transcripts as part of the editing process -- this is the same transcript you edited in Step 2. When you're done editing, you have a clean transcript ready to publish as show notes or episode accessibility text.
For polished show notes (summary, key points, chapter markers), you can take the Descript transcript and run it through any AI writing tool. Paste the transcript, ask for a 200-word summary and five key takeaways, and you have show notes.
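If you do this every episode, it helps to keep the request consistent. Here is a small sketch that assembles the prompt text from a transcript. The wording and the `build_show_notes_prompt` name are just my illustration; you would paste the result into whichever AI writing tool you use.

```python
def build_show_notes_prompt(transcript, summary_words=200, takeaways=5):
    """Assemble a reusable show-notes request around an episode transcript."""
    return (
        f"Write a {summary_words}-word summary of this podcast episode, "
        f"followed by {takeaways} key takeaways as a bulleted list.\n\n"
        f"Transcript:\n{transcript.strip()}"
    )


prompt = build_show_notes_prompt("  Welcome back to the show. Today we cover mics.  ")
print(prompt)
```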
This workflow -- record, edit in Descript, clean with Adobe, generate show notes from the transcript -- produces professional podcast output in significantly less time than traditional editing.
Step 5: Distribution Basics
A podcast is just an RSS feed. You create an audio file, upload it somewhere that generates an RSS feed, and submit that feed to Apple Podcasts, Spotify, and other directories.
Budget path: Buzzsprout free tier. Hosts up to 2 hours of audio per month free, with episodes expiring after 90 days. Enough to start without spending anything.
Paid step-up: Buzzsprout paid plans start at $12/month for unlimited episode lifetime and more storage. Transistor at $19/month is another strong option with better analytics and private podcast support.
Both Buzzsprout and Transistor generate the RSS feed and handle directory distribution with simple submission tools.
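To see what "just an RSS feed" means, here is a minimal sketch that builds a bare-bones RSS 2.0 feed with one episode using Python's standard library. The show details, URLs, and file size are placeholders, and real directories expect more than this (Apple Podcasts, for example, requires additional iTunes-specific tags), which is exactly the work your host does for you.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(show_title, show_link, description, episodes):
    """Return a minimal RSS 2.0 document as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = show_title
    ET.SubElement(channel, "link").text = show_link
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["audio_url"]
        # RSS dates use the RFC 822 style that format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure element is what points podcast apps at the audio file.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


# Placeholder episode data for illustration only.
episodes = [{
    "title": "Episode 1",
    "audio_url": "https://example.com/audio/episode-1.mp3",
    "size_bytes": 12_345_678,
    "published": datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc),
}]
feed_xml = build_feed(
    "The Creative Brief", "https://example.com", "A show about creative work.", episodes
)
print(feed_xml)
```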
Complete Budget Path (Free or Nearly Free)
- Recording: Any decent USB microphone you already have, or the ATR2100x ($100 one-time)
- Editing: Descript free tier (1 hour transcription/month)
- Noise removal: Adobe Podcast Enhance (free)
- AI intro voice: ElevenLabs free tier (10,000 chars/month, personal use)
- Distribution: Buzzsprout free tier
- Total ongoing cost: $0/month if you stay in free tiers
This path works. It produces a real podcast. The limitations (transcription hours, episode storage, commercial rights for AI voice) are real but manageable when starting out.
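Here is a quick sanity check on those limits, using the free-tier numbers quoted above. It assumes the audio you transcribe is about as long as the finished episode (raw recordings usually run longer), and the names are made up for the example.

```python
# Monthly free-tier limits from this guide, in minutes.
FREE_LIMITS_MIN = {
    "Descript transcription": 60,  # 1 hour per month
    "Buzzsprout upload": 120,      # 2 hours per month
}


def fits_free_tiers(episode_minutes, episodes_per_month):
    """Report which free-tier limits a publishing schedule stays within."""
    total = episode_minutes * episodes_per_month
    return {name: total <= limit for name, limit in FREE_LIMITS_MIN.items()}


print(fits_free_tiers(episode_minutes=25, episodes_per_month=4))
# {'Descript transcription': False, 'Buzzsprout upload': True}
```

For a weekly 25-minute show, transcription is the limit you hit first, so that is usually the first thing worth paying for.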
Complete Pro Path
- Recording: Rode PodMic + Focusrite Scarlett Solo audio interface (~$250 one-time)
- Editing: Descript Creator at $24/month
- Voice/intros: ElevenLabs Creator at $22/month (voice cloning, commercial rights)
- Distribution: Buzzsprout or Transistor at $12-19/month
- Total ongoing cost: ~$60/month ($58 with Buzzsprout, $65 with Transistor)
At that budget, you've got a fully professional podcast production setup that would have cost five times as much in time and specialized software five years ago.
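If you want to adjust the pro path for your own mix of tools, the monthly total is simple addition. The sketch below uses the prices listed above and leaves out the one-time gear purchase.

```python
# Monthly subscription prices from the pro path, in US dollars.
subscriptions = {"Descript Creator": 24, "ElevenLabs Creator": 22}
hosting_options = {"Buzzsprout": 12, "Transistor": 19}

base = sum(subscriptions.values())
totals = {host: base + price for host, price in hosting_options.items()}

for host, total in totals.items():
    print(f"With {host}: ${total}/month")
# With Buzzsprout: $58/month
# With Transistor: $65/month
```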