Hey devs! I recently shipped an AI-powered auto-dubbing web service built almost entirely with the help of coding agents (Claude Code + Google Gemini), and I wanted to share what I learned along the way.
https://github.com/user-attachments/assets/e77a38d9-8ecb-4010-aff0-bfc8d9d80cf3
🐙 GitHub: https://github.com/edwardyun12/ai-dubbing-web-app
📌 What Does It Do?
Upload any audio or video file → get it dubbed in your target language automatically.
Here's the full pipeline:
1. Upload file → transcribe speech via ElevenLabs API
2. Transcription → translate with Google Translate API
3. Pick an AI voice for the target language (male/female, calm/energetic, etc.)
4. Synthesize dubbed audio with ElevenLabs TTS
5. Play back and download the result
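To make the flow concrete, here's a minimal sketch of how the transcribe → translate → synthesize steps could be chained. This is not the repo's actual code: `DubbingServices`, `DubbingRequest`, `DubbingResult`, and `dubMedia` are hypothetical names, the real ElevenLabs and Google Translate calls are hidden behind an interface you would implement yourself, and raw bytes (`Uint8Array`) stand in for the uploaded file and the dubbed audio.

```typescript
// Hypothetical interface: each method wraps one external service from the post.
interface DubbingServices {
  // Speech-to-text (ElevenLabs in the post).
  transcribe(media: Uint8Array): Promise<string>;
  // Text translation (Google Translate API in the post).
  translate(text: string, targetLang: string): Promise<string>;
  // Text-to-speech with the voice the user picked (ElevenLabs TTS in the post).
  synthesize(text: string, voiceId: string): Promise<Uint8Array>;
}

interface DubbingRequest {
  media: Uint8Array; // raw bytes of the uploaded audio/video file
  targetLang: string; // e.g. "ko"
  voiceId: string; // identifier of the chosen AI voice
}

interface DubbingResult {
  transcript: string;
  translation: string;
  audio: Uint8Array; // dubbed audio, ready for playback or download
}

// Runs the three network steps in order and stops early if nothing was transcribed.
async function dubMedia(
  services: DubbingServices,
  request: DubbingRequest,
): Promise<DubbingResult> {
  const transcript = await services.transcribe(request.media);
  if (transcript.trim() === "") {
    throw new Error("No speech detected in the uploaded file");
  }
  const translation = await services.translate(transcript, request.targetLang);
  const audio = await services.synthesize(translation, request.voiceId);
  return { transcript, translation, audio };
}
```

Keeping the services behind an interface means you can swap in fakes for tests and run the whole pipeline without spending API credits.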
🛠️ Tech Stack
| Category | Technology |
| --- | --- |
| Framework | Next.js (TypeScript) |
| Deployment | Vercel |
| Database | Turso |
| Voice API | ElevenLabs |
| Translation | Google Translate API |
| Auth | Google OAuth (NextAuth) |
| Dev Agents | Claude Code, Google Gemini |
🤖 How Claude Code Actually Helped (Honestly)
The hardest parts weren't the features themselves — they were the architecture decisions and debugging.
Here's where Claude Code made a real difference:
NextAuth + Turso DB integration
The auth logic was surprisingly complex. Once I gave Claude a clear spec of the input/output I needed, it generated accurate, working code on the first try.
Next.js Suspense errors
I kept hitting client component boundary issues. Pasting the raw error log into Claude immediately surfaced the root cause and the fix.
Env variable security
Claude proactively flagged a .env.local mistake before it became a problem.
I also used Google Gemini (AntiGravity) for fine-grained debugging and documentation, then cross-validated between the two agents to boost code stability.
💡 Tips for Working With Coding Agents
After building this project, here's what actually works:
- Be specific about I/O
  Instead of "make a dubbing function," say "write a function that takes a File input and returns a translated audio Blob." Precision = better output.
- Break it into small units
  Asking for everything at once creates tangled logic. Feature-by-feature requests consistently produce cleaner results.
- Paste error logs raw
  Don't summarize. Just paste the full stack trace. The agent picks up on details you might miss.
- Always review before you commit
  Treat agent code like a junior dev's PR. Read it, understand it, then ship it.

> 💡 The process of collaborating with agents to solve problems matters more than the polish of the final code.

---

⚙️ Want to Run It Locally?
Prerequisites
You'll need API keys from:

| Service | Purpose |
| --- | --- |
| ElevenLabs | Speech transcription & synthesis |
| Turso | Database |
| Google Cloud | OAuth + Translate API |
| Vercel | Deployment |

Steps
git clone https://github.com/edwardyun12/ai-dubbing-web-app.git
cd ai-dubbing-web-app
Create a .env.local file:
ELEVENLABS_API_KEY=your_elevenlabs_api_key
TURSO_DATABASE_URL=your_turso_database_url
TURSO_AUTH_TOKEN=your_turso_auth_token
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
NEXTAUTH_SECRET=your_nextauth_secret
NEXTAUTH_URL=http://localhost:3000
Then:
npm install
npm run dev
Open http://localhost:3000 and you're good to go!
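If the app misbehaves on first run, a half-filled .env.local is a likely culprit. Here's a minimal sketch of a startup check, assuming you want to fail fast when a key is missing. The variable names come from the listing above; `findMissingEnvVars` and `assertEnv` are hypothetical helpers, not something that ships with the repo.

```typescript
// Variable names copied from the .env.local listing in this post.
const REQUIRED_ENV_VARS = [
  "ELEVENLABS_API_KEY",
  "TURSO_DATABASE_URL",
  "TURSO_AUTH_TOKEN",
  "GOOGLE_CLIENT_ID",
  "GOOGLE_CLIENT_SECRET",
  "NEXTAUTH_SECRET",
  "NEXTAUTH_URL",
] as const;

type EnvSource = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank,
// in the same order as REQUIRED_ENV_VARS.
function findMissingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Hypothetical helper: throws a readable error listing every missing variable.
function assertEnv(env: EnvSource): void {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

On the server you could call `assertEnv(process.env)` once at startup, so a missing key shows up as one clear error instead of a confusing failure deep inside an API call.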
If this was useful or interesting, a ⭐ on GitHub would mean a lot!
👉 https://github.com/edwardyun12/ai-dubbing-web-app
Happy to answer any questions in the comments 🙌