<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: 최효식</title>
    <description>The latest articles on DEV Community by 최효식 (@hyosikkk).</description>
    <link>https://dev.to/hyosikkk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3823602%2F03133835-91d2-4a44-b0e1-bee2f280470e.png</url>
      <title>DEV Community: 최효식</title>
      <link>https://dev.to/hyosikkk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hyosikkk"/>
    <language>en</language>
    <item>
      <title>Development of a dubbing service using Claude</title>
      <dc:creator>최효식</dc:creator>
      <pubDate>Sat, 14 Mar 2026 08:09:30 +0000</pubDate>
      <link>https://dev.to/hyosikkk/development-of-a-dubbing-service-using-claude-1d09</link>
      <guid>https://dev.to/hyosikkk/development-of-a-dubbing-service-using-claude-1d09</guid>
      <description>&lt;h1&gt;
  
  
  Building an AI Dubbing Service: My Development Journey
&lt;/h1&gt;

&lt;h2&gt;
  
  
  The Idea
&lt;/h2&gt;

&lt;p&gt;"What if users could upload audio/video files and have AI automatically dub them into different languages?"&lt;/p&gt;

&lt;p&gt;This simple concept evolved into a fully functional service deployed in just one day.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AI Dubbing Service&lt;/strong&gt; — A web app that converts audio/video files into multiple languages automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Upload audio/video file&lt;/li&gt;
&lt;li&gt;Transcribe speech to text (STT) using ElevenLabs Scribe v1&lt;/li&gt;
&lt;li&gt;Translate text using Google Cloud Translation API&lt;/li&gt;
&lt;li&gt;Generate dubbed audio (TTS) using ElevenLabs Multilingual v2&lt;/li&gt;
&lt;li&gt;For videos: Merge original video + dubbed audio using ffmpeg.wasm&lt;/li&gt;
&lt;li&gt;Download the result&lt;/li&gt;
&lt;/ol&gt;
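
&lt;p&gt;A minimal sketch of how the three AI stages (steps 2 to 4) might be chained, assuming each stage is an async function. The names &lt;code&gt;dubFile&lt;/code&gt; and &lt;code&gt;stubStages&lt;/code&gt; are hypothetical, and the stages are stand-ins: the real ones would call ElevenLabs and Google Cloud Translation.&lt;/p&gt;

```typescript
// Stand-in stages (hypothetical). The real versions would call ElevenLabs
// Scribe, Google Cloud Translation, and ElevenLabs TTS.
const stubStages = {
  transcribe: async (media: Uint8Array) => `transcript of ${media.length} bytes`,
  translate: async (text: string, targetLang: string) => `[${targetLang}] ${text}`,
  synthesize: async (text: string) => new Uint8Array(text.length), // fake audio bytes
};

type DubStages = typeof stubStages;

// Runs STT, then translation, then TTS, reporting each stage to the UI.
async function dubFile(
  media: Uint8Array,
  targetLang: string,
  stages: DubStages,
  onProgress: (stage: string) => void = () => {},
) {
  onProgress("stt");
  const transcript = await stages.transcribe(media);

  onProgress("translation");
  const translated = await stages.translate(transcript, targetLang);

  onProgress("tts");
  const audio = await stages.synthesize(translated);

  return { transcript, translated, audio };
}
```

&lt;p&gt;Keeping the stages behind one object makes it easy to swap the stubs for real API clients, and the progress callback can drive an indicator such as STT → Translation → TTS.&lt;/p&gt;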

&lt;p&gt;&lt;strong&gt;Supported Languages:&lt;/strong&gt; Korean, English, Japanese, Spanish&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File Support:&lt;/strong&gt; MP3, WAV, MP4, WebM (up to 100MB)&lt;/p&gt;




&lt;h2&gt;
  
  
  The Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Next.js 15 (App Router)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Styling&lt;/td&gt;
&lt;td&gt;Tailwind CSS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;NextAuth.js + Google OAuth&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;Turso (libSQL)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File Storage&lt;/td&gt;
&lt;td&gt;Vercel Blob&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video Processing&lt;/td&gt;
&lt;td&gt;ffmpeg.wasm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;Vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Development Process
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why One Day Was Possible
&lt;/h3&gt;

&lt;p&gt;Without an AI agent, I would expect this project to take &lt;strong&gt;2-3 weeks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Week 1: Design architecture, set up services&lt;/li&gt;
&lt;li&gt;Week 2: Implement each API integration separately&lt;/li&gt;
&lt;li&gt;Week 3: Build UI, test, debug&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What changed:&lt;/strong&gt; I used Claude Code (an AI coding agent) to compress this timeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hour-by-Hour Breakdown
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hour 1: Architecture Design&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Described all requirements&lt;/li&gt;
&lt;li&gt;Claude Code designed the entire Next.js structure&lt;/li&gt;
&lt;li&gt;File routes, API endpoints, component hierarchy — all at once&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hours 2-3: API Integration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ElevenLabs STT setup&lt;/li&gt;
&lt;li&gt;Google Translate API connection&lt;/li&gt;
&lt;li&gt;ElevenLabs TTS implementation&lt;/li&gt;
&lt;li&gt;All with consistent error handling patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hours 4-5: UI Implementation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dark glassmorphism design&lt;/li&gt;
&lt;li&gt;Real-time soundwave animation&lt;/li&gt;
&lt;li&gt;Progress indicators (STT → Translation → TTS)&lt;/li&gt;
&lt;li&gt;Drag-and-drop file upload&lt;/li&gt;
&lt;li&gt;Side-by-side video comparison player&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hours 6-7: Video Processing&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ffmpeg.wasm integration&lt;/li&gt;
&lt;li&gt;Video + audio merging in the browser&lt;/li&gt;
&lt;li&gt;Auto-sync adjustment using playbackRate&lt;/li&gt;
&lt;li&gt;MP4 download functionality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hours 7-8: Auth &amp;amp; Deployment&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google OAuth configuration&lt;/li&gt;
&lt;li&gt;Whitelist system with Turso&lt;/li&gt;
&lt;li&gt;Vercel deployment&lt;/li&gt;
&lt;li&gt;Custom domain setup&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Challenges &amp;amp; Solutions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Challenge 1: Sync Audio to Video
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The dubbed audio might be slightly longer or shorter than the original video.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Calculate a playbackRate ratio from the two durations
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Audio of length A played at rate r lasts A / r seconds, so r = A / V fits it into V seconds.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;playbackRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;dubbedAudioDuration&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;originalVideoDuration&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;audioElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;playbackRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;playbackRate&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



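&lt;p&gt;This adjusts the dubbed audio speed so that it finishes together with the video. It works best when the two durations are already close, because large rate changes make speech sound unnatural.&lt;/p&gt;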
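
&lt;p&gt;A slightly more defensive version of the same calculation, as a sketch. The helper name &lt;code&gt;syncPlaybackRate&lt;/code&gt; and the 0.5 to 2.0 bounds are assumptions made for illustration, not values taken from the project.&lt;/p&gt;

```typescript
// Hypothetical helper: the rate at which the dubbed audio must play so that it
// ends together with the video. Audio of length A played at rate r lasts A / r
// seconds, so fitting it into V seconds needs r = A / V.
function syncPlaybackRate(
  dubbedAudioDuration: number,
  originalVideoDuration: number,
  minRate = 0.5, // assumed bounds, chosen here only for illustration
  maxRate = 2.0,
): number {
  const durations = [dubbedAudioDuration, originalVideoDuration];

  // Fall back to normal speed when a duration is missing, zero, negative, or NaN
  // (media metadata may not be loaded yet).
  if (durations.some((d) => !Number.isFinite(d) || !(d > 0))) {
    return 1;
  }

  const rate = dubbedAudioDuration / originalVideoDuration;

  // Clamp so that extreme mismatches do not produce unintelligible speech.
  return Math.min(maxRate, Math.max(minRate, rate));
}
```

&lt;p&gt;It could then be applied as &lt;code&gt;audioElement.playbackRate = syncPlaybackRate(audioElement.duration, videoElement.duration)&lt;/code&gt; once both elements have loaded their metadata.&lt;/p&gt;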

&lt;h3&gt;
  
  
  Challenge 2: Browser-Side Video Processing
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Video merging usually requires a server with FFmpeg installed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Use ffmpeg.wasm to run FFmpeg directly in the browser&lt;/p&gt;

&lt;ul&gt;
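&lt;li&gt;No FFmpeg server needed&lt;/li&gt;
&lt;li&gt;No second upload of the video just for merging&lt;/li&gt;
&lt;li&gt;Local processing with no network round trip&lt;/li&gt;
&lt;li&gt;Privacy-friendly merge step (the merge itself runs on the user's device, although the file is still uploaded for transcription)&lt;/li&gt;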
&lt;/ul&gt;
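
&lt;p&gt;As an illustration, the merge can be expressed as an FFmpeg argument list that keeps the original video stream and swaps in the dubbed audio. The flags are standard FFmpeg options; the function name and file names are hypothetical, and this is not necessarily the exact command the project runs.&lt;/p&gt;

```typescript
// Build FFmpeg arguments that keep the original video stream and replace the
// audio track with the dubbed audio. File names are hypothetical placeholders.
function buildMergeArgs(
  videoFile: string,
  dubbedAudioFile: string,
  outputFile: string,
): string[] {
  return [
    "-i", videoFile,       // input 0: original video
    "-i", dubbedAudioFile, // input 1: dubbed audio
    "-map", "0:v:0",       // take the first video stream from input 0
    "-map", "1:a:0",       // take the first audio stream from input 1
    "-c:v", "copy",        // copy the video without re-encoding
    "-c:a", "aac",         // encode the audio as AAC, a common choice for MP4
    "-shortest",           // stop at the end of the shorter stream
    outputFile,
  ];
}

const mergeArgs = buildMergeArgs("input.mp4", "dub.mp3", "output.mp4");
```

&lt;p&gt;With the 0.12 API of &lt;code&gt;@ffmpeg/ffmpeg&lt;/code&gt;, an array like this would be passed to &lt;code&gt;ffmpeg.exec(mergeArgs)&lt;/code&gt; after the inputs are written with &lt;code&gt;ffmpeg.writeFile&lt;/code&gt;; which version the project uses is not stated here.&lt;/p&gt;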

&lt;h3&gt;
  
  
  Challenge 3: Consistent API Integration
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Multiple external APIs with different error patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Create unified error handling&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Centralized error types&lt;/li&gt;
&lt;li&gt;Retry logic with exponential backoff&lt;/li&gt;
&lt;li&gt;Consistent response formats&lt;/li&gt;
&lt;li&gt;Better debugging&lt;/li&gt;
&lt;/ul&gt;
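
&lt;p&gt;A minimal sketch of what retry logic with exponential backoff and a shared error type can look like. Everything here (&lt;code&gt;ApiError&lt;/code&gt;, &lt;code&gt;withRetry&lt;/code&gt;, the 250 ms base delay, the choice to retry only 429 and 5xx) is an assumption for illustration, and the helper is typed loosely to keep it short.&lt;/p&gt;

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 4000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical shared error type carrying the service name and HTTP status.
class ApiError extends Error {
  readonly service: string;
  readonly status: number;

  constructor(service: string, status: number, message: string) {
    super(message);
    this.name = "ApiError";
    this.service = service;
    this.status = status;
  }
}

// Retry only failures that are likely transient: rate limits and server errors.
function isRetryable(error: unknown): boolean {
  const status = (error as { status?: number } | null)?.status;
  return typeof status === "number" ? status === 429 || status >= 500 : false;
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `operation` until it succeeds, a non-retryable error occurs,
// or `maxAttempts` calls have been made.
async function withRetry(
  operation: () => unknown, // may return a value or a promise
  maxAttempts = 3,
  wait: (ms: number) => unknown = sleep,
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const isLastAttempt = attempt + 1 >= maxAttempts;
      if (isLastAttempt || !isRetryable(error)) throw error;
      await wait(backoffDelayMs(attempt));
    }
  }
}
```

&lt;p&gt;Passing the wait function as a parameter keeps the helper easy to test, and a 401 from any service fails immediately instead of being retried.&lt;/p&gt;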

&lt;h3&gt;
  
  
  Challenge 4: Claude Code Communication
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Generic prompts led to mediocre results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Be extremely specific&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Make a file upload" → "Create a /api/upload route that accepts FormData with MP4 files, validates size &amp;lt;100MB, uploads to Vercel Blob, returns fileUrl"&lt;/li&gt;
&lt;li&gt;"Add error handling" → "Catch these specific errors: 'Invalid API key', 'File too large', 'Unsupported format' — return custom messages for each"&lt;/li&gt;
&lt;/ul&gt;
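
&lt;p&gt;To show why the specific prompt is easier to implement, here is a sketch of the validation it describes. The MIME type list and the 413/415 status codes are assumptions; only the 100MB limit and the four file formats come from this post.&lt;/p&gt;

```typescript
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024; // 100MB limit from the post

// MIME types assumed for the supported formats (MP3, WAV, MP4, WebM).
const ALLOWED_TYPES = new Set([
  "audio/mpeg",
  "audio/wav",
  "video/mp4",
  "video/webm",
  "audio/webm",
]);

type UploadCheck =
  | { ok: true }
  | { ok: false; status: number; message: string };

// Checks size and type before anything is sent to storage or an external API.
function validateUpload(file: { size: number; type: string }): UploadCheck {
  if (file.size > MAX_UPLOAD_BYTES) {
    return { ok: false, status: 413, message: "File too large (limit 100MB)" };
  }
  if (!ALLOWED_TYPES.has(file.type)) {
    return { ok: false, status: 415, message: `Unsupported format: ${file.type}` };
  }
  return { ok: true };
}
```

&lt;p&gt;Each clause of the prompt maps to one branch, which is what makes the request unambiguous for the agent.&lt;/p&gt;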




&lt;h2&gt;
  
  
  What Claude Code Did Right
&lt;/h2&gt;

&lt;p&gt;✅ &lt;strong&gt;Architectural coherence&lt;/strong&gt; — All components fit together perfectly without refactoring&lt;/p&gt;

&lt;p&gt;✅ &lt;strong&gt;Convention-following&lt;/strong&gt; — Respected Next.js 15 App Router best practices automatically&lt;/p&gt;

&lt;p&gt;✅ &lt;strong&gt;Error handling&lt;/strong&gt; — Implemented defensive code without explicit instruction&lt;/p&gt;

&lt;p&gt;✅ &lt;strong&gt;Type safety&lt;/strong&gt; — Generated proper TypeScript interfaces from the start&lt;/p&gt;

&lt;p&gt;✅ &lt;strong&gt;Debugging speed&lt;/strong&gt; — When I copy-pasted error messages, Claude Code fixed them instantly&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Lesson
&lt;/h2&gt;

&lt;p&gt;The bottleneck wasn't coding. It was &lt;strong&gt;communication&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Didn't Work
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Make it faster"
"Better UI design"
"Handle errors properly"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What Worked
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Create a /api/dub route that:
- Accepts FormData with audio file
- Sends to ElevenLabs Scribe v1 with API key in headers
- Returns JSON: { text, duration, language }
- On error, catch these specific codes: 401, 413, 422
- Return status-appropriate HTTP responses&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
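
&lt;p&gt;The error-handling lines of a prompt like this might translate into a small lookup. This is a sketch only: the messages and the 502 fallback are assumptions, and the three status codes are the ones named in the prompt.&lt;/p&gt;

```typescript
// Maps a status code from the speech-to-text call to the response body and
// status that the route would send back. Messages are hypothetical.
function describeUpstreamError(status: number): { status: number; error: string } {
  switch (status) {
    case 401:
      return { status: 401, error: "Invalid API key" };
    case 413:
      return { status: 413, error: "File too large" };
    case 422:
      return { status: 422, error: "Unsupported format" };
    default:
      // Anything unexpected is reported as a failure of the upstream service.
      return { status: 502, error: "Speech-to-text service failed" };
  }
}
```

&lt;p&gt;In a Next.js route handler the result could be returned with &lt;code&gt;Response.json({ error }, { status })&lt;/code&gt;.&lt;/p&gt;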



&lt;p&gt;Specificity = Speed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project Metrics
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Development Time&lt;/td&gt;
&lt;td&gt;8 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lines of Code&lt;/td&gt;
&lt;td&gt;~2000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Integrations&lt;/td&gt;
&lt;td&gt;3 (ElevenLabs, Google, NextAuth)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported Languages&lt;/td&gt;
&lt;td&gt;4 (Korean, English, Japanese, Spanish)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max File Size&lt;/td&gt;
&lt;td&gt;100MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment Platform&lt;/td&gt;
&lt;td&gt;Vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to First Deployment&lt;/td&gt;
&lt;td&gt;8 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What's Next: Lip-Sync
&lt;/h2&gt;

&lt;p&gt;The current version syncs audio length to video. But true cinematic dubbing requires &lt;strong&gt;lip-sync&lt;/strong&gt; — matching mouth movements to audio.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Challenge:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extract phonemes from audio&lt;/li&gt;
&lt;li&gt;Detect facial landmarks in video&lt;/li&gt;
&lt;li&gt;Regenerate mouth shapes to match phonemes&lt;/li&gt;
&lt;li&gt;Merge back into original video&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Current Options:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Wav2Lip&lt;/strong&gt; (Open-source, free, but slow)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HeyGen API&lt;/strong&gt; (Fast, proprietary, paid)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sync Labs&lt;/strong&gt; (REST API, pricing TBD)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is the next mountain to climb.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. AI agents excel at iteration, not inspiration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code couldn't dream up the idea&lt;/li&gt;
&lt;li&gt;But it implemented the idea far faster than I could have by hand&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Specificity beats vagueness&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vague prompts = vague results&lt;/li&gt;
&lt;li&gt;Detailed specs = precise implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Error-driven development works&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When something breaks, copy-paste the error&lt;/li&gt;
&lt;li&gt;Claude Code diagnoses and fixes instantly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. One feature at a time&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requesting five features at once tended to produce conflicting code&lt;/li&gt;
&lt;li&gt;Sequential requests = cleaner code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5. Always review security code&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI-generated auth/security code should always be reviewed&lt;/li&gt;
&lt;li&gt;Never deploy untested auth systems&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://ai-voice-bot-rose.vercel.app" rel="noopener noreferrer"&gt;https://ai-voice-bot-rose.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/hyosikkk/ai-voice-bot" rel="noopener noreferrer"&gt;https://github.com/hyosikkk/ai-voice-bot&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 15&lt;/li&gt;
&lt;li&gt;ElevenLabs APIs&lt;/li&gt;
&lt;li&gt;Google Cloud Translation&lt;/li&gt;
&lt;li&gt;Turso Database&lt;/li&gt;
&lt;li&gt;Vercel Deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The service works. The code is open-source. The journey was unforgettable.&lt;/p&gt;

&lt;p&gt;From zero to deployed in one day. With AI as my co-developer.&lt;/p&gt;

&lt;p&gt;That's the future of software development.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>showdev</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
