DEV Community

yongYong


Orchestrating AI Agents to Build a Multi-Lingual Video Dubbing Workspace

I built Vonnect AI, an AI Dubbing Workspace that generates multi-lingual dubs while cloning the original speaker's voice. While the AI output is satisfying, the real story is the build process: a Multi-Agent AI workflow orchestrated through Google Antigravity.

🛠 The AI Stack

Voice Cloning: Using ElevenLabs, the app identifies multiple speakers and clones each voice to maintain authenticity across different languages.
Smart Re-dub: Users can edit translations and re-synthesize specific segments instantly, powered by a combination of Next.js and Drizzle ORM.
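As a rough illustration of the Smart Re-dub idea, the sketch below tracks which text each audio clip was generated from, so that only edited segments are selected for re-synthesis. The `DubSegment` shape and both function names are hypothetical and not taken from the Vonnect AI codebase. The actual ElevenLabs call and the Drizzle persistence are left out.

```typescript
// Hypothetical shape of one dubbed segment; field names are illustrative,
// not taken from the Vonnect AI schema.
interface DubSegment {
  id: number;
  speakerId: string;
  translatedText: string;
  // Text the current audio clip was generated from (null = no audio yet).
  synthesizedText: string | null;
}

// Return a new segment list with one translation edited; audio state is untouched.
function editTranslation(
  segments: DubSegment[],
  id: number,
  newText: string
): DubSegment[] {
  return segments.map((s) =>
    s.id === id ? { ...s, translatedText: newText } : s
  );
}

// Select only the segments whose audio is missing or out of date.
function segmentsNeedingResynthesis(segments: DubSegment[]): DubSegment[] {
  return segments.filter((s) => s.synthesizedText !== s.translatedText);
}

const segments: DubSegment[] = [
  { id: 1, speakerId: "speaker_a", translatedText: "Hola a todos", synthesizedText: "Hola a todos" },
  { id: 2, speakerId: "speaker_b", translatedText: "Bienvenidos", synthesizedText: "Bienvenidos" },
];

const edited = editTranslation(segments, 2, "Bienvenidos al canal");
console.log(segmentsNeedingResynthesis(edited).map((s) => s.id)); // logs [ 2 ]
```

Comparing the stored text against the edited text keeps re-synthesis scoped to what actually changed, which is what would make a per-segment re-dub feel instant.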

🤖 Multi-Model Orchestration

Instead of sticking to one AI, I strategically distributed tasks among different SOTA models:

Claude 4.6 Sonnet: Handled the complex backend and API orchestration.
Gemini 3.1 Pro: Acted as a high-level troubleshooter, analyzing deep logic when things broke.
Gemini 3 Flash: Rapidly iterated on UI and documentation.
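The division of labor above can be written down as a simple lookup table. This is only a sketch: the `TaskKind` categories and the `pickModel` helper are hypothetical, and the model labels are plain strings copied from the list, not verified API identifiers. In practice the choice may simply be made by hand in the Antigravity UI.

```typescript
// Hypothetical task categories, derived from the roles described in the list.
type TaskKind = "backend" | "troubleshooting" | "ui" | "docs";

// Model labels as written in the post; display names only, not API model IDs.
const MODEL_FOR_TASK: Record<TaskKind, string> = {
  backend: "Claude 4.6 Sonnet",
  troubleshooting: "Gemini 3.1 Pro",
  ui: "Gemini 3 Flash",
  docs: "Gemini 3 Flash",
};

// Look up which model a given kind of task would be routed to.
function pickModel(kind: TaskKind): string {
  return MODEL_FOR_TASK[kind];
}

console.log(pickModel("troubleshooting")); // logs Gemini 3.1 Pro
```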

💡 The Selective Approval Pattern

The power of my workflow came from the Selective Approval mechanism in Google Antigravity. I never accepted code changes blindly: I reviewed each proposal before it was applied, so that I, as the human developer, remained the source of truth for the codebase. This "Human-in-the-loop" strategy let me ship a feature-rich product quickly without sacrificing code quality.
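Conceptually, selective approval is a gate between what an agent proposes and what reaches the codebase. The sketch below is an illustration of that idea only, not Antigravity's actual mechanism or API. The `Proposal` and `Decision` types and the `applyApproved` function are hypothetical.

```typescript
// Hypothetical description of one agent-proposed change.
interface Proposal {
  id: string;
  file: string;
  summary: string;
}

type Decision = "approve" | "reject";

// Split proposals into those a human explicitly approved and everything else.
// A proposal with no recorded decision is skipped, never applied by default.
function applyApproved(
  proposals: Proposal[],
  decisions: Record<string, Decision>
): { applied: Proposal[]; skipped: Proposal[] } {
  const applied: Proposal[] = [];
  const skipped: Proposal[] = [];
  for (const p of proposals) {
    if (decisions[p.id] === "approve") {
      applied.push(p);
    } else {
      skipped.push(p);
    }
  }
  return { applied, skipped };
}

const proposals: Proposal[] = [
  { id: "p1", file: "api/dub.ts", summary: "Add retry to synthesis call" },
  { id: "p2", file: "db/schema.ts", summary: "Drop unused column" },
  { id: "p3", file: "ui/Editor.tsx", summary: "Rename button label" },
];

const result = applyApproved(proposals, { p1: "approve", p2: "reject" });
console.log(result.applied.map((p) => p.id)); // logs [ 'p1' ]
console.log(result.skipped.map((p) => p.id)); // logs [ 'p2', 'p3' ]
```

Treating "no decision" the same as "reject" is the key design choice here: nothing lands in the codebase unless a human has explicitly said yes.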

GitHub Repo: https://github.com/Jin1370/vonnect_ai
