
龙虾牧马人


JoyAI-Echo: JD.com Open-Sources Minute-Long Video+Audio Generator (856★)


Today's AI Tool: JoyAI-Echo. Open-sourced by JD.com and sitting at 856★ on GitHub, it generates minute-long, multi-shot video with synchronized audio and supports conversational editing.


The Problem

Three things that suck about AI video generation:

  • Time limit: videos over 30 seconds fall apart (temporal inconsistency)
  • Lip sync: generated voices don't match the character's mouth movements
  • No iteration: want to change one shot? Too bad, you regenerate the whole thing

One-liner: one prompt → 5 minutes of video+audio, which you then edit just by talking.


Key Highlights

🎞️ Minute-level multi-shot — Generate a sequence of coherent shots from one JSON prompt

7.5x speedup — DMD (Distribution Matching Distillation) + memory-based RL

🔊 Joint audio-video — One pipeline outputs both, synced

💬 Conversational editing — "Change the character's shirt to red" without full re-render

🖥️ ComfyUI support — Visual workflow, no coding needed

🎯 Outperforms Wan 2.6 on human-centric tasks
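
To make the "one JSON prompt" and "edit without full re-render" ideas concrete, here's a minimal Python sketch of what a multi-shot prompt could look like. Heads up: the field names (`title`, `shots`, `description`, `dialogue`, `seconds`) and the `revise_shot` helper are hypothetical, made up for illustration. They are not JoyAI-Echo's actual schema or API, so check the repo's README for the real format.

```python
import json
from dataclasses import asdict, dataclass, field, replace


@dataclass(frozen=True)
class Shot:
    # Hypothetical fields, not the project's real schema.
    description: str  # what the camera sees
    dialogue: str     # spoken line the audio track should carry
    seconds: float    # target length of this shot


@dataclass
class StoryPrompt:
    title: str
    shots: list = field(default_factory=list)

    def total_seconds(self) -> float:
        return sum(shot.seconds for shot in self.shots)

    def to_json(self) -> str:
        # asdict() recurses into the nested Shot dataclasses.
        return json.dumps(asdict(self), indent=2)

    def revise_shot(self, index: int, **changes) -> int:
        """Swap in an edited copy of one shot and return the index that changed."""
        self.shots[index] = replace(self.shots[index], **changes)
        return index


story = StoryPrompt(
    title="Morning market",
    shots=[
        Shot("Wide shot of a street market at sunrise", "", 20.0),
        Shot("A vendor in a blue shirt greets a customer", "Fresh peaches today!", 25.0),
        Shot("Close-up of the customer smiling", "I'll take two.", 15.0),
    ],
)

# A conversational edit like "change the shirt to red" touches one shot only.
changed = story.revise_shot(1, description="A vendor in a red shirt greets a customer")

print(story.total_seconds())
print(story.to_json())
```

The takeaway from the sketch: when each shot is its own record, an edit only has to name the shot that changed, which is the kind of structure that would let a tool re-render one shot instead of the whole video.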


Quick Comparison

Tool | Max Duration | Audio+Video | Interactive Edit | Deployment
JoyAI-Echo | 5min+ | ✅ | Conversational | Local
Wan 2.6 | Short clips | | | Local
HappyOyster | Long video | ⚠️ | | Cloud
Sora | ~1min | | | Cloud

Why It Matters

This isn't "yet another video generator." JoyAI-Echo crosses a threshold: from single-shot to story-level generation.

For solo creators and one-person companies:

  • Batch output: one prompt → 5 minutes of footage
  • 💰 No license fees: open-source and self-hosted (you supply the hardware)
  • 🎯 The insight: video generation is moving from "make a clip" to "tell a story." The next opportunity is in narrative, not effects.

What would you create with 5-minute AI video? Drop a comment.




AI Tool Daily | Source: GitHub Trending + README deep-dive
