DEV Community

Jason Corso

πŸŽ₯πŸ–πŸŽ₯πŸ– New Video GenAI with Better Rendering of Hands --> Instructional Video Generation πŸŽ₯πŸ–πŸŽ₯πŸ–

New Paper Alert

Instructional Video Generation – we are releasing a new method for video generation that explicitly focuses on fine-grained, subtle hand motions. Given a single image frame as context and a text prompt for an action, our new method generates high-quality videos with careful attention to hand rendering. We use the instructional video domain as the driver here, given its rich set of videos and the challenges instructional videos pose for both humans and robots.

Try it out yourself. Links to the paper, project page, and code are below; a demo page on Hugging Face is in the works so you can try it on your own more easily.

Our new method generates instructional videos tailored to your room, your tools, and your perspective. Whether it's threading a needle or rolling dough, the video shows exactly how you would do it, preserving your environment while guiding you frame by frame. The key breakthrough is in mastering accurate, subtle fingertip actions: the exact fine details that matter most in completing an action. By designing automatic Region of Motion (RoM) generation and a hand structure loss for fine-grained fingertip movements, our diffusion-based model outperforms six state-of-the-art video generation methods, bringing unparalleled clarity to Video GenAI.
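The paper's exact loss is not reproduced here, but to give a flavor of the idea, here is a minimal sketch of a fingertip-weighted hand-keypoint loss. The 21-joint layout, the fingertip indices, and the weighting scheme are all assumptions for illustration, not the authors' actual formulation:

```python
import numpy as np

def hand_structure_loss(pred_kpts, gt_kpts,
                        fingertip_ids=(4, 8, 12, 16, 20),
                        fingertip_weight=2.0):
    """Hypothetical sketch: weighted L2 error over 21 hand keypoints,
    upweighting fingertips so fine finger motion dominates the penalty.
    Keypoint layout and weights are illustrative assumptions."""
    pred_kpts = np.asarray(pred_kpts, dtype=float)  # shape (21, 2): (x, y) per joint
    gt_kpts = np.asarray(gt_kpts, dtype=float)
    weights = np.ones(len(pred_kpts))
    weights[list(fingertip_ids)] = fingertip_weight  # emphasize fingertip accuracy
    per_joint = np.linalg.norm(pred_kpts - gt_kpts, axis=-1)  # Euclidean error per joint
    return float(np.mean(weights * per_joint))
```

In a training loop, a term like this would be added to the standard diffusion objective so that errors on the fingertips are penalized more heavily than errors elsewhere on the hand.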

👉 Project Page: https://excitedbutter.github.io/project_page/
👉 Paper Link: https://arxiv.org/abs/2412.04189
👉 GitHub Repo: https://github.com/ExcitedButter/Instructional-Video-Generation-IVG

This paper is co-authored with my students Yayuan Li and Zhi Cao at the University of Michigan and Voxel51.

Top comments (1)

Yayuan Li

Thank you, Dr. Corso, and thank you to the community for your attention. We welcome any comments and feedback!