Jason Corso

🎥🖐🎥🖐 New Video GenAI with Better Rendering of Hands --> Instructional Video Generation 🎥🖐🎥🖐

New Paper Alert

Instructional Video Generation: we are releasing a new video generation method that explicitly focuses on fine-grained, subtle hand motions. Given a single image frame as context and a text prompt describing an action, the method generates high-quality videos with careful attention to hand rendering. We use the instructional video domain as the driver here, given its rich set of videos and the challenges it poses for both humans and robots.

Try it out yourself. Links to the paper, project page, and code are below, and a demo page on HuggingFace is in the works to make that easier.

Our new method generates instructional videos tailored to your room, your tools, and your perspective. Whether it's threading a needle or rolling dough, the video shows exactly how you would do it, preserving your environment while guiding you frame by frame. The key breakthrough is in mastering accurate, subtle fingertip actions: the exact fine details that matter most in action completion. By designing automatic Region of Motion (RoM) generation and a hand structure loss for fine-grained fingertip movements, our diffusion-based model outperforms six state-of-the-art video generation methods, bringing unparalleled clarity to Video GenAI.
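To give a rough feel for what emphasizing a region in a training loss can look like, here is a minimal sketch. It is not the Hand Structure Loss or the RoM generation from the paper; it is a generic region-weighted mean squared error, and the function name `region_weighted_mse`, the `region_weight` parameter, and the binary mask input are all assumptions made for illustration.

```python
import numpy as np


def region_weighted_mse(pred, target, mask, region_weight=5.0):
    """Mean squared error that counts errors inside a masked region more heavily.

    Hypothetical illustration only; not the loss used in the paper.

    pred, target: arrays of shape (frames, height, width)
    mask: array of the same shape, 1 inside the region of interest, 0 elsewhere
    region_weight: assumed multiplier applied to errors inside the mask
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = np.asarray(mask, dtype=float)

    # Weight is 1 outside the region and `region_weight` inside it.
    weights = 1.0 + (region_weight - 1.0) * mask

    # Normalize by the total weight so the result stays on the scale of an MSE.
    return float(np.sum(weights * (pred - target) ** 2) / np.sum(weights))


# Toy example: one 2x2 frame with a single wrong pixel inside the mask.
pred = np.zeros((1, 2, 2))
target = np.zeros((1, 2, 2))
target[0, 0, 0] = 1.0
mask = np.zeros((1, 2, 2))
mask[0, 0, 0] = 1.0

loss_in_region = region_weighted_mse(pred, target, mask)
```

With this kind of weighting, the same pixel error costs more when it falls inside the masked region than outside it, which is the general intuition behind steering a model toward small but important areas such as fingertips.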

👉 Project Page: https://excitedbutter.github.io/project_page/
👉 Paper Link: https://arxiv.org/abs/2412.04189
👉 GitHub Repo: https://github.com/ExcitedButter/Instructional-Video-Generation-IVG

This paper is coauthored with my students Yayuan Li and Zhi Cao at the University of Michigan and Voxel51.

Top comments (2)

Yayuan Li

Importantly, as this video shows, our proposed Hand Structure Loss is critical for generating accurate, realistic, subtle fingertip actions. See video demonstrations here: excitedbutter.github.io/project_pa...

Yayuan Li

Thank you, Dr. Corso, and thank you to the community for your attention. We welcome any comments and feedback!
