[memo] MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Video and motion modalities
Diverges from recent LLMs built for video-only or motion-only understanding
SMPL sequences
Capture nuanced body part dynamics and semantics
Video-motion training strategy

They propose MoVid-Bench, with careful manual annotations.

Introduction
Understanding human behavior

Contribution
To relieve data scarcity, they collect MoVid, a dataset with diverse caption and instruction annotations built from motion/video datasets.
Proposed a unified video-motion training strategy.
For better evaluation, they carefully construct MoVid-Bench.

Related works
Human motion understanding aims to extract the semantics of human motion

Conclusion
MotionLLM: a unified framework for human behavior understanding
Limitations: limited capacity of the video encoder; potential for negative use.

Application: fitness coach.
