
Takara Taniguchi


Video Instruction Tuning With Synthetic Data

The first author is Yuanhan Zhang; the work comes from a Bytedance group.

Intro
This paper explores synthetic data as an alternative to traditionally curated video datasets.
The authors created LLaVA-Video-178K, a dataset tailored for video instruction following.

Related work (datasets)
ActivityNet
Charades
Kinetics-700
Something-Something v2
Ego4D
VIDAL
HD-VILA

Method
They used a pipeline that generates detailed video descriptions.

Experiment

128 H100 GPUs? What is that even supposed to mean... 😭

It's a pity that the paper doesn't say much about the training method.
