Yuxuan Wang is the first author, affiliated with the Beijing Institute for General Artificial Intelligence (BIGAI).
The problem is as follows
Most current large video-language models (LVLMs) exhibit significant hallucination issues.
Extrinsic hallucinations in particular are difficult to detect.
Existing models are markedly better at confirming facts than at identifying hallucinations.
Related works
Existing methods do not target dynamic content such as actions, events, and stories.
VideoHallucer specifies the hallucination issues of LVLMs using the following taxonomy (a toy data-structure sketch follows the list).
Intrinsic: object-relation, temporal, semantic detail
Extrinsic: factual, non-factual
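To keep the categories straight, here is a minimal Python sketch of the taxonomy as plain data. Only the category names come from the list above; the dictionary layout and the helper function are purely illustrative.

```python
# Toy representation of the VideoHallucer hallucination taxonomy.
# Category names follow the paper; the structure is only for illustration.
HALLUCINATION_TAXONOMY = {
    "intrinsic": ["object-relation", "temporal", "semantic detail"],
    "extrinsic": ["factual", "non-factual"],
}

def top_level_category(subtype: str) -> str:
    """Return 'intrinsic' or 'extrinsic' for a given hallucination subtype."""
    for category, subtypes in HALLUCINATION_TAXONOMY.items():
        if subtype in subtypes:
            return category
    raise ValueError(f"unknown subtype: {subtype}")

print(top_level_category("temporal"))  # -> "intrinsic"
```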
Experiment and benchmark
Self-PEP Framework
This appears to be a form of chain-of-thought (CoT) prompting.
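As a rough illustration of what such a CoT-style self-check could look like, here is a minimal sketch. `query_model` is a hypothetical stand-in for an LVLM API, and the two-stage "describe the evidence, then answer" structure is my assumption rather than the paper's exact self-PEP procedure.

```python
# Hypothetical LVLM call; replace with a real video-language model API.
def query_model(video_path: str, prompt: str) -> str:
    raise NotImplementedError("replace with a real LVLM call")

def answer_with_self_check(video_path: str, question: str) -> str:
    # Stage 1: ask the model to describe the relevant visual evidence first.
    evidence = query_model(
        video_path,
        "Before answering, describe what the video actually shows "
        f"that is relevant to: {question}",
    )
    # Stage 2: answer conditioned on that self-generated evidence.
    answer = query_model(
        video_path,
        f"Evidence: {evidence}\n"
        f"Using only this evidence, answer yes or no: {question}",
    )
    return answer.strip().lower()
```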
Conclusion
The questions are adversarially generated from the videos.
Evaluation covers both VQA-based and caption-based formats.
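For the VQA-based side, the benchmark pairs basic questions with hallucinated ones. Below is a hedged sketch of how such paired yes/no scoring might be computed; the field names (`basic_id`, `hallucinated_id`) are made up, and the assumption that every basic answer is "yes" and every hallucinated answer is "no" is a simplification, not the paper's exact protocol.

```python
# Paired scoring sketch: a model scores a point only if it answers both the
# basic question (expected "yes") and the hallucinated question (expected
# "no") correctly.
from typing import Iterable

def paired_accuracy(items: Iterable[dict], answers: dict) -> float:
    correct = total = 0
    for item in items:
        total += 1
        basic_ok = answers[item["basic_id"]] == "yes"
        halluc_ok = answers[item["hallucinated_id"]] == "no"
        if basic_ok and halluc_ok:
            correct += 1
    return correct / total if total else 0.0
```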