- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
  Paper • 2401.15947 • Published • 49
- Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
  Paper • 2311.10122 • Published • 26
- Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models
  Paper • 2311.16103 • Published • 1
Collections
Collections including paper arxiv:2406.04325
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71
- SF-V: Single Forward Video Generation Model
  Paper • 2406.04324 • Published • 23
- VideoTetris: Towards Compositional Text-to-Video Generation
  Paper • 2406.04277 • Published • 22
- Vript: A Video Is Worth Thousands of Words
  Paper • 2406.06040 • Published • 22

- Vript: A Video Is Worth Thousands of Words
  Paper • 2406.06040 • Published • 22
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
  Paper • 2406.01574 • Published • 42
- Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
  Paper • 2405.21075 • Published • 18

- Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
  Paper • 2406.06469 • Published • 23
- Mixture-of-Agents Enhances Large Language Model Capabilities
  Paper • 2406.04692 • Published • 55
- CRAG: Comprehensive RAG Benchmark
  Paper • 2406.04744 • Published • 41
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71

- LanguageBind/MoE-LLaVA-Phi2-2.7B-4e
  Text Generation • Updated • 58.7k • 38
- LanguageBind/LanguageBind_Video_FT
  Zero-Shot Image Classification • Updated • 593k • 4
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 466k • 2.67k
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71