VLM Benchmarks - a marcusinthesky Collection

VLM Benchmarks

updated 23 days ago

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Paper • 2410.10139 • Published 24 days ago • 50

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published 23 days ago • 36

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Paper • 2410.10783 • Published 23 days ago • 25

TVBench: Redesigning Video-Language Evaluation

Paper • 2410.07752 • Published 28 days ago • 5