OpenCompass

https://opencompass.org.cn/

Organization Card

OpenCompass Website | OpenCompass Toolkit

👋 Join us on Discord and WeChat

Follow us on GitHub

OpenCompass is a platform focused on the evaluation of AGI, including large language models and multi-modality models. We aim to:

develop high-quality libraries to reduce the difficulty of evaluation
provide convincing leaderboards that improve the understanding of large models
create powerful toolchains targeting a variety of abilities and tasks
build solid benchmarks to support large model research

Collections 1

Spaces 8

Open VLM Leaderboard

VLMEvalKit Evaluation Results Collection

CompassJudger Subjective Evaluation Leaderboard

JudgerBench Leaderboard

Open VLM Video Leaderboard

VLMEvalKit evaluation results on video understanding benchmarks

MMBench Leaderboard

OpenCompass LLM Leaderboard

Models 8

opencompass/CompassJudger-1-14B-Instruct

Text Generation • Updated 10 days ago • 88 • 1

opencompass/CompassJudger-1-32B-Instruct

Text Generation • Updated 10 days ago • 394 • 8

opencompass/anah-v2

Text Generation • Updated 11 days ago • 21 • 2

opencompass/CompassJudger-1-1.5B-Instruct

Updated 19 days ago • 93 • 1

opencompass/CompassJudger-1-7B-Instruct

Updated 19 days ago • 186 • 2

opencompass/anah-7b

Text Generation • Updated Jul 3 • 26

opencompass/anah-20b

Text Generation • Updated Jul 3 • 13

opencompass/mixtral-8x7b-32k

Updated Dec 10, 2023 • 1

Datasets 7

opencompass/mmmlu_lite

Viewer • Updated 9 days ago • 20k • 35 • 2

opencompass/MMBench-Video

Preview • Updated Oct 9 • 289 • 6

opencompass/NeedleBench

Viewer • Updated Jul 26 • 524 • 4.07k • 2

opencompass/anah

Viewer • Updated Jul 3 • 783 • 95 • 2

opencompass/flames

Viewer • Updated Apr 22 • 537 • 54

opencompass/CriticBench

Updated Feb 23 • 196 • 4

opencompass/MMBench

Updated Sep 13, 2023 • 38 • 1