Committed by Lillianwei
Commit e53fd38 • 1 Parent(s): beb48c6
Update README.md
README.md CHANGED
@@ -40,7 +40,7 @@ We introduce **MMIE**, a robust, knowledge-intensive benchmark to evaluate inter
 2. **Challenging the Best**: Even top models like **GPT-4o + SDXL** peak at 65.47%, highlighting room for growth in LVLMs.
 3. **Designed for Interleaved Tasks**: The benchmark supports evaluation across both text and image comprehension with both **multiple-choice and open-ended** formats.
 
----
+<!-- ---
 
 ### Dataset Details
 <div align="center">
@@ -48,3 +48,4 @@ We introduce **MMIE**, a robust, knowledge-intensive benchmark to evaluate inter
 </div>
 
 MMIE is curated to evaluate models' comprehensive abilities in interleaved multimodal comprehension and generation. The dataset features diverse examples, categorized and distributed across different fields as illustrated above. This ensures balanced coverage across various domains of interleaved input/output tasks, supporting accurate and detailed model evaluations.
+-->