Junming Yang committed · Commit 20d077e
1 Parent(s): e2f94e9
add VQA meta_data
Browse files: meta_data.py +21 -0
meta_data.py
CHANGED
@@ -157,4 +157,25 @@ LEADERBOARD_MD['RealWorldQA'] = """
 ## RealWorldQA Evaluation Results
 
 - RealWorldQA is a benchmark designed to evaluate the real-world spatial understanding capabilities of multimodal AI models, contributed by XAI. It assesses how well these models comprehend physical environments. The benchmark consists of 700+ images, each accompanied by a question and a verifiable answer. These images are drawn from real-world scenarios, including those captured from vehicles. The goal is to advance AI models' understanding of our physical world.
 """
+
+LEADERBOARD_MD['TextVQA_VAL'] = """
+## TextVQA Evaluation Results
+
+- TextVQA is a dataset to benchmark visual reasoning based on text in images. TextVQA requires models to read and reason about text in images to answer questions about them. Specifically, models need to incorporate a new modality of text present in the images and reason over it to answer TextVQA questions.
+- Note that some models may not be able to generate standardized responses based on the prompt. We currently do not have reports for these models.
+"""
+
+LEADERBOARD_MD['ChartQA_TEST'] = """
+## ChartQA Evaluation Results
+
+- ChartQA is a benchmark for question answering about charts with visual and logical reasoning.
+- Note that some models may not be able to generate standardized responses based on the prompt. We currently do not have reports for these models.
+"""
+
+LEADERBOARD_MD['OCRVQA_TESTCORE'] = """
+## OCRVQA Evaluation Results
+
+- OCRVQA is a benchmark for visual question answering by reading text in images. It presents a large-scale dataset, OCR-VQA-200K, comprising over 200,000 images of book covers. The study combines techniques from the Optical Character Recognition (OCR) and Visual Question Answering (VQA) domains to address the challenges associated with this new task and dataset.
+- Note that some models may not be able to generate standardized responses based on the prompt. We currently do not have reports for these models.
+"""
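For context, the entries added here are plain Markdown strings keyed by dataset name in the `LEADERBOARD_MD` dict defined in meta_data.py. Below is a minimal sketch of how a leaderboard app might look these descriptions up when rendering a dataset tab; the `get_dataset_description` helper and the fallback text are assumptions for illustration, not part of this commit.

```python
# Sketch only (assumption): consuming the LEADERBOARD_MD entries added in this
# commit. Everything except LEADERBOARD_MD and the dataset keys is hypothetical.
from meta_data import LEADERBOARD_MD


def get_dataset_description(dataset: str) -> str:
    # Fall back to a generic note when a dataset has no curated description yet.
    fallback = f"## {dataset} Evaluation Results\n\n- No description available yet."
    return LEADERBOARD_MD.get(dataset, fallback)


if __name__ == "__main__":
    for key in ["TextVQA_VAL", "ChartQA_TEST", "OCRVQA_TESTCORE"]:
        print(get_dataset_description(key))
```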