Commit e987f7e (1 parent: 0ad4395), committed by pankajmathur
Update README.md

README.md CHANGED
@@ -22,14 +22,17 @@ We evaluated model_009 on a wide range of tasks using [Language Model Evaluation
 
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
-
-
-|**Task**|**
-|*
-|*
-|*
-|*
-|
+|||
+|:------:|:-------:|
+|**Task**|**Value**|
+|*ARC*|0.7159|
+|*HellaSwag*|0.8771|
+|*MMLU*|0.6943|
+|*TruthfulQA*|0.6072|
+|*Winogrande*|0.8232|
+|*GSM8k*|0.3942|
+|*DROP*|0.4401|
+|**Total Average**|**0.6503**|
 
 
 ## Example Usage
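As a quick sanity check on the table added in this commit, the **Total Average** row appears to be the unweighted mean of the seven task scores. The sketch below assumes that interpretation (the README does not state how the average is computed); the variable and function names are illustrative only.

```python
# Scores copied from the table added in this commit.
scores = {
    "ARC": 0.7159,
    "HellaSwag": 0.8771,
    "MMLU": 0.6943,
    "TruthfulQA": 0.6072,
    "Winogrande": 0.8232,
    "GSM8k": 0.3942,
    "DROP": 0.4401,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed to be how the README averages)."""
    values = list(values)
    return sum(values) / len(values)


# Round to four decimals to match the precision used in the table.
total_average = round(mean_score(scores.values()), 4)
print(total_average)
```

Under that assumption the computed value agrees with the 0.6503 reported in the table.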