Weyaxi committed on
Commit
f60edf6
1 Parent(s): 7e180a1

add eval results

  1. README.md +12 -0
README.md CHANGED
@@ -56,4 +56,16 @@ dtype: float16
 
  ```
 
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_PulsarAI__OpenHermes-2.5-neural-chat-v3-2-Slerp)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 70.2 |
+ | ARC (25-shot) | 67.49 |
+ | HellaSwag (10-shot) | 85.42 |
+ | MMLU (5-shot) | 64.13 |
+ | TruthfulQA (0-shot) | 61.05 |
+ | Winogrande (5-shot) | 80.3 |
+ | GSM8K (5-shot) | 63.08 |
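
As a sanity check on the added table, the `Avg.` row can be recomputed from the six benchmark rows. The sketch below assumes `Avg.` is the unweighted mean of those six scores; the diff itself does not state how it was derived. The names `scores` and `leaderboard_average` are hypothetical and used only for illustration.

```python
# Benchmark scores copied from the table added in this commit.
scores = {
    "ARC (25-shot)": 67.49,
    "HellaSwag (10-shot)": 85.42,
    "MMLU (5-shot)": 64.13,
    "TruthfulQA (0-shot)": 61.05,
    "Winogrande (5-shot)": 80.3,
    "GSM8K (5-shot)": 63.08,
}


def leaderboard_average(values):
    """Return the unweighted mean of the scores (assumed definition of 'Avg.')."""
    values = list(values)
    return sum(values) / len(values)


average = leaderboard_average(scores.values())
print(f"Unweighted mean of the six benchmarks: {average:.3f}")
print(f"Rounded to one decimal place: {round(average, 1)}")
```

Under that assumption the mean is about 70.245, which agrees with the reported `70.2` when rounded to one decimal place.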