leaderboard-pr-bot committed on
Commit
8271a07
1 Parent(s): 41a3227

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +16 -3
README.md CHANGED
@@ -1,9 +1,9 @@
 ---
-base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
-library_name: peft
 license: llama3.1
+library_name: peft
 tags:
 - generated_from_trainer
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 model-index:
 - name: flippa-v6
   results: []
@@ -70,4 +70,17 @@ The following hyperparameters were used during training:
 - Transformers 4.44.2
 - Pytorch 2.3.1+cu121
 - Datasets 2.21.0
-- Tokenizers 0.19.1
+- Tokenizers 0.19.1
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_carsenk__flippa-v6)
+
+| Metric            |Value|
+|-------------------|----:|
+|Avg.               |20.55|
+|IFEval (0-Shot)    |34.39|
+|BBH (3-Shot)       |29.99|
+|MATH Lvl 5 (4-Shot)|12.69|
+|GPQA (0-shot)      | 5.70|
+|MuSR (0-shot)      |10.88|
+|MMLU-PRO (5-shot)  |29.64|
+
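
The `Avg.` row in the added table is consistent with an unweighted mean of the six benchmark scores. The snippet below is a minimal sketch of that check, assuming a plain arithmetic mean (the PR itself does not state how the average is computed); the `scores` dictionary and the `average` helper are illustrative names, not part of the leaderboard tooling.

```python
# Scores copied from the table added to README.md in this commit.
scores = {
    "IFEval (0-Shot)": 34.39,
    "BBH (3-Shot)": 29.99,
    "MATH Lvl 5 (4-Shot)": 12.69,
    "GPQA (0-shot)": 5.70,
    "MuSR (0-shot)": 10.88,
    "MMLU-PRO (5-shot)": 29.64,
}


def average(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the PR)."""
    values = list(values)
    return sum(values) / len(values)


avg = average(scores.values())
print(f"Avg. = {avg:.2f}")
```

Rounded to two decimals, the result matches the 20.55 reported in the `Avg.` row.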