metadata
license: apache-2.0

This model is instruction-finetuned on the Open-Platypus dataset: https://huggingface.co/datasets/garage-bAInd/Open-Platypus

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 53.64 |
| ARC (25-shot)       | 62.37 |
| HellaSwag (10-shot) | 85.08 |
| MMLU (5-shot)       | 63.79 |
| TruthfulQA (0-shot) | 47.33 |
| Winogrande (5-shot) | 77.66 |
| GSM8K (5-shot)      | 17.29 |
| DROP (3-shot)       | 21.93 |
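
The reported Avg. value is consistent with an unweighted mean of the seven benchmark scores listed above. The following is a minimal sketch of that check, assuming a plain arithmetic mean; the names `scores` and `leaderboard_average` are illustrative and not part of any leaderboard tooling.

```python
# Benchmark scores copied from the results listed above.
# The dictionary name and helper function are hypothetical, for illustration only.
scores = {
    "ARC (25-shot)": 62.37,
    "HellaSwag (10-shot)": 85.08,
    "MMLU (5-shot)": 63.79,
    "TruthfulQA (0-shot)": 47.33,
    "Winogrande (5-shot)": 77.66,
    "GSM8K (5-shot)": 17.29,
    "DROP (3-shot)": 21.93,
}


def leaderboard_average(results):
    """Return the unweighted mean of the scores, rounded to two decimals.

    Assumption: every benchmark carries equal weight in the average.
    """
    return round(sum(results.values()) / len(results), 2)


if __name__ == "__main__":
    print(leaderboard_average(scores))
```

Under that equal-weight assumption, the result matches the Avg. row, which suggests that the low GSM8K and DROP scores pull the overall average well below the other five benchmarks.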