---
license: llama2
datasets:
  - WizardLM/WizardLM_evol_instruct_V2_196k
language:
  - en
pipeline_tag: conversational
---

LLaMA 2 Wizard 70B QLoRA

Fine-tuned on the WizardLM/WizardLM_evol_instruct_V2_196k dataset.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 65.4 |
| ARC (25-shot) | 67.58 |
| HellaSwag (10-shot) | 87.52 |
| MMLU (5-shot) | 69.11 |
| TruthfulQA (0-shot) | 61.79 |
| Winogrande (5-shot) | 82.32 |
| GSM8K (5-shot) | 30.48 |
| DROP (3-shot) | 59.03 |
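
As a quick sanity check, the reported average appears to be the unweighted mean of the seven benchmark scores listed above. The sketch below assumes exactly that; the variable names are illustrative and not part of any leaderboard tooling.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 67.58,
    "HellaSwag (10-shot)": 87.52,
    "MMLU (5-shot)": 69.11,
    "TruthfulQA (0-shot)": 61.79,
    "Winogrande (5-shot)": 82.32,
    "GSM8K (5-shot)": 30.48,
    "DROP (3-shot)": 59.03,
}

# Assumption: "Avg." is the plain arithmetic mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.1f}")
```

Under this assumption the computed mean rounds to the 65.4 reported as Avg.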