license: apache-2.0
language:
  - en
  - de
  - es
  - fr
tags:
  - sft
inference: false
datasets:
  - OpenAssistant/oasst1

Open-Assistant Llama2 70B SFT OASST

This model is a fine-tuned version of the Llama2 70B LLM. It was trained on a mixture of OASST top-1 threads.

Model Details

  • Finetuned from: Llama2 70B
  • Model type: Causal decoder-only transformer language model
  • Language: English, German, Spanish, French (with limited capabilities in Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish)
  • License: Apache 2.0
  • Contact: Open-Assistant Discord

Prompting

Two special tokens are used to mark the beginning of user and assistant turns: <|prompter|> and <|assistant|>. Each turn ends with a </s> token.

Input prompt example:

<|prompter|>What is a meme, and what's the history behind this word?</s><|assistant|>

The input ends with the <|assistant|> token to signal that the model should start generating the assistant reply.
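
The format above can be sketched in a few lines of Python. This is a minimal illustration and not code from the Open-Assistant project: the `build_prompt` helper and the `(role, text)` turn representation are hypothetical names chosen for the example. It closes every turn with `</s>` and appends a trailing `<|assistant|>` token, as described above.

```python
# Special tokens as described in the Prompting section.
PROMPTER_TOKEN = "<|prompter|>"
ASSISTANT_TOKEN = "<|assistant|>"
END_OF_TURN = "</s>"

# Hypothetical helper (not part of any library): maps role names to tokens.
ROLE_TOKENS = {"prompter": PROMPTER_TOKEN, "assistant": ASSISTANT_TOKEN}


def build_prompt(turns):
    """Build a prompt string from a list of (role, text) pairs.

    Each turn is wrapped as <role token> + text + </s>. The result ends
    with <|assistant|> so that the model generates the next assistant reply.
    """
    parts = []
    for role, text in turns:
        if role not in ROLE_TOKENS:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{ROLE_TOKENS[role]}{text}{END_OF_TURN}")
    parts.append(ASSISTANT_TOKEN)
    return "".join(parts)


prompt = build_prompt(
    [("prompter", "What is a meme, and what's the history behind this word?")]
)
print(prompt)
```

For a single user turn this reproduces the input prompt example shown above. Earlier assistant replies can be passed as `("assistant", text)` turns to build a multi-turn prompt.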

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 57.11 |
| ARC (25-shot)       | 67.66 |
| HellaSwag (10-shot) | 87.24 |
| MMLU (5-shot)       | 69.95 |
| TruthfulQA (0-shot) | 51.28 |
| Winogrande (5-shot) | 84.14 |
| GSM8K (5-shot)      | 32.75 |
| DROP (3-shot)       | 6.73  |
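
The reported average is consistent with an unweighted mean of the seven benchmark scores listed here. The short check below is only a sketch under that assumption (the `scores` dictionary is an illustrative name); the scores are copied from the results above.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 67.66,
    "HellaSwag (10-shot)": 87.24,
    "MMLU (5-shot)": 69.95,
    "TruthfulQA (0-shot)": 51.28,
    "Winogrande (5-shot)": 84.14,
    "GSM8K (5-shot)": 32.75,
    "DROP (3-shot)": 6.73,
}

# Assumption: "Avg." is the plain arithmetic mean of the seven scores.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 57.11
```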