metadata
language:
- en
license: other
library_name: transformers
tags:
- orpo
- llama 3
datasets:
- mlabonne/orpo-dpo-mix-40k
OrpoLlama-3-8B
This is a quick fine-tune of meta-llama/Meta-Llama-3-8B on 1k samples of mlabonne/orpo-dpo-mix-40k created for this article.
It's not very good at the moment (it's the sassiest model ever), but I'm currently training a version on the entire dataset.
Try the demo: https://huggingface.co/spaces/mlabonne/OrpoLlama-3-8B
Evaluation
Nous
Evaluation was performed using LLM AutoEval; see the entire leaderboard here.
Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
---|---|---|---|---|---|
teknium/OpenHermes-2.5-Mistral-7B | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |
meta-llama/Meta-Llama-3-8B-Instruct | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
mistralai/Mistral-7B-Instruct-v0.1 | 49.15 | 33.36 | 67.87 | 55.89 | 39.48 |
mlabonne/OrpoLlama-3-8B | 46.76 | 31.56 | 70.19 | 48.11 | 37.17 |
meta-llama/Meta-Llama-3-8B | 45.42 | 31.10 | 69.95 | 43.91 | 36.70 |
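The Average column appears to be the plain arithmetic mean of the four benchmark scores. That is an inference from the numbers in the table, not something the card states. A minimal sketch to check it, and to see how much the ORPO fine-tune gains over its base model (the helper name `mean_score` is made up for illustration):

```python
# Scores copied from the table above, in the order
# (AGIEval, GPT4All, TruthfulQA, Bigbench).
scores = {
    "teknium/OpenHermes-2.5-Mistral-7B": (42.75, 72.99, 52.99, 40.94),
    "meta-llama/Meta-Llama-3-8B-Instruct": (41.22, 69.86, 51.65, 42.64),
    "mistralai/Mistral-7B-Instruct-v0.1": (33.36, 67.87, 55.89, 39.48),
    "mlabonne/OrpoLlama-3-8B": (31.56, 70.19, 48.11, 37.17),
    "meta-llama/Meta-Llama-3-8B": (31.10, 69.95, 43.91, 36.70),
}

# Values of the "Average" column as reported in the table.
reported_average = {
    "teknium/OpenHermes-2.5-Mistral-7B": 52.42,
    "meta-llama/Meta-Llama-3-8B-Instruct": 51.34,
    "mistralai/Mistral-7B-Instruct-v0.1": 49.15,
    "mlabonne/OrpoLlama-3-8B": 46.76,
    "meta-llama/Meta-Llama-3-8B": 45.42,
}


def mean_score(values):
    """Arithmetic mean of a tuple of benchmark scores (assumed aggregation)."""
    return sum(values) / len(values)


for name, values in scores.items():
    computed = mean_score(values)
    print(f"{name}: computed {computed:.4f}, reported {reported_average[name]:.2f}")

# Difference between the ORPO fine-tune and the base model it started from.
gain = mean_score(scores["mlabonne/OrpoLlama-3-8B"]) - mean_score(
    scores["meta-llama/Meta-Llama-3-8B"]
)
print(f"Gain over base model: {gain:.2f} points")
```

Under that assumption, the computed means agree with the reported column to within rounding, and the fine-tune sits a little over one point above the base model while remaining below the Instruct variant.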
Training curves
Usage
# Notebook cell; drop the leading "!" when running in a shell
!pip install -qU transformers accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "mlabonne/OrpoLlama-3-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and place it on the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion of up to 256 new tokens
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
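By default, the `generated_text` returned by a text-generation pipeline for a string prompt contains the prompt followed by the completion. If only the model's answer is wanted, the prompt prefix can be removed afterwards. The sketch below uses a hypothetical helper, `strip_prompt`, and made-up example strings in place of a real model call:

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the completion when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # Leave the text untouched if the prompt is not a prefix.
    return generated_text


# Hypothetical strings standing in for `prompt` and
# `outputs[0]["generated_text"]` from the usage snippet.
example_prompt = "User: What is a large language model?\nAssistant:"
example_output = example_prompt + " A large language model is a neural network trained on text."

completion = strip_prompt(example_output, example_prompt)
print(completion)
```

With the real pipeline, the call would be `strip_prompt(outputs[0]["generated_text"], prompt)`. Passing `return_full_text=False` to the pipeline call should give the same result without post-processing.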