
	---
	language:
	- en
	license: other
	library_name: transformers
	tags:
	- orpo
	- llama 3
	datasets:
	- mlabonne/orpo-dpo-mix-40k
	---

	# OrpoLlama-3-8B

	![](https://i.imgur.com/ZHwzQvI.png)

	This is a quick fine-tune of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on 1k samples of [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) created for [this article](https://huggingface.co/blog/mlabonne/orpo-llama-3).

	It's not very good at the moment (it's the sassiest model ever), but I'm currently training a version on the entire dataset.

	Try the demo: https://huggingface.co/spaces/mlabonne/OrpoLlama-3-8B

	## 🏆 Evaluation

	### Nous

	Evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval). See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).

	| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
	| --- | ---: | ---: | ---: | ---: | ---: |
	| [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) [📄](https://gist.github.com/mlabonne/88b21dd9698ffed75d6163ebdc2f6cc8) | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |
	| [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) [📄](https://gist.github.com/mlabonne/8329284d86035e6019edb11eb0933628) | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
	| [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) [📄](https://gist.github.com/mlabonne/7a0446c3d30dfce72834ef780491c4b2) | 49.15 | 33.36 | 67.87 | 55.89 | 39.48 |
	| [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B) [📄](https://gist.github.com/mlabonne/f41dad371d1781d0434a4672fd6f0b82) | 46.76 | 31.56 | 70.19 | 48.11 | 37.17 |
	| [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [📄](https://gist.github.com/mlabonne/616b6245137a9cfc4ea80e4c6e55d847) | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |
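
The Average column appears to be the unweighted mean of the four benchmark scores. The card does not state this, so treat it as an assumption. The sketch below recomputes the mean from the numbers in the table. `scores` and `mean_score` are illustrative names, not part of LLM AutoEval.

```python
# Values copied from the table: reported Average, then
# [AGIEval, GPT4All, TruthfulQA, Bigbench].
scores = {
    "teknium/OpenHermes-2.5-Mistral-7B": (52.42, [42.75, 72.99, 52.99, 40.94]),
    "meta-llama/Meta-Llama-3-8B-Instruct": (51.34, [41.22, 69.86, 51.65, 42.64]),
    "mistralai/Mistral-7B-Instruct-v0.1": (49.15, [33.36, 67.87, 55.89, 39.48]),
    "mlabonne/OrpoLlama-3-8B": (46.76, [31.56, 70.19, 48.11, 37.17]),
    "meta-llama/Meta-Llama-3-8B": (45.42, [31.1, 69.95, 43.91, 36.7]),
}


def mean_score(values):
    """Unweighted mean of the benchmark scores (assumed averaging rule)."""
    return sum(values) / len(values)


for name, (reported, parts) in scores.items():
    # The recomputed mean should agree with the reported value up to rounding.
    print(f"{name}: reported {reported}, recomputed {mean_score(parts):.4f}")
```

Under that assumption, every recomputed mean matches the reported Average to within rounding.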

	## 📈 Training curves

	![](https://i.imgur.com/r78hGrl.png)

	## 💻 Usage

	```python
	# Notebook-only shell command; in a terminal, run: pip install -qU transformers accelerate
	!pip install -qU transformers accelerate

	import torch
	import transformers
	from transformers import AutoTokenizer

	model = "mlabonne/OrpoLlama-3-8B"
	messages = [{"role": "user", "content": "What is a large language model?"}]

	# Format the conversation as a prompt string using the model's chat template
	tokenizer = AutoTokenizer.from_pretrained(model)
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

	# Load the model in half precision and place it on the available devices
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model,
	    torch_dtype=torch.float16,
	    device_map="auto",
	)

	# Sample up to 256 new tokens
	outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
	print(outputs[0]["generated_text"])
	```
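
To give a rough idea of what `temperature`, `top_k`, and `top_p` do in the call above, here is a toy, pure-Python sketch of one sampling step. It is a simplified illustration, not the code `transformers` actually runs. `sample_next_token` and its arguments are hypothetical names introduced for this example.

```python
import math
import random


def sample_next_token(logits, temperature=0.7, top_k=50, top_p=0.95, rng=None):
    """Toy temperature + top-k + top-p (nucleus) sampling over a list of logits."""
    rng = rng or random.Random()

    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token ids, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the surviving tokens (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept_ids, kept_probs, cumulative = [], [], 0.0
    for token_id, p in zip(ranked, probs):
        kept_ids.append(token_id)
        kept_probs.append(p)
        cumulative += p
        if cumulative >= top_p:
            break

    # Draw one token id from the filtered distribution.
    return rng.choices(kept_ids, weights=kept_probs, k=1)[0]


# Usage: with top_k=1 the step degenerates to greedy decoding.
print(sample_next_token([1.0, 5.0, 2.0], top_k=1))
```

Lowering `temperature`, `top_k`, or `top_p` makes the output more deterministic, and raising them makes it more varied.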