
tulu-2-dpo-13b-ExPO

This is the extrapolated (ExPO) model based on allenai/tulu-2-dpo-13b and allenai/tulu-2-13b, following the "Weak-to-Strong Extrapolation Expedites Alignment" paper.

Specifically, we obtain this model by extrapolating (alpha = 0.5) from the weights of the SFT checkpoint (allenai/tulu-2-13b) and the DPO/RLHF checkpoint (allenai/tulu-2-dpo-13b), which improves alignment with human preference as measured on the benchmarks below.
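
The extrapolation step can be sketched as follows. This is a minimal illustration and not the official implementation: it assumes the update rule theta_expo = theta_dpo + alpha * (theta_dpo - theta_sft), which is our reading of the paper and is not spelled out in this card. The function name `extrapolate_weights` and the toy parameter names are hypothetical.

```python
from typing import Dict, TypeVar

# Any value type that supports +, - and multiplication by a float
# (plain floats here; NumPy arrays or PyTorch tensors should behave the same).
T = TypeVar("T")


def extrapolate_weights(sft: Dict[str, T], dpo: Dict[str, T], alpha: float = 0.5) -> Dict[str, T]:
    """Assumed ExPO rule: theta_expo = theta_dpo + alpha * (theta_dpo - theta_sft)."""
    if sft.keys() != dpo.keys():
        raise ValueError("checkpoints must have identical parameter names")
    # Move each parameter further along the SFT -> DPO direction.
    return {name: dpo[name] + alpha * (dpo[name] - sft[name]) for name in dpo}


# Toy example: scalars stand in for the real weight tensors.
sft_weights = {"layer.weight": 1.0, "layer.bias": 0.0}
dpo_weights = {"layer.weight": 2.0, "layer.bias": -1.0}
expo_weights = extrapolate_weights(sft_weights, dpo_weights, alpha=0.5)
print(expo_weights)
```

With alpha = 0 the function returns the DPO weights unchanged, so alpha controls how far past the DPO checkpoint the result moves. For real checkpoints the same dictionary comprehension would presumably be applied to the two models' state dictionaries.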

Evaluation Results

Evaluation results on the AlpacaEval 2.0 benchmark (you can find the evaluation outputs on the official GitHub repo):

| Model | Win Rate (Ori) | LC Win Rate (Ori) | Win Rate (+ ExPO) | LC Win Rate (+ ExPO) |
|---|---|---|---|---|
| HuggingFaceH4/zephyr-7b-alpha | 6.7% | 10.0% | 10.6% | 13.6% |
| HuggingFaceH4/zephyr-7b-beta | 10.2% | 13.2% | 11.1% | 14.0% |
| berkeley-nest/Starling-LM-7B-alpha | 15.0% | 18.3% | 18.2% | 19.5% |
| Nexusflow/Starling-LM-7B-beta | 26.6% | 25.8% | 29.6% | 26.4% |
| snorkelai/Snorkel-Mistral-PairRM | 24.7% | 24.0% | 28.8% | 26.4% |
| RLHFlow/LLaMA3-iterative-DPO-final | 29.2% | 36.0% | 32.7% | 37.8% |
| internlm/internlm2-chat-1.8b | 3.8% | 4.0% | 5.2% | 4.3% |
| internlm/internlm2-chat-7b | 20.5% | 18.3% | 28.1% | 22.7% |
| internlm/internlm2-chat-20b | 36.1% | 24.9% | 46.2% | 27.2% |
| allenai/tulu-2-dpo-7b | 8.5% | 10.2% | 11.5% | 11.7% |
| allenai/tulu-2-dpo-13b | 11.2% | 15.5% | 15.6% | 17.6% |
| allenai/tulu-2-dpo-70b | 15.4% | 21.2% | 23.0% | 25.7% |

Evaluation results on the MT-Bench benchmark (you can find the evaluation outputs on the official GitHub repo):

| Model | Original | + ExPO |
|---|---|---|
| HuggingFaceH4/zephyr-7b-alpha | 6.85 | 6.87 |
| HuggingFaceH4/zephyr-7b-beta | 7.02 | 7.06 |
| berkeley-nest/Starling-LM-7B-alpha | 7.82 | 7.91 |
| Nexusflow/Starling-LM-7B-beta | 8.10 | 8.18 |
| snorkelai/Snorkel-Mistral-PairRM | 7.63 | 7.69 |
| RLHFlow/LLaMA3-iterative-DPO-final | 8.08 | 8.45 |
| internlm/internlm2-chat-1.8b | 5.17 | 5.26 |
| internlm/internlm2-chat-7b | 7.72 | 7.80 |
| internlm/internlm2-chat-20b | 8.13 | 8.26 |
| allenai/tulu-2-dpo-7b | 6.35 | 6.38 |
| allenai/tulu-2-dpo-13b | 7.00 | 7.26 |
| allenai/tulu-2-dpo-70b | 7.79 | 8.03 |

Model size: 13B params. Tensor type: BF16 (Safetensors).