---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
---
traversaal-2.5-Mistral-7B is trained via Direct Preference Optimization (DPO) from teknium/OpenHermes-2.5-Mistral-7B as its base model, with several hyperparameter optimizations.
teknium/OpenHermes-2.5-Mistral-7B was itself trained via Supervised Fine-Tuning (SFT), with Mistral-7B as its base model.
Note that no form of weight merging was used.
For the leaderboard submission, the trained weights were realigned for compatibility with Mistral-7B.
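
For context, below is a minimal sketch of what a DPO run with the TRL library's `DPOTrainer` on Intel/orca_dpo_pairs could look like. This is an illustration under assumptions, not the exact training script used for this model: the hyperparameter values, the prompt/column remapping, and the output directory name are placeholders, and the exact trainer arguments depend on the installed `trl` version.

```python
# Illustrative DPO sketch (not the exact recipe used for this model).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs provides system/question/chosen/rejected columns;
# DPOTrainer expects prompt/chosen/rejected, so remap them (assumed mapping).
def to_dpo_format(example):
    return {
        "prompt": example["system"] + "\n" + example["question"],
        "chosen": example["chosen"],
        "rejected": example["rejected"],
    }

dataset = load_dataset("Intel/orca_dpo_pairs", split="train").map(to_dpo_format)

config = DPOConfig(
    output_dir="dpo-openhermes-mistral-7b",  # placeholder
    beta=0.1,                                # placeholder preference strength
    per_device_train_batch_size=2,           # placeholder
    learning_rate=5e-6,                      # placeholder
    num_train_epochs=1,                      # placeholder
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # older trl versions pass `tokenizer=` instead
)
trainer.train()
```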