---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
---

traversaal-2.5-Mistral-7B is trained via Direct Preference Optimization (DPO) from teknium/OpenHermes-2.5-Mistral-7B as its base model, with several optimizations in hyperparameters.
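
The DPO step mentioned above optimizes a policy directly on preference pairs (such as those in Intel/orca_dpo_pairs) against a frozen reference model. As a hedged illustration only (the `beta` value and log-probabilities below are made up, not this model's actual hyperparameters), the per-pair loss can be sketched as:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of the chosen/rejected
    response under the trainable policy or the frozen reference model.
    beta controls how far the policy may drift from the reference.
    """
    # Log-ratio of policy vs. reference for each response
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # DPO objective: -log sigmoid(beta * (chosen_ratio - rejected_ratio))
    margin = beta * (chosen_logratio - rejected_logratio)
    return math.log(1.0 + math.exp(-margin))

# If the policy favors the chosen response more than the reference does,
# the margin is positive and the loss drops below log(2) (the zero-margin value).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0, beta=0.1)
```

In practice a library such as `trl` performs this over batches of tokenized pairs; the sketch only shows the shape of the objective.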

teknium/OpenHermes-2.5-Mistral-7B is trained via Supervised Fine-Tuning (SFT) using LoRA, with Mistral-7B as its base model.
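
A LoRA adapter leaves the base weight frozen and learns a low-rank correction that is added to the layer's output. As a minimal, dependency-free sketch (the shapes and scaling here are illustrative, not this model's actual adapter configuration):

```python
def lora_forward(x, W, A, B, alpha=16.0):
    """Forward pass of a LoRA-adapted linear layer (no bias).

    W is the frozen d_out x d_in base weight; B (d_out x r) and
    A (r x d_in) are the trainable low-rank update, scaled by alpha / r.
    Plain-Python matrices (lists of lists) keep the sketch self-contained.
    """
    r = len(A)
    scale = alpha / r

    def matvec(M, v):
        return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

    base = matvec(W, x)                 # frozen path: W @ x
    low_rank = matvec(B, matvec(A, x))  # adapter path: B @ (A @ x)
    return [b + scale * l for b, l in zip(base, low_rank)]

# Rank-1 adapter on a 2x2 identity layer; with A and B zeroed
# the adapter is inactive and the output equals W @ x.
x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.0, 0.0]]        # r x d_in
B = [[0.0], [0.0]]      # d_out x r
out = lora_forward(x, W, A, B)  # [1.0, 2.0]
```

Because only A and B are trained, the update can later be folded into W or kept as a separate adapter; this card states that no weight merging was used.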

Note that we did not use any form of weight merging.

For the leaderboard submission, the trained weights were realigned for compatibility with Mistral-7B.