Update README.md
README.md CHANGED
@@ -9,5 +9,50 @@ tags:
- dpo
---

# Model Details

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64aac16fd4a402e8dce11ebe/ERb9aFX_yeDlmqqnvQHF_.png)

# Model Description

This model is based on teknium/OpenHermes-2.5-Mistral-7B, DPO fine-tuned with the H4rmony_dpo dataset.
Its completions should be more ecologically aware than the base model.

Developed by: Jorge Vallego
Funded by: Neovalle Ltd.
Shared by: [email protected]
Model type: mistral
Language(s) (NLP): Primarily English
License: MIT
Finetuned from model: teknium/OpenHermes-2.5-Mistral-7B
Methodology: DPO

# Uses

Intended as a proof of concept to show the effects of the H4rmony_dpo dataset when used for DPO fine-tuning.

# Direct Use

For testing purposes, to gain insights that help with the continuous improvement of the H4rmony_dpo dataset.

# Downstream Use

Direct use of this model in applications is not recommended, as it is under testing for a specific task only (ecological alignment).

# Out-of-Scope Use

Not meant to be used for anything other than testing and evaluation of the H4rmony_dpo dataset and ecological alignment.

# Bias, Risks, and Limitations

This model might reproduce biases already present in the base model, as well as others unintentionally introduced during fine-tuning.

# How to Get Started with the Model
It can be loaded and run in a Colab instance with a High-RAM runtime.
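
A minimal loading and generation sketch with the `transformers` library is shown below. The repository id is a placeholder for this model's Hub id, half precision with `device_map="auto"` is an assumption to keep the 7B weights within a Colab High-RAM/GPU runtime, and the chat formatting follows the ChatML template inherited from the OpenHermes-2.5 base model.

```python
# Minimal loading/generation sketch (assumes a Colab High-RAM or GPU runtime).
# MODEL_ID is a placeholder; replace it with this repository's actual Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "<this-model-repo-id>"  # placeholder, not a real repository name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision keeps the 7B weights within Colab memory
    device_map="auto",
)

# The OpenHermes-2.5 base model uses ChatML; the tokenizer's chat template applies it.
messages = [
    {"role": "user", "content": "What should I consider before replacing my garden lawn?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```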

# Training Details

Trained using DPO (Direct Preference Optimization) on the H4rmony_dpo preference dataset.
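
As a rough illustration of the method (not the actual training script or hyperparameters used for this model), a DPO run over the H4rmony_dpo pairs could be set up with the `trl` library roughly as follows. The hyperparameters are illustrative only, the dataset is assumed to expose the prompt/chosen/rejected columns `DPOTrainer` expects, and argument names differ slightly across `trl` versions.

```python
# Schematic DPO fine-tuning sketch with trl (illustrative, not this model's training script).
# Assumes H4rmony_dpo provides the prompt/chosen/rejected columns DPOTrainer expects.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

BASE_ID = "teknium/OpenHermes-2.5-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID)

train_dataset = load_dataset("neovalle/H4rmony_dpo", split="train")

config = DPOConfig(
    output_dir="openhermes-2.5-h4rmony-dpo",  # illustrative output path
    beta=0.1,                                 # strength of the KL constraint toward the base model
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # renamed to `processing_class` in newer trl releases
)
trainer.train()  # a reference copy of the base model is created internally when ref_model is None
```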

# Training Data

H4rmony Dataset - https://huggingface.co/datasets/neovalle/H4rmony_dpo
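
To inspect the preference pairs directly, the dataset can be loaded with the `datasets` library; the split name and column layout below are assumptions, so check the dataset card for the exact schema.

```python
# Quick look at the DPO preference pairs (split name assumed; see the dataset card).
from datasets import load_dataset

h4rmony = load_dataset("neovalle/H4rmony_dpo", split="train")
print(h4rmony)      # column names and number of preference pairs
print(h4rmony[0])   # one prompt with its preferred and rejected completions
```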