neovalle committed on
Commit 4297946
1 Parent(s): a25ce91

Update README.md

Files changed (1)
  1. README.md +46 -1
README.md CHANGED
@@ -9,5 +9,50 @@ tags:
- dpo
---

# Model Details

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64aac16fd4a402e8dce11ebe/ERb9aFX_yeDlmqqnvQHF_.png)

# Model Description

This model is based on teknium/OpenHermes-2.5-Mistral-7B, DPO fine-tuned with the H4rmony_dpo dataset.
Its completions should be more ecologically aware than those of the base model.

Developed by: Jorge Vallego
Funded by: Neovalle Ltd.
Shared by: [email protected]
Model type: mistral
Language(s) (NLP): Primarily English
License: MIT
Finetuned from model: teknium/OpenHermes-2.5-Mistral-7B
Methodology: DPO

# Uses

Intended as a proof of concept (PoC) to show the effects of the H4rmony_dpo dataset with DPO fine-tuning.

# Direct Use

For testing purposes, to gain insights that help with the continuous improvement of the H4rmony_dpo dataset.

# Downstream Use

Its direct use in applications is not recommended, as this model is being tested for a specific task only (ecological alignment).

# Out-of-Scope Use

Not meant to be used for anything other than testing and evaluation of the H4rmony_dpo dataset and ecological alignment.

# Bias, Risks, and Limitations

This model might produce biased completions already present in the base model, as well as others unintentionally introduced during fine-tuning.

# How to Get Started with the Model

It can be loaded and run in a Colab instance with High RAM.
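
The card does not include example code, so the following is only a minimal sketch of loading the model with the Hugging Face Transformers library and generating a completion. The repository id is a placeholder (substitute this model's actual id), and the prompt and generation settings are purely illustrative.

```python
# Sketch only: load the DPO-fine-tuned model and generate a completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "neovalle/<this-model-repo>"  # placeholder - replace with this repository's id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to fit a Colab GPU more easily
    device_map="auto",          # requires the accelerate package
)

# OpenHermes-2.5 models use ChatML-style prompts; the tokenizer's chat template handles the formatting.
messages = [{"role": "user", "content": "What should I do with my old phone battery?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```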

# Training Details

Trained using DPO (Direct Preference Optimization).
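
The card does not publish the training script. The snippet below is only a rough sketch of what a DPO run over H4rmony_dpo could look like with the trl library; the hyperparameters are illustrative, the dataset is assumed to expose the standard prompt/chosen/rejected columns, and the exact trainer arguments vary slightly between trl versions.

```python
# Illustrative sketch of DPO fine-tuning, not the authors' actual configuration.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

BASE_ID = "teknium/OpenHermes-2.5-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID)

# H4rmony_dpo is assumed to provide "prompt", "chosen" and "rejected" columns,
# the format DPOTrainer expects for preference pairs.
train_dataset = load_dataset("neovalle/H4rmony_dpo", split="train")

config = DPOConfig(
    output_dir="openhermes-h4rmony-dpo",
    beta=0.1,                        # strength of the preference penalty
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                  # trl builds a frozen reference copy when None
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,      # named "tokenizer" in older trl releases
)
trainer.train()
```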

# Training Data

H4rmony_dpo dataset - https://huggingface.co/datasets/neovalle/H4rmony_dpo
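
For a quick look at the training data, the dataset can be pulled with the datasets library; field names and splits are whatever the dataset card defines, so inspect them rather than assuming.

```python
# Peek at the H4rmony_dpo preference data.
from datasets import load_dataset

ds = load_dataset("neovalle/H4rmony_dpo", split="train")
print(ds)      # row count and column names
print(ds[0])   # a single preference example
```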