lole25 committed on
Commit
1106fd7
1 Parent(s): d1bd63c

Update README.md

Files changed (1): README.md +18 -1
README.md CHANGED
@@ -1,5 +1,22 @@
 ---
+library_name: peft
+tags:
+- alignment-handbook
+- generated_from_trainer
+- trl
+- dpo
+base_model: DUAL-GPO/zephyr-7b-dpo-new-lora-v1-merged
+datasets:
+- HuggingFaceH4/ultrafeedback_binarized
+model-index:
+- name: zephyr-7b-dpo-0k-15k-i1
+  results: []
 license: apache-2.0
 ---
 
-Training Zephyr-7B with DPO.
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# zephyr-7b-dpo-0k-15k-i1
+
+This model is a fine-tuned version of [DUAL-GPO/zephyr-7b-dpo-new-lora-v1-merged](https://huggingface.co/DUAL-GPO/zephyr-7b-dpo-new-lora-v1-merged) on the HuggingFaceH4/ultrafeedback_binarized dataset.
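The updated card declares `library_name: peft`, so the checkpoint is a PEFT (LoRA) adapter rather than full model weights. A minimal sketch of loading it might look like the following; note that the adapter repo id used here is an assumption inferred from the model name and the base model's org in this diff, and is not stated anywhere in the commit:

```python
# Hypothetical usage sketch for the model card above.
# Assumes `peft` and `transformers` are installed. The adapter repo id below is a
# guess based on the card's model name and the base model's org; it is NOT
# confirmed by the diff.
ADAPTER_ID = "DUAL-GPO/zephyr-7b-dpo-0k-15k-i1"  # assumption, not from the diff


def load_adapter(adapter_id: str = ADAPTER_ID):
    """Load the LoRA adapter on top of its base model (downloads weights)."""
    # AutoPeftModelForCausalLM reads the adapter config and resolves the
    # base_model (DUAL-GPO/zephyr-7b-dpo-new-lora-v1-merged) automatically.
    from peft import AutoPeftModelForCausalLM

    return AutoPeftModelForCausalLM.from_pretrained(adapter_id)
```

Calling `load_adapter()` pulls both the base model and adapter weights, so it is only practical on a machine with enough memory for a 7B model.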