beamaia committed on
Commit
5f375f4
1 Parent(s): 52d2213

Model save

Files changed (1)
  1. README.md +99 -0
README.md ADDED
---
license: mit
library_name: peft
tags:
- trl
- kto
- generated_from_trainer
base_model: HuggingFaceH4/zephyr-7b-beta
model-index:
- name: WeniGPT-Agents-Zephyr-1.0.25-KTO
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# WeniGPT-Agents-Zephyr-1.0.25-KTO

This model is a fine-tuned version of [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5000
- Rewards/chosen: -195.8677
- Rewards/rejected: -165.2624
- Rewards/margins: -30.6053
- Kl: 0.0
- Logps/chosen: -2238.1643
- Logps/rejected: -1890.7997
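
The reward margin reported above is just the gap between the chosen and rejected rewards, which is easy to verify from the numbers in this card:

```python
# Final evaluation metrics from this card
rewards_chosen = -195.8677
rewards_rejected = -165.2624

# TRL's KTO trainer logs rewards/margins as chosen minus rejected
rewards_margins = round(rewards_chosen - rewards_rejected, 4)
print(rewards_margins)  # -30.6053
```

Note that the margin is negative, i.e. rejected completions received higher rewards than chosen ones at evaluation time; this is worth flagging when proofreading and completing this card.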

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 1470
- mixed_precision_training: Native AMP
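
For readers reproducing this run, the hyperparameters above map roughly onto TRL's `KTOConfig`. This is a hedged sketch, not the exact training script: the `output_dir` is a placeholder, and the dataset and LoRA adapter settings are not recorded in this card.

```python
from trl import KTOConfig

# Sketch only: fields mirror the hyperparameters listed above.
training_args = KTOConfig(
    output_dir="WeniGPT-Agents-Zephyr-1.0.25-KTO",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # 4 x 4 = total train batch size of 16
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=1470,
    seed=42,
    fp16=True,  # "Native AMP" mixed precision
)
```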

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/margins | Kl  | Logps/chosen | Logps/rejected |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:---------------:|:---:|:------------:|:--------------:|
| 0.6966        | 0.33  | 50   | 0.5063          | -13.4129       | -12.4081         | -1.0049         | 0.0 | -413.6161    | -362.2560      |
| 0.7540        | 0.66  | 100  | 0.5000          | -174.7515      | -145.9646        | -28.7869        | 0.0 | -2027.0018   | -1697.8218     |
| 0.6274        | 0.99  | 150  | 0.5000          | -195.8329      | -165.1599        | -30.6730        | 0.0 | -2237.8149   | -1889.7742     |
| 0.6420        | 1.32  | 200  | 0.5000          | -195.1430      | -164.6777        | -30.4653        | 0.0 | -2230.9163   | -1884.9520     |
| 0.6241        | 1.65  | 250  | 0.5000          | -195.1471      | -164.6848        | -30.4623        | 0.0 | -2230.9573   | -1885.0226     |
| 0.7477        | 1.98  | 300  | 0.5000          | -195.8677      | -165.2624        | -30.6053        | 0.0 | -2238.1643   | -1890.7997     |
| 0.8685        | 2.31  | 350  | 0.5000          | -195.8568      | -165.2519        | -30.6049        | 0.0 | -2238.0549   | -1890.6946     |
| 0.6930        | 2.64  | 400  | 0.5000          | -195.8341      | -165.2328        | -30.6013        | 0.0 | -2237.8274   | -1890.5028     |
| 0.6860        | 2.97  | 450  | 0.5000          | -195.8235      | -165.2227        | -30.6008        | 0.0 | -2237.7224   | -1890.4027     |
| 0.6119        | 3.30  | 500  | 0.5000          | -195.8122      | -165.2139        | -30.5983        | 0.0 | -2237.6084   | -1890.3141     |
| 0.5902        | 3.63  | 550  | 0.5000          | -195.8078      | -165.2129        | -30.5949        | 0.0 | -2237.5649   | -1890.3043     |
| 0.7106        | 3.96  | 600  | 0.5000          | -196.2488      | -165.5701        | -30.6787        | 0.0 | -2241.9751   | -1893.8765     |
| 0.8232        | 4.29  | 650  | 0.5000          | -196.2429      | -165.5582        | -30.6847        | 0.0 | -2241.9155   | -1893.7571     |
| 0.5881        | 4.62  | 700  | 0.5000          | -197.1647      | -166.3029        | -30.8618        | 0.0 | -2251.1340   | -1901.2047     |
| 0.6156        | 4.95  | 750  | 0.5000          | -197.1416      | -166.2842        | -30.8573        | 0.0 | -2250.9023   | -1901.0179     |
| 0.6291        | 5.28  | 800  | 0.5000          | -197.1509      | -166.2928        | -30.8580        | 0.0 | -2250.9958   | -1901.1036     |
| 0.6285        | 5.61  | 850  | 0.5000          | -197.1602      | -166.2982        | -30.8620        | 0.0 | -2251.0884   | -1901.1571     |
| 0.6918        | 5.94  | 900  | 0.5000          | -197.1623      | -166.3002        | -30.8621        | 0.0 | -2251.1104   | -1901.1774     |
| 0.7869        | 6.27  | 950  | 0.5000          | -197.1630      | -166.3040        | -30.8591        | 0.0 | -2251.1169   | -1901.2148     |
| 0.5483        | 6.60  | 1000 | 0.5000          | -197.1648      | -166.2998        | -30.8650        | 0.0 | -2251.1345   | -1901.1730     |
| 0.7744        | 6.93  | 1050 | 0.5000          | -197.5333      | -166.5969        | -30.9364        | 0.0 | -2254.8201   | -1904.1442     |
| 0.9077        | 7.26  | 1100 | 0.5000          | -197.5402      | -166.6008        | -30.9394        | 0.0 | -2254.8884   | -1904.1827     |
| 0.6640        | 7.59  | 1150 | 0.5000          | -197.2621      | -166.3788        | -30.8832        | 0.0 | -2252.1074   | -1901.9637     |
| 0.6126        | 7.92  | 1200 | 0.5000          | -197.2483      | -166.3705        | -30.8778        | 0.0 | -2251.9697   | -1901.8805     |
| 0.8377        | 8.25  | 1250 | 0.5000          | -197.1308      | -166.2760        | -30.8547        | 0.0 | -2250.7944   | -1900.9357     |
| 0.6109        | 8.58  | 1300 | 0.5000          | -197.1868      | -166.3199        | -30.8669        | 0.0 | -2251.3545   | -1901.3741     |
| 0.7432        | 8.91  | 1350 | 0.5000          | -197.2601      | -166.3793        | -30.8808        | 0.0 | -2252.0879   | -1901.9680     |
| 0.8664        | 9.24  | 1400 | 0.5000          | -197.1278      | -166.2694        | -30.8584        | 0.0 | -2250.7642   | -1900.8693     |
| 0.7237        | 9.57  | 1450 | 0.5000          | -197.1250      | -166.2689        | -30.8561        | 0.0 | -2250.7366   | -1900.8641     |

### Framework versions

- PEFT 0.10.0
- Transformers 4.38.2
- Pytorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2
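
To recreate a compatible environment, the versions above can be pinned directly. A sketch, assuming pip installs and a CUDA 11.8 wheel index (the index URL is an inference from the `+cu118` build tag, not recorded in this card):

```shell
pip install peft==0.10.0 transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2
pip install torch==2.1.0+cu118 --index-url https://download.pytorch.org/whl/cu118
```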