---
license: mit
library_name: peft
tags:
- trl
- kto
- generated_from_trainer
base_model: HuggingFaceH4/zephyr-7b-beta
model-index:
- name: WeniGPT-Agents-Zephyr-1.0.25-KTO
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# WeniGPT-Agents-Zephyr-1.0.25-KTO

This model is a fine-tuned version of [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5
- Rewards/chosen: -195.8677
- Rewards/rejected: -165.2624
- Rewards/margins: -30.6053
- Kl: 0.0
- Logps/chosen: -2238.1643
- Logps/rejected: -1890.7997
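
As a quick sanity check on the metrics above, the reported reward margin is simply the chosen reward minus the rejected reward; a negative margin means rejected completions currently score higher than chosen ones:

```python
# Rewards/margins = Rewards/chosen - Rewards/rejected, using the
# evaluation numbers reported above.
chosen = -195.8677
rejected = -165.2624
margin = round(chosen - rejected, 4)
print(margin)  # -30.6053
```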

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 1470
- mixed_precision_training: Native AMP
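
The total train batch size listed above follows from the per-device batch size and gradient accumulation (assuming a single device, which the card does not state):

```python
# total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
train_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption; the card does not report the device count
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 16
```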

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/margins | Kl | Logps/chosen | Logps/rejected |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:---------------:|:---:|:------------:|:--------------:|
| 0.6966 | 0.33 | 50 | 0.5063 | -13.4129 | -12.4081 | -1.0049 | 0.0 | -413.6161 | -362.2560 |
| 0.754 | 0.66 | 100 | 0.5000 | -174.7515 | -145.9646 | -28.7869 | 0.0 | -2027.0018 | -1697.8218 |
| 0.6274 | 0.99 | 150 | 0.5000 | -195.8329 | -165.1599 | -30.6730 | 0.0 | -2237.8149 | -1889.7742 |
| 0.642 | 1.32 | 200 | 0.5000 | -195.1430 | -164.6777 | -30.4653 | 0.0 | -2230.9163 | -1884.9520 |
| 0.6241 | 1.65 | 250 | 0.5000 | -195.1471 | -164.6848 | -30.4623 | 0.0 | -2230.9573 | -1885.0226 |
| 0.7477 | 1.98 | 300 | 0.5000 | -195.8677 | -165.2624 | -30.6053 | 0.0 | -2238.1643 | -1890.7997 |
| 0.8685 | 2.31 | 350 | 0.5000 | -195.8568 | -165.2519 | -30.6049 | 0.0 | -2238.0549 | -1890.6946 |
| 0.693 | 2.64 | 400 | 0.5000 | -195.8341 | -165.2328 | -30.6013 | 0.0 | -2237.8274 | -1890.5028 |
| 0.686 | 2.97 | 450 | 0.5000 | -195.8235 | -165.2227 | -30.6008 | 0.0 | -2237.7224 | -1890.4027 |
| 0.6119 | 3.3 | 500 | 0.5000 | -195.8122 | -165.2139 | -30.5983 | 0.0 | -2237.6084 | -1890.3141 |
| 0.5902 | 3.63 | 550 | 0.5000 | -195.8078 | -165.2129 | -30.5949 | 0.0 | -2237.5649 | -1890.3043 |
| 0.7106 | 3.96 | 600 | 0.5000 | -196.2488 | -165.5701 | -30.6787 | 0.0 | -2241.9751 | -1893.8765 |
| 0.8232 | 4.29 | 650 | 0.5000 | -196.2429 | -165.5582 | -30.6847 | 0.0 | -2241.9155 | -1893.7571 |
| 0.5881 | 4.62 | 700 | 0.5000 | -197.1647 | -166.3029 | -30.8618 | 0.0 | -2251.1340 | -1901.2047 |
| 0.6156 | 4.95 | 750 | 0.5000 | -197.1416 | -166.2842 | -30.8573 | 0.0 | -2250.9023 | -1901.0179 |
| 0.6291 | 5.28 | 800 | 0.5000 | -197.1509 | -166.2928 | -30.8580 | 0.0 | -2250.9958 | -1901.1036 |
| 0.6285 | 5.61 | 850 | 0.5000 | -197.1602 | -166.2982 | -30.8620 | 0.0 | -2251.0884 | -1901.1571 |
| 0.6918 | 5.94 | 900 | 0.5000 | -197.1623 | -166.3002 | -30.8621 | 0.0 | -2251.1104 | -1901.1774 |
| 0.7869 | 6.27 | 950 | 0.5000 | -197.1630 | -166.3040 | -30.8591 | 0.0 | -2251.1169 | -1901.2148 |
| 0.5483 | 6.6 | 1000 | 0.5000 | -197.1648 | -166.2998 | -30.8650 | 0.0 | -2251.1345 | -1901.1730 |
| 0.7744 | 6.93 | 1050 | 0.5000 | -197.5333 | -166.5969 | -30.9364 | 0.0 | -2254.8201 | -1904.1442 |
| 0.9077 | 7.26 | 1100 | 0.5000 | -197.5402 | -166.6008 | -30.9394 | 0.0 | -2254.8884 | -1904.1827 |
| 0.664 | 7.59 | 1150 | 0.5000 | -197.2621 | -166.3788 | -30.8832 | 0.0 | -2252.1074 | -1901.9637 |
| 0.6126 | 7.92 | 1200 | 0.5000 | -197.2483 | -166.3705 | -30.8778 | 0.0 | -2251.9697 | -1901.8805 |
| 0.8377 | 8.25 | 1250 | 0.5000 | -197.1308 | -166.2760 | -30.8547 | 0.0 | -2250.7944 | -1900.9357 |
| 0.6109 | 8.58 | 1300 | 0.5000 | -197.1868 | -166.3199 | -30.8669 | 0.0 | -2251.3545 | -1901.3741 |
| 0.7432 | 8.91 | 1350 | 0.5000 | -197.2601 | -166.3793 | -30.8808 | 0.0 | -2252.0879 | -1901.9680 |
| 0.8664 | 9.24 | 1400 | 0.5000 | -197.1278 | -166.2694 | -30.8584 | 0.0 | -2250.7642 | -1900.8693 |
| 0.7237 | 9.57 | 1450 | 0.5000 | -197.1250 | -166.2689 | -30.8561 | 0.0 | -2250.7366 | -1900.8641 |

### Framework versions

- PEFT 0.10.0
- Transformers 4.38.2
- Pytorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2