Model save
- README.md +95 -0
- adapter_model.safetensors +1 -1
README.md
ADDED
@@ -0,0 +1,95 @@
---
library_name: peft
tags:
- trl
- dpo
- generated_from_trainer
base_model: Weni/WeniGPT-Agents-Mistral-1.0.6-SFT-merged
model-index:
- name: WeniGPT-Agents-Mistral-1.0.6-SFT-1.0.4-DPO
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# WeniGPT-Agents-Mistral-1.0.6-SFT-1.0.4-DPO

This model is a fine-tuned version of [Weni/WeniGPT-Agents-Mistral-1.0.6-SFT-merged](https://huggingface.co/Weni/WeniGPT-Agents-Mistral-1.0.6-SFT-merged) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4296
- Rewards/chosen: 2.1700
- Rewards/rejected: -0.6894
- Rewards/accuracies: 0.4286
- Rewards/margins: 2.8595
- Logps/rejected: -98.0954
- Logps/chosen: -47.9682
- Logits/rejected: -1.8433
- Logits/chosen: -1.8191
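The reward metrics above are related by simple arithmetic, which makes for a quick sanity check. The sketch below assumes that `Rewards/margins` is the mean chosen reward minus the mean rejected reward; the variable names are illustrative and not taken from the training code.

```python
# Headline evaluation metrics copied from the list above.
rewards_chosen = 2.1700
rewards_rejected = -0.6894
reported_margin = 2.8595

# Assumption: the margin is the chosen reward minus the rejected reward.
margin = rewards_chosen - rewards_rejected

# The card rounds every metric to four decimals, so allow a small tolerance.
matches = abs(margin - reported_margin) < 2e-4

print(f"recomputed margin: {margin:.4f}")
print(f"consistent with reported value: {matches}")
```

The recomputed value differs from the reported one only in the fourth decimal place, which is what rounding each metric independently would produce.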

## Model description

More information needed

## Intended uses & limitations

More information needed
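Although the card gives no usage details, its metadata names `peft` as the library and `Weni/WeniGPT-Agents-Mistral-1.0.6-SFT-merged` as the base model, so one plausible way to use the adapter is to load the base model and attach the adapter to it. The following is a minimal sketch under that assumption. `load_adapter` is a hypothetical helper, and `adapter_id` is left as a parameter because the card does not state the repository id of the adapter.

```python
def load_adapter(
    adapter_id: str,
    base_id: str = "Weni/WeniGPT-Agents-Mistral-1.0.6-SFT-merged",
):
    """Load the base model and attach a PEFT adapter to it (hypothetical helper).

    adapter_id is a Hub repository id or a local directory containing the
    adapter files; the model card does not name it, so the caller supplies it.
    """
    # Imported lazily so that defining this function does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the tokenizer is taken from the base model repository.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Wrap the base model with the adapter weights (adapter_model.safetensors).
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return model, tokenizer
```

Calling `load_adapter("path/to/adapter")` downloads the base model if it is not cached, so it needs network access and enough memory for a Mistral-sized model.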

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 732
- mixed_precision_training: Native AMP
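Several of the hyperparameters above are derived from the others. The sketch below recomputes the effective batch size and the warmup length, then evaluates a linear schedule at a few steps. It assumes a single training device, that warmup steps are the warmup ratio times the total steps rounded up, and that the schedule is a linear warmup followed by a linear decay to zero; `linear_schedule_lr` is an illustrative helper, not the trainer's own function.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 5e-06
train_batch_size = 2
gradient_accumulation_steps = 2
warmup_ratio = 0.03
training_steps = 732

# Assumption: a single device, which is what total_train_batch_size = 4 implies.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: warmup steps are the warmup ratio times the total steps, rounded up.
warmup_steps = math.ceil(warmup_ratio * training_steps)


def linear_schedule_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, training_steps - step)
    return learning_rate * remaining / max(1, training_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, training_steps // 2, training_steps):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 5e-06 after 22 steps and reaches zero at step 732.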

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.665 | 0.49 | 30 | 0.6322 | 0.1457 | 0.0118 | 0.4286 | 0.1339 | -95.7578 | -54.7160 | -1.7981 | -1.7801 |
| 0.5445 | 0.98 | 60 | 0.5272 | 0.4942 | 0.0186 | 0.4286 | 0.4756 | -95.7351 | -53.5543 | -1.8031 | -1.7844 |
| 0.4616 | 1.46 | 90 | 0.4622 | 0.9572 | -0.0104 | 0.4286 | 0.9676 | -95.8320 | -52.0111 | -1.8099 | -1.7902 |
| 0.5327 | 1.95 | 120 | 0.4424 | 1.3376 | -0.0803 | 0.4286 | 1.4179 | -96.0649 | -50.7429 | -1.8171 | -1.7963 |
| 0.5459 | 2.44 | 150 | 0.4335 | 1.6435 | -0.1846 | 0.4286 | 1.8281 | -96.4125 | -49.7233 | -1.8243 | -1.8025 |
| 0.4055 | 2.93 | 180 | 0.4326 | 1.8624 | -0.3390 | 0.4286 | 2.2014 | -96.9273 | -48.9936 | -1.8301 | -1.8074 |
| 0.4694 | 3.41 | 210 | 0.4311 | 1.9971 | -0.4435 | 0.4286 | 2.4406 | -97.2756 | -48.5445 | -1.8368 | -1.8136 |
| 0.5431 | 3.9 | 240 | 0.4247 | 2.0881 | -0.5490 | 0.4286 | 2.6371 | -97.6273 | -48.2414 | -1.8401 | -1.8164 |
| 0.4547 | 4.39 | 270 | 0.4296 | 2.1700 | -0.6894 | 0.4286 | 2.8595 | -98.0954 | -47.9682 | -1.8433 | -1.8191 |
| 0.3606 | 4.88 | 300 | 0.4290 | 2.2236 | -0.7919 | 0.4286 | 3.0155 | -98.4369 | -47.7897 | -1.8460 | -1.8213 |
| 0.4021 | 5.37 | 330 | 0.4302 | 2.2553 | -0.9243 | 0.4286 | 3.1797 | -98.8783 | -47.6839 | -1.8471 | -1.8219 |
| 0.419 | 5.85 | 360 | 0.4336 | 2.2579 | -1.0063 | 0.4286 | 3.2642 | -99.1514 | -47.6751 | -1.8470 | -1.8214 |
| 0.3984 | 6.34 | 390 | 0.4291 | 2.2716 | -1.0712 | 0.4286 | 3.3428 | -99.3678 | -47.6296 | -1.8499 | -1.8243 |
| 0.435 | 6.83 | 420 | 0.4285 | 2.2724 | -1.1240 | 0.4286 | 3.3965 | -99.5441 | -47.6268 | -1.8495 | -1.8236 |
| 0.5148 | 7.32 | 450 | 0.4309 | 2.2693 | -1.2113 | 0.4286 | 3.4806 | -99.8349 | -47.6373 | -1.8482 | -1.8220 |
| 0.412 | 7.8 | 480 | 0.4308 | 2.2647 | -1.2626 | 0.4286 | 3.5273 | -100.0060 | -47.6527 | -1.8481 | -1.8217 |
| 0.4911 | 8.29 | 510 | 0.4331 | 2.2554 | -1.3097 | 0.4286 | 3.5651 | -100.1629 | -47.6835 | -1.8466 | -1.8200 |
| 0.4433 | 8.78 | 540 | 0.4317 | 2.2453 | -1.3403 | 0.4286 | 3.5855 | -100.2648 | -47.7174 | -1.8468 | -1.8201 |
| 0.3813 | 9.27 | 570 | 0.4338 | 2.2396 | -1.3854 | 0.4286 | 3.6250 | -100.4154 | -47.7365 | -1.8473 | -1.8205 |
| 0.5026 | 9.76 | 600 | 0.4333 | 2.2386 | -1.4022 | 0.4286 | 3.6408 | -100.4712 | -47.7397 | -1.8472 | -1.8203 |
| 0.3121 | 10.24 | 630 | 0.4324 | 2.2339 | -1.4158 | 0.4286 | 3.6497 | -100.5166 | -47.7553 | -1.8462 | -1.8193 |
| 0.4165 | 10.73 | 660 | 0.4319 | 2.2307 | -1.4318 | 0.4286 | 3.6625 | -100.5699 | -47.7659 | -1.8462 | -1.8193 |
| 0.5328 | 11.22 | 690 | 0.4329 | 2.2254 | -1.4478 | 0.4286 | 3.6732 | -100.6233 | -47.7837 | -1.8457 | -1.8186 |
| 0.4046 | 11.71 | 720 | 0.4335 | 2.2229 | -1.4565 | 0.4286 | 3.6793 | -100.6521 | -47.7921 | -1.8454 | -1.8183 |
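The table above can be scanned programmatically to see where validation loss bottoms out and which row the headline evaluation loss of 0.4296 comes from. This is a small sketch over the step and validation loss columns only; the list and variable names are illustrative.

```python
# (step, validation loss) pairs copied from the training results table.
validation_loss = [
    (30, 0.6322), (60, 0.5272), (90, 0.4622), (120, 0.4424),
    (150, 0.4335), (180, 0.4326), (210, 0.4311), (240, 0.4247),
    (270, 0.4296), (300, 0.4290), (330, 0.4302), (360, 0.4336),
    (390, 0.4291), (420, 0.4285), (450, 0.4309), (480, 0.4308),
    (510, 0.4331), (540, 0.4317), (570, 0.4338), (600, 0.4333),
    (630, 0.4324), (660, 0.4319), (690, 0.4329), (720, 0.4335),
]

# Row with the lowest validation loss.
best_step, best_loss = min(validation_loss, key=lambda row: row[1])

# Rows whose validation loss equals the headline evaluation loss.
headline_loss = 0.4296
headline_steps = [step for step, loss in validation_loss if loss == headline_loss]

# Last logged evaluation.
final_step, final_loss = validation_loss[-1]

print("lowest validation loss:", best_loss, "at step", best_step)
print("headline loss appears at steps:", headline_steps)
print("final logged loss:", final_loss, "at step", final_step)
```

Validation loss is lowest at step 240 and stays within a narrow band afterwards, while the reward margin keeps growing. The headline loss corresponds to the step 270 row, not to the lowest-loss row or the final logged step.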

### Framework versions

- PEFT 0.10.0
- Transformers 4.38.2
- Pytorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:4fc4205bc93d6795cf17821f662a891c4d28c9b793aea2722bad767836694c35
size 13648432
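The pointer above records the SHA-256 digest and byte size of the real adapter file, so a downloaded copy can be checked against it. The following is a minimal sketch using only the standard library. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the whole file is hashed in memory for brevity.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest recorded in the pointer."""
    algorithm, _, expected_digest = pointer["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return (
        len(data) == int(pointer["size"])
        and hashlib.sha256(data).hexdigest() == expected_digest
    )


# The new pointer from this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:4fc4205bc93d6795cf17821f662a891c4d28c9b793aea2722bad767836694c35\n"
    "size 13648432\n"
)
print(pointer["oid"], pointer["size"])
```

To check a local download, read the file in binary mode and pass its bytes to `matches_pointer`. For much larger files, feeding chunks to `hashlib.sha256().update()` avoids holding the whole file in memory.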