---
license: mit
library_name: peft
tags:
- trl
- kto
- generated_from_trainer
base_model: HuggingFaceH4/zephyr-7b-beta
model-index:
- name: WeniGPT-Agents-Zephyr-1.0.25-KTO
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# WeniGPT-Agents-Zephyr-1.0.25-KTO

This model is a fine-tuned version of [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5
- Rewards/chosen: -195.8677
- Rewards/rejected: -165.2624
- Rewards/margins: -30.6053
- Kl: 0.0
- Logps/chosen: -2238.1643
- Logps/rejected: -1890.7997
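
As a quick sanity check on the metrics above, the reported reward margin is simply the chosen reward minus the rejected reward; a negative margin means rejected completions currently score higher than chosen ones:

```python
# Rewards/margins = Rewards/chosen - Rewards/rejected, using the
# evaluation numbers reported above.
chosen = -195.8677
rejected = -165.2624
margin = round(chosen - rejected, 4)
print(margin)  # -30.6053
```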

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 1470
- mixed_precision_training: Native AMP
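
The total train batch size listed above follows from the per-device batch size and gradient accumulation (assuming a single device, which the card does not state):

```python
# total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
train_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption; the card does not report the device count
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 16
```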

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/margins | Kl | Logps/chosen | Logps/rejected |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:---------------:|:---:|:------------:|:--------------:|
| 0.6966 | 0.33 | 50 | 0.5063 | -13.4129 | -12.4081 | -1.0049 | 0.0 | -413.6161 | -362.2560 |
| 0.754 | 0.66 | 100 | 0.5000 | -174.7515 | -145.9646 | -28.7869 | 0.0 | -2027.0018 | -1697.8218 |
| 0.6274 | 0.99 | 150 | 0.5000 | -195.8329 | -165.1599 | -30.6730 | 0.0 | -2237.8149 | -1889.7742 |
| 0.642 | 1.32 | 200 | 0.5000 | -195.1430 | -164.6777 | -30.4653 | 0.0 | -2230.9163 | -1884.9520 |
| 0.6241 | 1.65 | 250 | 0.5000 | -195.1471 | -164.6848 | -30.4623 | 0.0 | -2230.9573 | -1885.0226 |
| 0.7477 | 1.98 | 300 | 0.5000 | -195.8677 | -165.2624 | -30.6053 | 0.0 | -2238.1643 | -1890.7997 |
| 0.8685 | 2.31 | 350 | 0.5000 | -195.8568 | -165.2519 | -30.6049 | 0.0 | -2238.0549 | -1890.6946 |
| 0.693 | 2.64 | 400 | 0.5000 | -195.8341 | -165.2328 | -30.6013 | 0.0 | -2237.8274 | -1890.5028 |
| 0.686 | 2.97 | 450 | 0.5000 | -195.8235 | -165.2227 | -30.6008 | 0.0 | -2237.7224 | -1890.4027 |
| 0.6119 | 3.3 | 500 | 0.5000 | -195.8122 | -165.2139 | -30.5983 | 0.0 | -2237.6084 | -1890.3141 |
| 0.5902 | 3.63 | 550 | 0.5000 | -195.8078 | -165.2129 | -30.5949 | 0.0 | -2237.5649 | -1890.3043 |
| 0.7106 | 3.96 | 600 | 0.5000 | -196.2488 | -165.5701 | -30.6787 | 0.0 | -2241.9751 | -1893.8765 |
| 0.8232 | 4.29 | 650 | 0.5000 | -196.2429 | -165.5582 | -30.6847 | 0.0 | -2241.9155 | -1893.7571 |
| 0.5881 | 4.62 | 700 | 0.5000 | -197.1647 | -166.3029 | -30.8618 | 0.0 | -2251.1340 | -1901.2047 |
| 0.6156 | 4.95 | 750 | 0.5000 | -197.1416 | -166.2842 | -30.8573 | 0.0 | -2250.9023 | -1901.0179 |
| 0.6291 | 5.28 | 800 | 0.5000 | -197.1509 | -166.2928 | -30.8580 | 0.0 | -2250.9958 | -1901.1036 |
| 0.6285 | 5.61 | 850 | 0.5000 | -197.1602 | -166.2982 | -30.8620 | 0.0 | -2251.0884 | -1901.1571 |
| 0.6918 | 5.94 | 900 | 0.5000 | -197.1623 | -166.3002 | -30.8621 | 0.0 | -2251.1104 | -1901.1774 |
| 0.7869 | 6.27 | 950 | 0.5000 | -197.1630 | -166.3040 | -30.8591 | 0.0 | -2251.1169 | -1901.2148 |
| 0.5483 | 6.6 | 1000 | 0.5000 | -197.1648 | -166.2998 | -30.8650 | 0.0 | -2251.1345 | -1901.1730 |
| 0.7744 | 6.93 | 1050 | 0.5000 | -197.5333 | -166.5969 | -30.9364 | 0.0 | -2254.8201 | -1904.1442 |
| 0.9077 | 7.26 | 1100 | 0.5000 | -197.5402 | -166.6008 | -30.9394 | 0.0 | -2254.8884 | -1904.1827 |
| 0.664 | 7.59 | 1150 | 0.5000 | -197.2621 | -166.3788 | -30.8832 | 0.0 | -2252.1074 | -1901.9637 |
| 0.6126 | 7.92 | 1200 | 0.5000 | -197.2483 | -166.3705 | -30.8778 | 0.0 | -2251.9697 | -1901.8805 |
| 0.8377 | 8.25 | 1250 | 0.5000 | -197.1308 | -166.2760 | -30.8547 | 0.0 | -2250.7944 | -1900.9357 |
| 0.6109 | 8.58 | 1300 | 0.5000 | -197.1868 | -166.3199 | -30.8669 | 0.0 | -2251.3545 | -1901.3741 |
| 0.7432 | 8.91 | 1350 | 0.5000 | -197.2601 | -166.3793 | -30.8808 | 0.0 | -2252.0879 | -1901.9680 |
| 0.8664 | 9.24 | 1400 | 0.5000 | -197.1278 | -166.2694 | -30.8584 | 0.0 | -2250.7642 | -1900.8693 |
| 0.7237 | 9.57 | 1450 | 0.5000 | -197.1250 | -166.2689 | -30.8561 | 0.0 | -2250.7366 | -1900.8641 |

### Framework versions

- PEFT 0.10.0
- Transformers 4.38.2
- Pytorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2