---
base_model: gpt2
datasets:
- generator
library_name: peft
license: mit
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: SFT_FineTuned_GPT2
  results: []
---

# SFT_FineTuned_GPT2

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the MatanP/stories-with_custom_prompts dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9966

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 30

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.1202        | 0.9987  | 189  | 3.1335          |
| 3.2747        | 1.9974  | 378  | 3.1001          |
| 3.182         | 2.9960  | 567  | 3.0758          |
| 3.1566        | 4.0     | 757  | 3.0586          |
| 2.9957        | 4.9987  | 946  | 3.0462          |
| 2.9198        | 5.9974  | 1135 | 3.0371          |
| 3.0283        | 6.9960  | 1324 | 3.0299          |
| 3.0743        | 8.0     | 1514 | 3.0237          |
| 3.0079        | 8.9987  | 1703 | 3.0202          |
| 3.2004        | 9.9974  | 1892 | 3.0177          |
| 3.1039        | 10.9960 | 2081 | 3.0122          |
| 2.93          | 12.0    | 2271 | 3.0106          |
| 2.9261        | 12.9987 | 2460 | 3.0073          |
| 2.9913        | 13.9974 | 2649 | 3.0065          |
| 3.0955        | 14.9960 | 2838 | 3.0044          |
| 2.9342        | 16.0    | 3028 | 3.0018          |
| 2.9462        | 16.9987 | 3217 | 3.0013          |
| 3.1342        | 17.9974 | 3406 | 3.0005          |
| 2.6799        | 18.9960 | 3595 | 3.0007          |
| 2.9402        | 20.0    | 3785 | 2.9995          |
| 3.0764        | 20.9987 | 3974 | 2.9992          |
| 2.9976        | 21.9974 | 4163 | 2.9977          |
| 3.0089        | 22.9960 | 4352 | 2.9988          |
| 2.905         | 24.0    | 4542 | 2.9977          |
| 3.0469        | 24.9987 | 4731 | 2.9977          |
| 3.2238        | 25.9974 | 4920 | 2.9975          |
| 3.2074        | 26.9960 | 5109 | 2.9960          |
| 2.7844        | 28.0    | 5299 | 2.9965          |
| 2.9329        | 28.9987 | 5488 | 2.9975          |
| 2.8714        | 29.9604 | 5670 | 2.9966          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1
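
## How to use

A minimal inference sketch with `peft` and `transformers`, assuming the adapter is published on the Hub. The repo id below is a placeholder, and the prompt style expected by the MatanP/stories-with_custom_prompts data is not documented in this card, so treat both as assumptions.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Hypothetical repo id; substitute the actual Hub path of this adapter.
adapter_id = "your-username/SFT_FineTuned_GPT2"

# Loads the base gpt2 weights and applies the PEFT adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
model.eval()

# The tokenizer is unchanged from the base model.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "Write a short story about a lighthouse keeper."  # assumed prompt style
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=120, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```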
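
## Reproducing training

A reproduction sketch with `trl`'s `SFTTrainer`, using only the hyperparameters listed in the training procedure above. The LoRA settings (`r`, `lora_alpha`) and the dataset split and text column are assumptions, not values recorded in this card.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumed split name; the card only names the dataset.
dataset = load_dataset("MatanP/stories-with_custom_prompts", split="train")

# Assumed adapter settings; the card does not record the LoRA config.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

# These values mirror the "Training hyperparameters" section; the Adam
# betas and epsilon listed there are the defaults, so they are not set here.
args = SFTConfig(
    output_dir="SFT_FineTuned_GPT2",
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # total train batch size 4
    lr_scheduler_type="cosine",
    num_train_epochs=30,
    seed=42,
    dataset_text_field="text",  # assumed column name
)

trainer = SFTTrainer(
    model="gpt2",
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```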