---
base_model: gpt2
datasets:
- generator
library_name: peft
license: mit
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: SFT_FineTuned_GPT2
  results: []
---

# SFT_FineTuned_GPT2

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the MatanP/stories-with_custom_prompts dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9966

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 30

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.1202        | 0.9987  | 189  | 3.1335          |
| 3.2747        | 1.9974  | 378  | 3.1001          |
| 3.182         | 2.9960  | 567  | 3.0758          |
| 3.1566        | 4.0     | 757  | 3.0586          |
| 2.9957        | 4.9987  | 946  | 3.0462          |
| 2.9198        | 5.9974  | 1135 | 3.0371          |
| 3.0283        | 6.9960  | 1324 | 3.0299          |
| 3.0743        | 8.0     | 1514 | 3.0237          |
| 3.0079        | 8.9987  | 1703 | 3.0202          |
| 3.2004        | 9.9974  | 1892 | 3.0177          |
| 3.1039        | 10.9960 | 2081 | 3.0122          |
| 2.93          | 12.0    | 2271 | 3.0106          |
| 2.9261        | 12.9987 | 2460 | 3.0073          |
| 2.9913        | 13.9974 | 2649 | 3.0065          |
| 3.0955        | 14.9960 | 2838 | 3.0044          |
| 2.9342        | 16.0    | 3028 | 3.0018          |
| 2.9462        | 16.9987 | 3217 | 3.0013          |
| 3.1342        | 17.9974 | 3406 | 3.0005          |
| 2.6799        | 18.9960 | 3595 | 3.0007          |
| 2.9402        | 20.0    | 3785 | 2.9995          |
| 3.0764        | 20.9987 | 3974 | 2.9992          |
| 2.9976        | 21.9974 | 4163 | 2.9977          |
| 3.0089        | 22.9960 | 4352 | 2.9988          |
| 2.905         | 24.0    | 4542 | 2.9977          |
| 3.0469        | 24.9987 | 4731 | 2.9977          |
| 3.2238        | 25.9974 | 4920 | 2.9975          |
| 3.2074        | 26.9960 | 5109 | 2.9960          |
| 2.7844        | 28.0    | 5299 | 2.9965          |
| 2.9329        | 28.9987 | 5488 | 2.9975          |
| 2.8714        | 29.9604 | 5670 | 2.9966          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1
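
## How to use

A minimal inference sketch with `peft` and `transformers`, assuming the adapter is published on the Hub. The repo id below is a placeholder, and the prompt style expected by the MatanP/stories-with_custom_prompts data is not documented in this card, so treat both as assumptions.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Hypothetical repo id; substitute the actual Hub path of this adapter.
adapter_id = "your-username/SFT_FineTuned_GPT2"

# Loads the base gpt2 weights and applies the PEFT adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
model.eval()

# The tokenizer is unchanged from the base model.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "Write a short story about a lighthouse keeper."  # assumed prompt style
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=120, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```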
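
## Reproducing training

A reproduction sketch with `trl`'s `SFTTrainer`, using only the hyperparameters listed in the training procedure above. The LoRA settings (`r`, `lora_alpha`) and the dataset split and text column are assumptions, not values recorded in this card.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumed split name; the card only names the dataset.
dataset = load_dataset("MatanP/stories-with_custom_prompts", split="train")

# Assumed adapter settings; the card does not record the LoRA config.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

# These values mirror the "Training hyperparameters" section; the Adam
# betas and epsilon listed there are the defaults, so they are not set here.
args = SFTConfig(
    output_dir="SFT_FineTuned_GPT2",
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # total train batch size 4
    lr_scheduler_type="cosine",
    num_train_epochs=30,
    seed=42,
    dataset_text_field="text",  # assumed column name
)

trainer = SFTTrainer(
    model="gpt2",
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```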