---
base_model: gpt2
datasets:
- generator
library_name: peft
license: mit
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: SFT_FineTuned_GPT2
  results: []
---
# SFT_FineTuned_GPT2

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the MatanP/stories-with_custom_prompts dataset. It achieves the following results on the evaluation set:
- Loss: 2.9966
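
The checkpoint is a PEFT adapter trained on top of gpt2. Below is a minimal inference sketch; the adapter repo id is a placeholder and should be replaced with the actual Hub path or a local adapter directory.

```python
# Minimal inference sketch. "your-username/SFT_FineTuned_GPT2" is a placeholder
# repo id; point it at the actual adapter location.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Attach the fine-tuned PEFT adapter weights to the gpt2 base model.
model = PeftModel.from_pretrained(base, "your-username/SFT_FineTuned_GPT2")
model.eval()

prompt = "Write a short story about a lighthouse keeper."  # example prompt only
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # gpt2 has no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```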
## Model description

More information needed
## Intended uses & limitations

More information needed
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an illustrative configuration sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 30
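
For reference, these settings map onto standard `transformers.TrainingArguments` fields roughly as sketched below. This is an illustrative reconstruction, not the original training script; the output directory is a placeholder, and a TRL `SFTTrainer` would typically consume these arguments together with the base model and dataset.

```python
# Illustrative sketch of the hyperparameters listed above; the original
# training script is not part of this card, and output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="SFT_FineTuned_GPT2",
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=4,  # total train batch size of 4
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=30,
)
```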
### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.1202        | 0.9987  | 189  | 3.1335          |
| 3.2747        | 1.9974  | 378  | 3.1001          |
| 3.182         | 2.9960  | 567  | 3.0758          |
| 3.1566        | 4.0     | 757  | 3.0586          |
| 2.9957        | 4.9987  | 946  | 3.0462          |
| 2.9198        | 5.9974  | 1135 | 3.0371          |
| 3.0283        | 6.9960  | 1324 | 3.0299          |
| 3.0743        | 8.0     | 1514 | 3.0237          |
| 3.0079        | 8.9987  | 1703 | 3.0202          |
| 3.2004        | 9.9974  | 1892 | 3.0177          |
| 3.1039        | 10.9960 | 2081 | 3.0122          |
| 2.93          | 12.0    | 2271 | 3.0106          |
| 2.9261        | 12.9987 | 2460 | 3.0073          |
| 2.9913        | 13.9974 | 2649 | 3.0065          |
| 3.0955        | 14.9960 | 2838 | 3.0044          |
| 2.9342        | 16.0    | 3028 | 3.0018          |
| 2.9462        | 16.9987 | 3217 | 3.0013          |
| 3.1342        | 17.9974 | 3406 | 3.0005          |
| 2.6799        | 18.9960 | 3595 | 3.0007          |
| 2.9402        | 20.0    | 3785 | 2.9995          |
| 3.0764        | 20.9987 | 3974 | 2.9992          |
| 2.9976        | 21.9974 | 4163 | 2.9977          |
| 3.0089        | 22.9960 | 4352 | 2.9988          |
| 2.905         | 24.0    | 4542 | 2.9977          |
| 3.0469        | 24.9987 | 4731 | 2.9977          |
| 3.2238        | 25.9974 | 4920 | 2.9975          |
| 3.2074        | 26.9960 | 5109 | 2.9960          |
| 2.7844        | 28.0    | 5299 | 2.9965          |
| 2.9329        | 28.9987 | 5488 | 2.9975          |
| 2.8714        | 29.9604 | 5670 | 2.9966          |
### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1