---
base_model: gpt2
datasets:
  - generator
library_name: peft
license: mit
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: SFT_FineTuned_GPT2
    results: []
---

# SFT_FineTuned_GPT2

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the MatanP/stories-with_custom_prompts dataset. It achieves the following results on the evaluation set:

- Loss: 2.9966
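
The snippet below is a minimal inference sketch and is not part of the original card: it assumes the adapter weights live in this repository (`MatanP/GPT2_fine_tuned`) and that `gpt2` is the base model, as stated in the metadata above; the prompt is hypothetical.

```python
# Hedged sketch: load the PEFT adapter on top of the gpt2 base model and generate.
# The adapter repo id ("MatanP/GPT2_fine_tuned") and the prompt are assumptions.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Attach the fine-tuned adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, "MatanP/GPT2_fine_tuned")
model.eval()

prompt = "Write a short story about a lighthouse keeper."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the adapter is a LoRA adapter, `model = model.merge_and_unload()` folds it into the base weights for adapter-free inference.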

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 30
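
As a rough illustration only, the sketch below shows how these hyperparameters could be expressed with TRL's `SFTTrainer`. It is not the original training script; the LoRA settings, the `text` column name, and the dataset split names are assumptions.

```python
# Hedged configuration sketch mapping the reported hyperparameters onto TRL.
# Not the original script; LoRA settings, column and split names are assumptions.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("MatanP/stories-with_custom_prompts")  # assumed train/validation splits

args = SFTConfig(
    output_dir="SFT_FineTuned_GPT2",
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective train batch size: 1 * 4 = 4
    lr_scheduler_type="cosine",
    num_train_epochs=30,
    seed=42,
    eval_strategy="epoch",           # the card reports one validation loss per epoch
    dataset_text_field="text",       # assumed column name
)

peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)  # assumed LoRA settings

trainer = SFTTrainer(
    model="gpt2",
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # assumed split name
    peft_config=peft_config,
)
trainer.train()
```

The reported optimizer (Adam with betas=(0.9,0.999), epsilon=1e-08) matches the `transformers` default, so it is not set explicitly above.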

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.1202        | 0.9987  | 189  | 3.1335          |
| 3.2747        | 1.9974  | 378  | 3.1001          |
| 3.182         | 2.9960  | 567  | 3.0758          |
| 3.1566        | 4.0     | 757  | 3.0586          |
| 2.9957        | 4.9987  | 946  | 3.0462          |
| 2.9198        | 5.9974  | 1135 | 3.0371          |
| 3.0283        | 6.9960  | 1324 | 3.0299          |
| 3.0743        | 8.0     | 1514 | 3.0237          |
| 3.0079        | 8.9987  | 1703 | 3.0202          |
| 3.2004        | 9.9974  | 1892 | 3.0177          |
| 3.1039        | 10.9960 | 2081 | 3.0122          |
| 2.93          | 12.0    | 2271 | 3.0106          |
| 2.9261        | 12.9987 | 2460 | 3.0073          |
| 2.9913        | 13.9974 | 2649 | 3.0065          |
| 3.0955        | 14.9960 | 2838 | 3.0044          |
| 2.9342        | 16.0    | 3028 | 3.0018          |
| 2.9462        | 16.9987 | 3217 | 3.0013          |
| 3.1342        | 17.9974 | 3406 | 3.0005          |
| 2.6799        | 18.9960 | 3595 | 3.0007          |
| 2.9402        | 20.0    | 3785 | 2.9995          |
| 3.0764        | 20.9987 | 3974 | 2.9992          |
| 2.9976        | 21.9974 | 4163 | 2.9977          |
| 3.0089        | 22.9960 | 4352 | 2.9988          |
| 2.905         | 24.0    | 4542 | 2.9977          |
| 3.0469        | 24.9987 | 4731 | 2.9977          |
| 3.2238        | 25.9974 | 4920 | 2.9975          |
| 3.2074        | 26.9960 | 5109 | 2.9960          |
| 2.7844        | 28.0    | 5299 | 2.9965          |
| 2.9329        | 28.9987 | 5488 | 2.9975          |
| 2.8714        | 29.9604 | 5670 | 2.9966          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1