---
license: gemma
base_model: google/gemma-2b
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: gemma-2b-coedit
    results: []
---

# gemma-2b-coedit

This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on an unknown dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the metrics):

- Loss: 0.7456
- Rouge1: 0.5006
- Rouge2: 0.3991
- Rougel: 0.4788
- Rougelsum: 0.4786
- Sacrebleu: 20.7764
- Memory Used: 79283.5
- Cuda Allocated: 9625.1006
- Cuda Reserved: 73102.0
- Ram Usage: 10024.6953
- Em: 0.0
- Gen Len: 101.5333
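
The card does not document how to call the model, so the snippet below is only a minimal inference sketch. It assumes the weights are published under a repo id such as `iliazlobin/gemma-2b-coedit` (derived from the card title, not verified) and that the model expects CoEdIT-style editing instructions; adjust both to match the actual repository and prompt format.

```python
# Minimal inference sketch (not from the card): repo id and prompt style are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iliazlobin/gemma-2b-coedit"  # hypothetical repo id based on the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# CoEdIT-style instruction prompt; the prompt format used in training is not documented here.
prompt = "Fix grammatical errors in this sentence: She no went to the market yesterday."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```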

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 35
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 140
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1
- num_epochs: 2
- mixed_precision_training: Native AMP
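
These values map directly onto Hugging Face `TrainingArguments`. The sketch below is a hedged reconstruction under two assumptions: training ran on a single device (so 35 × 4 accumulation steps gives the total batch size of 140), and `Seq2SeqTrainingArguments` with `predict_with_generate` was used, which the Gen Len / ROUGE / SacreBLEU metrics suggest but the card does not confirm.

```python
# Hedged reconstruction of the listed hyperparameters; the actual training script is not part of this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="gemma-2b-coedit",       # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=35,     # 35 * 4 accumulation steps = total train batch size 140
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    seed=42,
    optim="adamw_torch",                # Adam with betas=(0.9, 0.999), epsilon=1e-08 (Trainer default)
    lr_scheduler_type="linear",
    warmup_steps=1,
    num_train_epochs=2,
    fp16=True,                          # "Native AMP" mixed precision
    predict_with_generate=True,         # assumed, given the ROUGE / SacreBLEU / Gen Len metrics
    evaluation_strategy="steps",        # assumed from the per-100-step rows under "Training results"
    eval_steps=100,
)
```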

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Sacrebleu | Memory Used | Cuda Allocated | Cuda Reserved | Ram Usage | Em  | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:---------:|:-----------:|:--------------:|:-------------:|:---------:|:---:|:--------:|
| 0.5426        | 0.22  | 100  | 0.7076          | 0.3807 | 0.297  | 0.3623 | 0.3621    | 18.8513   | 69159.5     | 9625.1431      | 62980.0       | 5073.7852 | 0.0 | 101.5333 |
| 0.5051        | 0.44  | 200  | 0.6849          | 0.4094 | 0.3207 | 0.3907 | 0.3905    | 21.1175   | 67317.5     | 9625.1196      | 61138.0       | 5067.1328 | 0.0 | 101.5333 |
| 0.4909        | 0.66  | 300  | 0.6735          | 0.4943 | 0.3926 | 0.473  | 0.4729    | 11.0979   | 67319.5     | 9625.1182      | 61138.0       | 9820.3711 | 0.0 | 101.5333 |
| 0.4804        | 0.88  | 400  | 0.6672          | 0.4995 | 0.4004 | 0.4796 | 0.4795    | 24.1464   | 67319.5     | 9625.1079      | 61138.0       | 9803.6172 | 0.0 | 101.5333 |
| 0.2842        | 1.1   | 500  | 0.7475          | 0.5011 | 0.3995 | 0.4792 | 0.4792    | 27.3521   | 79283.5     | 9625.0977      | 73102.0       | 9845.9766 | 0.0 | 101.5333 |
| 0.2471        | 1.32  | 600  | 0.7447          | 0.4908 | 0.3906 | 0.4694 | 0.4693    | 24.0058   | 79283.5     | 9625.1123      | 73102.0       | 9916.7539 | 0.0 | 101.5333 |
| 0.2422        | 1.54  | 700  | 0.7361          | 0.4967 | 0.3954 | 0.4749 | 0.4749    | 21.4519   | 79283.5     | 9625.1196      | 73102.0       | 9910.2695 | 0.0 | 101.5333 |
| 0.2354        | 1.76  | 800  | 0.7443          | 0.4882 | 0.3882 | 0.467  | 0.4669    | 19.4531   | 79283.5     | 9625.124       | 73102.0       | 10050.582 | 0.0 | 101.5333 |
| 0.2334        | 1.98  | 900  | 0.7456          | 0.5006 | 0.3991 | 0.4788 | 0.4786    | 20.7764   | 79283.5     | 9625.1006      | 73102.0       | 10024.6953| 0.0 | 101.5333 |
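
The ROUGE, SacreBLEU, and exact-match columns above can be computed with the `evaluate` library. The card does not include the actual `compute_metrics` function, so the following is only an assumed sketch of such an evaluation step; metric key names and rounding may differ from what produced the table.

```python
# Assumed evaluation sketch; not the compute_metrics function actually used for this card.
import evaluate

rouge = evaluate.load("rouge")
sacrebleu = evaluate.load("sacrebleu")
exact_match = evaluate.load("exact_match")

def compute_text_metrics(predictions: list[str], references: list[str]) -> dict:
    """Score decoded model outputs against decoded reference edits."""
    scores = rouge.compute(predictions=predictions, references=references)  # rouge1, rouge2, rougeL, rougeLsum
    scores["sacrebleu"] = sacrebleu.compute(
        predictions=predictions, references=[[ref] for ref in references]
    )["score"]
    scores["em"] = exact_match.compute(
        predictions=predictions, references=references
    )["exact_match"]
    return scores
```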

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.2
- Datasets 2.18.0
- Tokenizers 0.15.2