---
license: gemma
base_model: google/gemma-2b
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: gemma-2b-coedit
  results: []
---

# gemma-2b-coedit

This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7456
- Rouge1: 0.5006
- Rouge2: 0.3991
- Rougel: 0.4788
- Rougelsum: 0.4786
- Sacrebleu: 20.7764
- Memory Used: 79283.5
- Cuda Allocated: 9625.1006
- Cuda Reserved: 73102.0
- Ram Usage: 10024.6953
- Em: 0.0
- Gen Len: 101.5333

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 35
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 140
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1
- num_epochs: 2
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Sacrebleu | Memory Used | Cuda Allocated | Cuda Reserved | Ram Usage | Em  | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:---------:|:-----------:|:--------------:|:-------------:|:----------:|:---:|:--------:|
| 0.5426        | 0.22  | 100  | 0.7076          | 0.3807 | 0.297  | 0.3623 | 0.3621    | 18.8513   | 69159.5     | 9625.1431      | 62980.0       | 5073.7852  | 0.0 | 101.5333 |
| 0.5051        | 0.44  | 200  | 0.6849          | 0.4094 | 0.3207 | 0.3907 | 0.3905    | 21.1175   | 67317.5     | 9625.1196      | 61138.0       | 5067.1328  | 0.0 | 101.5333 |
| 0.4909        | 0.66  | 300  | 0.6735          | 0.4943 | 0.3926 | 0.473  | 0.4729    | 11.0979   | 67319.5     | 9625.1182      | 61138.0       | 9820.3711  | 0.0 | 101.5333 |
| 0.4804        | 0.88  | 400  | 0.6672          | 0.4995 | 0.4004 | 0.4796 | 0.4795    | 24.1464   | 67319.5     | 9625.1079      | 61138.0       | 9803.6172  | 0.0 | 101.5333 |
| 0.2842        | 1.1   | 500  | 0.7475          | 0.5011 | 0.3995 | 0.4792 | 0.4792    | 27.3521   | 79283.5     | 9625.0977      | 73102.0       | 9845.9766  | 0.0 | 101.5333 |
| 0.2471        | 1.32  | 600  | 0.7447          | 0.4908 | 0.3906 | 0.4694 | 0.4693    | 24.0058   | 79283.5     | 9625.1123      | 73102.0       | 9916.7539  | 0.0 | 101.5333 |
| 0.2422        | 1.54  | 700  | 0.7361          | 0.4967 | 0.3954 | 0.4749 | 0.4749    | 21.4519   | 79283.5     | 9625.1196      | 73102.0       | 9910.2695  | 0.0 | 101.5333 |
| 0.2354        | 1.76  | 800  | 0.7443          | 0.4882 | 0.3882 | 0.467  | 0.4669    | 19.4531   | 79283.5     | 9625.124       | 73102.0       | 10050.582  | 0.0 | 101.5333 |
| 0.2334        | 1.98  | 900  | 0.7456          | 0.5006 | 0.3991 | 0.4788 | 0.4786    | 20.7764   | 79283.5     | 9625.1006      | 73102.0       | 10024.6953 | 0.0 | 101.5333 |

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.2
- Datasets 2.18.0
- Tokenizers 0.15.2
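
As a quick sketch of how this checkpoint could be loaded for inference with `transformers`, see the example below. The repo id (`gemma-2b-coedit`) and the instruction-style prompt are assumptions, since neither the final Hub location nor the prompt format used during training is documented in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint location; replace with the actual Hub repo id or the
# local output directory produced by this fine-tune.
model_id = "gemma-2b-coedit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; use float16/float32 as appropriate
    device_map="auto",           # requires the `accelerate` package
)

# CoEdIT-style editing instruction; the training prompt format is not documented
# here, so treat this as an assumption.
prompt = "Fix the grammar: She no went to the market."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```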
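
The hyperparameters listed above map onto the standard Hugging Face `TrainingArguments`; the sketch below is a reconstruction under that assumption, not the actual training script. The output directory, the `fp16` choice for Native AMP, and the 100-step evaluation/logging cadence (inferred from the results table) are assumptions.

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters in "Training hyperparameters" for the
# standard Trainer (Transformers 4.39.x). The default AdamW optimizer already
# uses betas=(0.9, 0.999) and epsilon=1e-08 as listed above.
training_args = TrainingArguments(
    output_dir="gemma-2b-coedit",       # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=35,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,      # 35 x 4 = total train batch size of 140
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=1,
    num_train_epochs=2,
    fp16=True,                          # "Native AMP" mixed precision (could also be bf16)
    evaluation_strategy="steps",        # validation logged every 100 steps in the table above
    eval_steps=100,
    logging_steps=100,
)
```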