---
library_name: transformers
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: long_t5_4
results: []
---
# long_t5_4
This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
It achieves the following results on the evaluation set (final checkpoint, epoch 50):
- Loss: 3.0847
- Rouge1: 0.5303
- Rouge2: 0.3398
- Rougel: 0.477
- Rougelsum: 0.477
- Gen Len: 31.974 (mean generated sequence length, in tokens)
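The ROUGE scores above are on a 0-1 scale (multiply by 100 for the percentage form common in papers). The exact evaluation code is not included in this card; below is a minimal sketch of how such scores are typically computed, assuming the Hugging Face `evaluate` library (which requires the `rouge_score` package):

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy strings for illustration; in practice, predictions are the decoded
# model outputs and references are the gold targets from the eval set.
predictions = ["the model summarizes long documents"]
references = ["this model produces summaries of long documents"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```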
## Model description
long_t5_4 is a fine-tune of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base), an encoder-decoder Transformer that extends T5 with transient-global (TGlobal) attention, allowing it to process far longer inputs (up to 16,384 tokens) than vanilla T5. Details of the downstream task and data are not documented.
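A minimal inference sketch, assuming the model is published under the repo id `zera09/long_t5_4` (adjust the id, the input length cap, and the generation settings to your setup):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id assumed from the card name; replace with the actual hub path if it differs.
model_id = "zera09/long_t5_4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

long_document = "Replace this with the long input text to condense."
# max_length=4096 is an arbitrary cap for the sketch; LongT5 accepts up to 16,384 tokens.
inputs = tokenizer(long_document, return_tensors="pt", truncation=True, max_length=4096)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```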
## Intended uses & limitations
The training data is not documented, so intended uses cannot be stated definitively. The ROUGE-based evaluation and mean generated length of roughly 32 tokens suggest a short-output abstractive summarization task over long inputs; any other use should be treated as untested.
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a sketch of how they map onto `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
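A reconstruction of this configuration as `Seq2SeqTrainingArguments`; the `output_dir` and the once-per-epoch evaluation cadence are assumptions (the latter is consistent with the results table, which logs metrics every 1000 steps, i.e. once per epoch):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5_4",          # hypothetical; the actual path is not documented
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 are the transformers defaults:
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    eval_strategy="epoch",           # matches the one-eval-per-epoch results table
    predict_with_generate=True,      # needed so ROUGE can score generated text
)
```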
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.0147 | 1.0 | 1000 | 1.5675 | 0.4907 | 0.3059 | 0.4453 | 0.4454 | 25.7975 |
| 1.7618 | 2.0 | 2000 | 1.5138 | 0.5037 | 0.3169 | 0.4578 | 0.458 | 26.608 |
| 1.5904 | 3.0 | 3000 | 1.5015 | 0.5091 | 0.3239 | 0.4645 | 0.4648 | 25.5405 |
| 1.4555 | 4.0 | 4000 | 1.5083 | 0.5183 | 0.3335 | 0.4727 | 0.4732 | 26.777 |
| 1.3579 | 5.0 | 5000 | 1.5363 | 0.5205 | 0.3353 | 0.4743 | 0.4744 | 27.916 |
| 1.2345 | 6.0 | 6000 | 1.5543 | 0.5193 | 0.338 | 0.4772 | 0.4769 | 25.6475 |
| 1.1663 | 7.0 | 7000 | 1.5570 | 0.5299 | 0.3449 | 0.4837 | 0.4837 | 26.9075 |
| 1.0754 | 8.0 | 8000 | 1.5953 | 0.5289 | 0.3422 | 0.4804 | 0.4804 | 29.1995 |
| 0.9901 | 9.0 | 9000 | 1.6392 | 0.5333 | 0.3443 | 0.483 | 0.4831 | 28.9815 |
| 0.9321 | 10.0 | 10000 | 1.6641 | 0.5269 | 0.3361 | 0.4764 | 0.4765 | 28.8695 |
| 0.87 | 11.0 | 11000 | 1.7062 | 0.5299 | 0.3409 | 0.4793 | 0.4794 | 29.366 |
| 0.8062 | 12.0 | 12000 | 1.7558 | 0.5287 | 0.342 | 0.4794 | 0.4798 | 29.29 |
| 0.7595 | 13.0 | 13000 | 1.8033 | 0.5256 | 0.3402 | 0.4784 | 0.4783 | 29.204 |
| 0.7195 | 14.0 | 14000 | 1.8229 | 0.5293 | 0.3425 | 0.4802 | 0.4803 | 30.156 |
| 0.668 | 15.0 | 15000 | 1.8817 | 0.5288 | 0.3421 | 0.4791 | 0.4792 | 30.1525 |
| 0.6283 | 16.0 | 16000 | 1.9278 | 0.5294 | 0.3404 | 0.478 | 0.4778 | 29.942 |
| 0.5957 | 17.0 | 17000 | 1.9536 | 0.5312 | 0.3416 | 0.4807 | 0.4809 | 29.525 |
| 0.5496 | 18.0 | 18000 | 2.0396 | 0.5309 | 0.3403 | 0.4788 | 0.479 | 30.359 |
| 0.5208 | 19.0 | 19000 | 2.0539 | 0.5312 | 0.3442 | 0.4813 | 0.481 | 30.173 |
| 0.491 | 20.0 | 20000 | 2.0836 | 0.5297 | 0.3395 | 0.4794 | 0.4792 | 29.554 |
| 0.4522 | 21.0 | 21000 | 2.1548 | 0.5282 | 0.3396 | 0.4751 | 0.4753 | 31.565 |
| 0.4339 | 22.0 | 22000 | 2.2076 | 0.5264 | 0.338 | 0.476 | 0.476 | 30.0425 |
| 0.4095 | 23.0 | 23000 | 2.2331 | 0.5258 | 0.3366 | 0.4751 | 0.475 | 31.307 |
| 0.3818 | 24.0 | 24000 | 2.3036 | 0.5275 | 0.3371 | 0.4756 | 0.4753 | 31.8185 |
| 0.362 | 25.0 | 25000 | 2.3462 | 0.529 | 0.3374 | 0.4739 | 0.4741 | 32.9885 |
| 0.3414 | 26.0 | 26000 | 2.3989 | 0.5335 | 0.3444 | 0.482 | 0.4819 | 30.4255 |
| 0.3188 | 27.0 | 27000 | 2.4419 | 0.5257 | 0.3367 | 0.4745 | 0.4744 | 30.6095 |
| 0.2976 | 28.0 | 28000 | 2.4965 | 0.5256 | 0.3336 | 0.4702 | 0.4701 | 33.6375 |
| 0.2896 | 29.0 | 29000 | 2.4841 | 0.5254 | 0.3341 | 0.4725 | 0.4725 | 32.7325 |
| 0.2702 | 30.0 | 30000 | 2.5704 | 0.5298 | 0.3399 | 0.4775 | 0.4778 | 31.307 |
| 0.2583 | 31.0 | 31000 | 2.6376 | 0.5306 | 0.3411 | 0.4773 | 0.4774 | 31.0695 |
| 0.2472 | 32.0 | 32000 | 2.6134 | 0.5266 | 0.3376 | 0.4729 | 0.473 | 32.3075 |
| 0.2361 | 33.0 | 33000 | 2.6922 | 0.5294 | 0.3391 | 0.4763 | 0.4764 | 31.5785 |
| 0.2242 | 34.0 | 34000 | 2.7246 | 0.5292 | 0.3383 | 0.4745 | 0.4747 | 32.823 |
| 0.2173 | 35.0 | 35000 | 2.7647 | 0.5294 | 0.3386 | 0.4754 | 0.4754 | 32.0915 |
| 0.2057 | 36.0 | 36000 | 2.7717 | 0.5297 | 0.343 | 0.4781 | 0.4781 | 32.132 |
| 0.1957 | 37.0 | 37000 | 2.8077 | 0.5257 | 0.3372 | 0.4729 | 0.4728 | 32.147 |
| 0.1895 | 38.0 | 38000 | 2.8661 | 0.5268 | 0.3375 | 0.4733 | 0.4734 | 32.156 |
| 0.1818 | 39.0 | 39000 | 2.8841 | 0.5272 | 0.3388 | 0.4747 | 0.475 | 31.3275 |
| 0.1749 | 40.0 | 40000 | 2.9060 | 0.5278 | 0.3395 | 0.4752 | 0.4751 | 31.835 |
| 0.1705 | 41.0 | 41000 | 2.9260 | 0.5262 | 0.3365 | 0.4729 | 0.4732 | 32.3635 |
| 0.163 | 42.0 | 42000 | 2.9924 | 0.5284 | 0.3383 | 0.4754 | 0.4754 | 31.4935 |
| 0.163 | 43.0 | 43000 | 2.9798 | 0.5299 | 0.3403 | 0.4762 | 0.4765 | 31.8165 |
| 0.1583 | 44.0 | 44000 | 2.9919 | 0.5291 | 0.3397 | 0.4755 | 0.4759 | 31.6065 |
| 0.1537 | 45.0 | 45000 | 3.0308 | 0.5281 | 0.3381 | 0.4748 | 0.4749 | 31.447 |
| 0.1493 | 46.0 | 46000 | 3.0491 | 0.5287 | 0.339 | 0.4753 | 0.4755 | 31.944 |
| 0.1437 | 47.0 | 47000 | 3.0595 | 0.5282 | 0.3383 | 0.4744 | 0.4746 | 31.833 |
| 0.1437 | 48.0 | 48000 | 3.0804 | 0.5307 | 0.3401 | 0.477 | 0.4771 | 31.837 |
| 0.1435 | 49.0 | 49000 | 3.0782 | 0.5312 | 0.3406 | 0.4772 | 0.4772 | 31.798 |
| 0.1392 | 50.0 | 50000 | 3.0847 | 0.5303 | 0.3398 | 0.477 | 0.477 | 31.974 |
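The table shows a classic overfitting pattern: validation loss bottoms out at 1.5015 after epoch 3 and rises steadily over the remaining 47 epochs while training loss keeps falling, and ROUGE scores plateau around 0.53 from roughly epoch 7 onward. An intermediate checkpoint would likely generalize better than the final weights reported above; when retraining, setting `load_best_model_at_end=True` with `metric_for_best_model="eval_loss"` keeps the best checkpoint automatically.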
### Framework versions
- Transformers 4.45.1
- PyTorch 2.2.1
- Datasets 3.0.1
- Tokenizers 0.20.0