---
library_name: transformers
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: long_t5_4
  results: []
---

# long_t5_4

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unspecified dataset.
It achieves the following results on the evaluation set (final epoch):
- Loss: 3.0847
- Rouge1: 0.5303
- Rouge2: 0.3398
- Rougel: 0.4770
- Rougelsum: 0.4770
- Gen Len: 31.974

## Model description

[LongT5](https://huggingface.co/google/long-t5-tglobal-base) is an encoder-decoder Transformer that extends T5 with transient-global (tglobal) attention so it can process much longer inputs than standard T5. This checkpoint fine-tunes the base-size variant for a sequence-to-sequence generation task. The task itself is not recorded in this card, but the ROUGE metrics and the average generation length of roughly 32 tokens suggest short-output generation over long documents, such as abstractive summarization.

## Intended uses & limitations

The intended task is not documented. Given the ROUGE metrics and short generation lengths, long-document summarization (or a similar short-output task) is the most plausible use; a hedged usage sketch follows. Note also that the final checkpoint has a much higher validation loss than the best mid-training checkpoints (see the training results below), so its outputs may reflect overfitting.
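
A minimal inference sketch, assuming the fine-tuned weights are available under the checkpoint path shown (a placeholder; replace it with the actual repo id or local directory):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint path; replace with the actual repo id or local directory.
checkpoint = "long_t5_4"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

text = "..."  # a long input document

# LongT5 accepts long inputs; the 4096-token cap here is an assumption,
# since the card does not record the sequence length used in training.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)

# Evaluation Gen Len averaged ~32 tokens, so a 64-token budget is generous.
summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```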

## Training and evaluation data

The fine-tuning dataset is not recorded in this card. From the training log below, each epoch spans 1,000 optimizer steps at a train batch size of 8, which implies roughly 8,000 training examples (assuming a single device and no gradient accumulation).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
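
A minimal sketch of how these settings map onto `Seq2SeqTrainingArguments`; the dataset, preprocessing, and `compute_metrics` function are not recorded in the card and are left as placeholders:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Mirrors the hyperparameters listed above; the Adam betas and epsilon in the
# card are the Trainer defaults, so they need no explicit arguments here.
training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5_4",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    predict_with_generate=True,  # needed for ROUGE / Gen Len during evaluation
    eval_strategy="epoch",       # assumption: the table reports one eval per epoch
)

# trainer = Seq2SeqTrainer(
#     model=model,
#     args=training_args,
#     train_dataset=...,   # not recorded in this card
#     eval_dataset=...,    # not recorded in this card
#     tokenizer=tokenizer,
#     compute_metrics=..., # ROUGE
# )
# trainer.train()
```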

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.0147        | 1.0   | 1000  | 1.5675          | 0.4907 | 0.3059 | 0.4453 | 0.4454    | 25.7975 |
| 1.7618        | 2.0   | 2000  | 1.5138          | 0.5037 | 0.3169 | 0.4578 | 0.458     | 26.608  |
| 1.5904        | 3.0   | 3000  | 1.5015          | 0.5091 | 0.3239 | 0.4645 | 0.4648    | 25.5405 |
| 1.4555        | 4.0   | 4000  | 1.5083          | 0.5183 | 0.3335 | 0.4727 | 0.4732    | 26.777  |
| 1.3579        | 5.0   | 5000  | 1.5363          | 0.5205 | 0.3353 | 0.4743 | 0.4744    | 27.916  |
| 1.2345        | 6.0   | 6000  | 1.5543          | 0.5193 | 0.338  | 0.4772 | 0.4769    | 25.6475 |
| 1.1663        | 7.0   | 7000  | 1.5570          | 0.5299 | 0.3449 | 0.4837 | 0.4837    | 26.9075 |
| 1.0754        | 8.0   | 8000  | 1.5953          | 0.5289 | 0.3422 | 0.4804 | 0.4804    | 29.1995 |
| 0.9901        | 9.0   | 9000  | 1.6392          | 0.5333 | 0.3443 | 0.483  | 0.4831    | 28.9815 |
| 0.9321        | 10.0  | 10000 | 1.6641          | 0.5269 | 0.3361 | 0.4764 | 0.4765    | 28.8695 |
| 0.87          | 11.0  | 11000 | 1.7062          | 0.5299 | 0.3409 | 0.4793 | 0.4794    | 29.366  |
| 0.8062        | 12.0  | 12000 | 1.7558          | 0.5287 | 0.342  | 0.4794 | 0.4798    | 29.29   |
| 0.7595        | 13.0  | 13000 | 1.8033          | 0.5256 | 0.3402 | 0.4784 | 0.4783    | 29.204  |
| 0.7195        | 14.0  | 14000 | 1.8229          | 0.5293 | 0.3425 | 0.4802 | 0.4803    | 30.156  |
| 0.668         | 15.0  | 15000 | 1.8817          | 0.5288 | 0.3421 | 0.4791 | 0.4792    | 30.1525 |
| 0.6283        | 16.0  | 16000 | 1.9278          | 0.5294 | 0.3404 | 0.478  | 0.4778    | 29.942  |
| 0.5957        | 17.0  | 17000 | 1.9536          | 0.5312 | 0.3416 | 0.4807 | 0.4809    | 29.525  |
| 0.5496        | 18.0  | 18000 | 2.0396          | 0.5309 | 0.3403 | 0.4788 | 0.479     | 30.359  |
| 0.5208        | 19.0  | 19000 | 2.0539          | 0.5312 | 0.3442 | 0.4813 | 0.481     | 30.173  |
| 0.491         | 20.0  | 20000 | 2.0836          | 0.5297 | 0.3395 | 0.4794 | 0.4792    | 29.554  |
| 0.4522        | 21.0  | 21000 | 2.1548          | 0.5282 | 0.3396 | 0.4751 | 0.4753    | 31.565  |
| 0.4339        | 22.0  | 22000 | 2.2076          | 0.5264 | 0.338  | 0.476  | 0.476     | 30.0425 |
| 0.4095        | 23.0  | 23000 | 2.2331          | 0.5258 | 0.3366 | 0.4751 | 0.475     | 31.307  |
| 0.3818        | 24.0  | 24000 | 2.3036          | 0.5275 | 0.3371 | 0.4756 | 0.4753    | 31.8185 |
| 0.362         | 25.0  | 25000 | 2.3462          | 0.529  | 0.3374 | 0.4739 | 0.4741    | 32.9885 |
| 0.3414        | 26.0  | 26000 | 2.3989          | 0.5335 | 0.3444 | 0.482  | 0.4819    | 30.4255 |
| 0.3188        | 27.0  | 27000 | 2.4419          | 0.5257 | 0.3367 | 0.4745 | 0.4744    | 30.6095 |
| 0.2976        | 28.0  | 28000 | 2.4965          | 0.5256 | 0.3336 | 0.4702 | 0.4701    | 33.6375 |
| 0.2896        | 29.0  | 29000 | 2.4841          | 0.5254 | 0.3341 | 0.4725 | 0.4725    | 32.7325 |
| 0.2702        | 30.0  | 30000 | 2.5704          | 0.5298 | 0.3399 | 0.4775 | 0.4778    | 31.307  |
| 0.2583        | 31.0  | 31000 | 2.6376          | 0.5306 | 0.3411 | 0.4773 | 0.4774    | 31.0695 |
| 0.2472        | 32.0  | 32000 | 2.6134          | 0.5266 | 0.3376 | 0.4729 | 0.473     | 32.3075 |
| 0.2361        | 33.0  | 33000 | 2.6922          | 0.5294 | 0.3391 | 0.4763 | 0.4764    | 31.5785 |
| 0.2242        | 34.0  | 34000 | 2.7246          | 0.5292 | 0.3383 | 0.4745 | 0.4747    | 32.823  |
| 0.2173        | 35.0  | 35000 | 2.7647          | 0.5294 | 0.3386 | 0.4754 | 0.4754    | 32.0915 |
| 0.2057        | 36.0  | 36000 | 2.7717          | 0.5297 | 0.343  | 0.4781 | 0.4781    | 32.132  |
| 0.1957        | 37.0  | 37000 | 2.8077          | 0.5257 | 0.3372 | 0.4729 | 0.4728    | 32.147  |
| 0.1895        | 38.0  | 38000 | 2.8661          | 0.5268 | 0.3375 | 0.4733 | 0.4734    | 32.156  |
| 0.1818        | 39.0  | 39000 | 2.8841          | 0.5272 | 0.3388 | 0.4747 | 0.475     | 31.3275 |
| 0.1749        | 40.0  | 40000 | 2.9060          | 0.5278 | 0.3395 | 0.4752 | 0.4751    | 31.835  |
| 0.1705        | 41.0  | 41000 | 2.9260          | 0.5262 | 0.3365 | 0.4729 | 0.4732    | 32.3635 |
| 0.163         | 42.0  | 42000 | 2.9924          | 0.5284 | 0.3383 | 0.4754 | 0.4754    | 31.4935 |
| 0.163         | 43.0  | 43000 | 2.9798          | 0.5299 | 0.3403 | 0.4762 | 0.4765    | 31.8165 |
| 0.1583        | 44.0  | 44000 | 2.9919          | 0.5291 | 0.3397 | 0.4755 | 0.4759    | 31.6065 |
| 0.1537        | 45.0  | 45000 | 3.0308          | 0.5281 | 0.3381 | 0.4748 | 0.4749    | 31.447  |
| 0.1493        | 46.0  | 46000 | 3.0491          | 0.5287 | 0.339  | 0.4753 | 0.4755    | 31.944  |
| 0.1437        | 47.0  | 47000 | 3.0595          | 0.5282 | 0.3383 | 0.4744 | 0.4746    | 31.833  |
| 0.1437        | 48.0  | 48000 | 3.0804          | 0.5307 | 0.3401 | 0.477  | 0.4771    | 31.837  |
| 0.1435        | 49.0  | 49000 | 3.0782          | 0.5312 | 0.3406 | 0.4772 | 0.4772    | 31.798  |
| 0.1392        | 50.0  | 50000 | 3.0847          | 0.5303 | 0.3398 | 0.477  | 0.477     | 31.974  |
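
Validation loss bottoms out at 1.5015 at epoch 3 and climbs steadily afterwards while training loss keeps falling, a classic overfitting pattern; ROUGE scores plateau from roughly epoch 7 onward, so an earlier checkpoint may be preferable to the final one reported above. The ROUGE values were presumably computed with the `rouge` metric from the `evaluate` library, the usual choice in `Trainer`-generated cards; a minimal sketch:

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the model's generated summary"]  # decoded generate() output
references = ["the reference summary"]           # gold target texts

print(rouge.compute(predictions=predictions, references=references))
# -> {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```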


### Framework versions

- Transformers 4.45.1
- Pytorch 2.2.1
- Datasets 3.0.1
- Tokenizers 0.20.0