---
license: apache-2.0
base_model: google/long-t5-tglobal-xl
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: longt5_xl_sfd_bp_15
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longt5_xl_sfd_bp_15

This model is a fine-tuned version of [google/long-t5-tglobal-xl](https://huggingface.co/google/long-t5-tglobal-xl) on an unknown dataset.
It achieves the following results on the evaluation set at the final checkpoint (step 210, epoch 14.61):
- Loss: 2.5840
- Rouge1: 29.7482
- Rouge2: 12.0072
- Rougel: 21.348
- Rougelsum: 28.5849
- Gen Len: 503.5769

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 15.0
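
The `total_train_batch_size` listed above follows from the per-step batch size and the gradient accumulation steps. A minimal sketch of that arithmetic, assuming a single training device (the card reports no device count). The dictionary is illustrative only; its key names are modeled on Hugging Face `TrainingArguments` fields and are not taken from the original training script:

```python
# Hyperparameters copied from the list above. The dictionary is a hypothetical
# container; key names mirror Hugging Face TrainingArguments fields.
hyperparameters = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 32,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 15.0,
}

# Assumption: one device, since the card does not list a device count.
NUM_DEVICES = 1


def effective_batch_size(params, num_devices=NUM_DEVICES):
    """Number of examples that contribute to each optimizer update."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


total_train_batch_size = effective_batch_size(hyperparameters)
print(total_train_batch_size)
```

Under the single-device assumption this reproduces the reported value of 256 (8 × 32 × 1).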

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 2.5763        | 0.97  | 14   | 2.5415          | 10.6052 | 1.4494  | 10.4593 | 10.4801   | 509.6479 |
| 1.8998        | 1.95  | 28   | 1.7398          | 16.7989 | 4.1457  | 16.4049 | 15.1803   | 511.0    |
| 1.6403        | 2.99  | 43   | 1.5457          | 18.4716 | 5.4633  | 17.1393 | 16.9242   | 511.0    |
| 1.5012        | 3.97  | 57   | 1.5736          | 18.2259 | 5.3524  | 17.0162 | 16.7948   | 511.0    |
| 1.248         | 4.94  | 71   | 1.5482          | 20.8275 | 6.7412  | 18.0859 | 19.3113   | 511.0    |
| 1.0176        | 5.98  | 86   | 1.6254          | 21.1937 | 6.8813  | 18.411  | 19.8577   | 510.6775 |
| 0.8472        | 6.96  | 100  | 1.6212          | 26.1873 | 9.1581  | 20.393  | 24.1393   | 479.9704 |
| 0.7242        | 8.0   | 115  | 1.7231          | 23.5881 | 7.8961  | 18.7014 | 22.2999   | 506.9112 |
| 0.5876        | 8.97  | 129  | 1.9401          | 32.1851 | 12.6426 | 22.8358 | 30.6718   | 451.6982 |
| 0.4756        | 9.95  | 143  | 1.9001          | 31.353  | 12.994  | 23.1542 | 29.8375   | 455.5947 |
| 0.4042        | 10.99 | 158  | 2.1295          | 28.6425 | 11.8399 | 21.3847 | 27.0508   | 497.5355 |
| 0.3292        | 11.97 | 172  | 2.2441          | 31.8393 | 13.1308 | 22.135  | 30.5866   | 478.8107 |
| 0.2812        | 12.94 | 186  | 2.3464          | 34.4102 | 14.3607 | 23.8634 | 32.9732   | 429.9911 |
| 0.2443        | 13.98 | 201  | 2.2003          | 34.8239 | 14.8042 | 25.2438 | 33.0469   | 392.5385 |
| 0.1958        | 14.61 | 210  | 2.5840          | 29.7482 | 12.0072 | 21.348  | 28.5849   | 503.5769 |
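
The card does not say how a checkpoint was selected, and the headline metrics match the last row of the table. As a rough sketch of how the table could be inspected, the snippet below finds the rows with the lowest validation loss and the highest Rouge1. The values are copied from the table; the variable names are hypothetical:

```python
# (epoch, step, validation_loss, rouge1), copied from the table above.
rows = [
    (0.97, 14, 2.5415, 10.6052),
    (1.95, 28, 1.7398, 16.7989),
    (2.99, 43, 1.5457, 18.4716),
    (3.97, 57, 1.5736, 18.2259),
    (4.94, 71, 1.5482, 20.8275),
    (5.98, 86, 1.6254, 21.1937),
    (6.96, 100, 1.6212, 26.1873),
    (8.0, 115, 1.7231, 23.5881),
    (8.97, 129, 1.9401, 32.1851),
    (9.95, 143, 1.9001, 31.353),
    (10.99, 158, 2.1295, 28.6425),
    (11.97, 172, 2.2441, 31.8393),
    (12.94, 186, 2.3464, 34.4102),
    (13.98, 201, 2.2003, 34.8239),
    (14.61, 210, 2.5840, 29.7482),
]

# Row with the lowest validation loss.
best_loss_row = min(rows, key=lambda r: r[2])
# Row with the highest Rouge1 score.
best_rouge1_row = max(rows, key=lambda r: r[3])
# Final row, which matches the metrics reported at the top of the card.
final_row = rows[-1]

print("lowest validation loss:", best_loss_row)
print("highest Rouge1:", best_rouge1_row)
print("final checkpoint:", final_row)
```

In this table, validation loss is lowest at step 43 (1.5457), while Rouge1 peaks at step 201 (34.8239). The final checkpoint at step 210 is behind on both measures.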


### Framework versions

- Transformers 4.38.1
- Pytorch 2.2.1+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2