
flan-t5-large-extraction-cnndm_4000-all

This model appears to be a fine-tuned version of google/flan-t5-large on an extraction-oriented subset of CNN/DailyMail (both inferred from the model name; the training data is otherwise undocumented). It achieves the following results on the evaluation set, matching the step-2000 checkpoint in the training log below:

  • Loss: 1.8084
  • Rouge1: 35.2389
  • Rouge2: 15.2731
  • Rougel: 29.9899
  • Rougelsum: 30.0262
  • Gen Len: 19.0
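ROUGE scores of this kind can be reproduced with the `evaluate` library, as in the minimal sketch below. The prediction/reference strings are placeholders, not the actual evaluation data, and the original run (pinned to Transformers 4.18.0, see Framework versions) may instead have used the older `datasets.load_metric("rouge")` API:

```python
# Minimal ROUGE sketch; the texts are placeholders, not the eval data.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],         # model summary (placeholder)
    references=["the cat was sitting on the mat"],  # gold summary (placeholder)
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```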

Model description

More information needed

Intended uses & limitations

More information needed
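Although the intended usage is undocumented, this is a seq2seq model, so loading it follows the standard Transformers pattern. A minimal inference sketch follows; `<org>` is a placeholder for the unknown Hub repository owner, and the `max_length=20` cap is only an assumption based on the evaluation Gen Len of 19.0:

```python
# Hedged inference sketch; the Hub id and generation length are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "<org>/flan-t5-large-extraction-cnndm_4000-all"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "(a CNN/DailyMail-style news article)"  # placeholder input
inputs = tokenizer(article, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=20)  # eval Gen Len above is ~19 tokens
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```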

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 24
  • seed: 1799
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
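The list above maps onto `Seq2SeqTrainingArguments` as in the hedged sketch below. The evaluation cadence (every 200 steps) is read off the training log that follows; the log ends at epoch 6.0 rather than the scheduled 10, and the reported results match the step-2000 row, which is consistent with early stopping and restoring the best checkpoint, though the card does not confirm that. `output_dir` is a placeholder.

```python
# A sketch only; values mirror the card's hyperparameter list, while
# output_dir and the evaluation cadence are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-extraction-cnndm_4000-all",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=24,
    seed=1799,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults,
    # so no explicit adam_beta*/adam_epsilon arguments are needed.
    evaluation_strategy="steps",  # inferred from the 200-step eval log below
    eval_steps=200,
    predict_with_generate=True,   # required to compute ROUGE during evaluation
)
```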

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
1.2214         0.4    200   1.9330           34.7186  15.2527  29.7852  29.8623    19.0
1.2119         0.8    400   1.9119           34.718   15.3471  29.4347  29.4709    19.0
1.1482         1.2    600   2.0060           34.1536  15.0233  29.503   29.518     18.99
1.1102         1.6    800   2.0276           34.8004  15.1277  29.5782  29.6371    18.998
1.1295         2.0    1000  1.9375           35.1942  15.2087  30.156   30.0925    18.996
1.2045         2.4    1200  1.9016           35.5121  15.8033  30.515   30.5451    18.984
1.492          2.8    1400  1.8119           35.0575  15.2373  29.8621  29.9106    19.0
1.4535         3.2    1600  1.8160           35.0796  15.6135  30.1449  30.189     19.0
1.4087         3.6    1800  1.8223           34.9121  15.3203  29.7578  29.8006    18.998
1.4098         4.0    2000  1.8084           35.2389  15.2731  29.9899  30.0262    19.0
1.3759         4.4    2200  1.8357           35.4492  15.8883  30.1135  30.151     19.0
1.3565         4.8    2400  1.8347           34.6559  15.2567  29.5659  29.5704    19.0
1.3268         5.2    2600  1.8416           35.326   15.5918  29.841   29.8391    19.0
1.3204         5.6    2800  1.8445           35.4671  15.5422  30.169   30.1985    19.0
1.3271         6.0    3000  1.8374           35.4057  15.6566  30.2378  30.2328    18.998

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.5.1
  • Tokenizers 0.12.1