metadata
license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: Super_legal_text_summarizer
    results: []

Super_legal_text_summarizer

This model is a fine-tuned version of facebook/bart-large-cnn. The training dataset is not documented. On the evaluation set, the final checkpoint (step 670) achieves the following results:

  • Loss: 1.8242
  • Rouge1: 0.4168
  • Rouge2: 0.1843
  • Rougel: 0.26
  • Rougelsum: 0.2614
  • Gen Len: 126.1232
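To make the ROUGE figures above easier to interpret, here is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed. It is a simplified illustration, not the implementation used to produce the numbers in this card: it assumes lowercase whitespace tokenization with no stemming, and the function names and example sentences are hypothetical.

```python
from collections import Counter


def _f1(overlap: int, n_pred: int, n_ref: int) -> float:
    """F1 from an overlap count and the two sequence lengths."""
    if overlap == 0 or n_pred == 0 or n_ref == 0:
        return 0.0
    precision = overlap / n_pred
    recall = overlap / n_ref
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap on whitespace tokens."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    # Multiset intersection clips n-grams that repeat more often on one side.
    overlap = sum((pred & ref).values())
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def rouge_l(prediction: str, reference: str) -> float:
    """Simplified ROUGE-L F1 from the longest common subsequence of tokens."""
    a, b = prediction.lower().split(), reference.lower().split()
    # Dynamic-programming LCS, keeping only the previous row of the table.
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return _f1(prev[-1], len(a), len(b))


# Hypothetical example pair, not taken from the evaluation set.
reference = "the court dismissed the appeal with costs"
prediction = "the appeal was dismissed by the court"
scores = {
    "rouge1": rouge_n(prediction, reference, 1),
    "rouge2": rouge_n(prediction, reference, 2),
    "rougeL": rouge_l(prediction, reference),
}
```

The pattern in the example mirrors the card: unigram overlap (Rouge1) is typically much higher than bigram overlap (Rouge2), with the order-sensitive RougeL in between.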

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
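The relationship between these values can be sketched in a few lines. The example below shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and how a linear scheduler would decay the learning rate over the run. The single-device setup and the zero warmup are assumptions (the card states neither), and the helper names are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-6
TRAIN_BATCH_SIZE = 3
GRADIENT_ACCUMULATION_STEPS = 4
TOTAL_OPTIMIZER_STEPS = 670  # last step reported in the training results

# ASSUMPTIONS: neither value is stated in this card.
WARMUP_STEPS = 0
NUM_DEVICES = 1


def effective_batch_size() -> int:
    """Number of examples contributing to each optimizer update."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_OPTIMIZER_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_OPTIMIZER_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 335, 670):
        print(f"step {step:>3}: lr = {linear_lr(step):.2e}")
```

Under these assumptions, the learning rate starts at 5e-06, is halved at the midpoint of training, and reaches zero at the final step.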

Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len  |
|---------------|--------|------|-----------------|--------|--------|--------|-----------|----------|
| No log        | 0.9889 | 67   | 2.0691          | 0.3965 | 0.1608 | 0.2317 | 0.2325    | 134.8522 |
| No log        | 1.9926 | 135  | 1.9581          | 0.4184 | 0.1826 | 0.2539 | 0.255     | 133.4433 |
| No log        | 2.9963 | 203  | 1.9041          | 0.4129 | 0.1792 | 0.2554 | 0.2563    | 127.0591 |
| No log        | 4.0    | 271  | 1.8745          | 0.4111 | 0.1769 | 0.2579 | 0.2586    | 126.7635 |
| No log        | 4.9889 | 338  | 1.8539          | 0.4122 | 0.1754 | 0.258  | 0.2586    | 126.0542 |
| No log        | 5.9926 | 406  | 1.8414          | 0.4197 | 0.1806 | 0.2603 | 0.2613    | 130.8177 |
| No log        | 6.9963 | 474  | 1.8334          | 0.4058 | 0.1712 | 0.2532 | 0.2539    | 126.1281 |
| 1.9669        | 8.0    | 542  | 1.8284          | 0.4129 | 0.1818 | 0.2587 | 0.2596    | 125.798  |
| 1.9669        | 8.9889 | 609  | 1.8246          | 0.4129 | 0.1802 | 0.257  | 0.2582    | 126.6158 |
| 1.9669        | 9.8893 | 670  | 1.8242          | 0.4168 | 0.1843 | 0.26   | 0.2614    | 126.1232 |
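The per-epoch results above can be inspected programmatically. The sketch below copies the rows into a small data structure, finds the checkpoints with the lowest validation loss and the highest Rouge1, and infers the number of optimizer steps per epoch from the row that lands exactly on epoch 4.0. The `Row` type and variable names are hypothetical, and the derived micro-batch count assumes the gradient_accumulation_steps of 4 listed in the hyperparameters.

```python
from collections import namedtuple

Row = namedtuple(
    "Row", "epoch step val_loss rouge1 rouge2 rougeL rougeLsum gen_len"
)

# Values copied from the training results reported in this card.
ROWS = [
    Row(0.9889, 67, 2.0691, 0.3965, 0.1608, 0.2317, 0.2325, 134.8522),
    Row(1.9926, 135, 1.9581, 0.4184, 0.1826, 0.2539, 0.2550, 133.4433),
    Row(2.9963, 203, 1.9041, 0.4129, 0.1792, 0.2554, 0.2563, 127.0591),
    Row(4.0, 271, 1.8745, 0.4111, 0.1769, 0.2579, 0.2586, 126.7635),
    Row(4.9889, 338, 1.8539, 0.4122, 0.1754, 0.2580, 0.2586, 126.0542),
    Row(5.9926, 406, 1.8414, 0.4197, 0.1806, 0.2603, 0.2613, 130.8177),
    Row(6.9963, 474, 1.8334, 0.4058, 0.1712, 0.2532, 0.2539, 126.1281),
    Row(8.0, 542, 1.8284, 0.4129, 0.1818, 0.2587, 0.2596, 125.7980),
    Row(8.9889, 609, 1.8246, 0.4129, 0.1802, 0.2570, 0.2582, 126.6158),
    Row(9.8893, 670, 1.8242, 0.4168, 0.1843, 0.2600, 0.2614, 126.1232),
]

# Checkpoint selection by two different criteria.
best_loss = min(ROWS, key=lambda r: r.val_loss)
best_rouge1 = max(ROWS, key=lambda r: r.rouge1)

# Step 271 coincides with epoch 4.0, which gives the step rate directly.
anchor = next(r for r in ROWS if r.epoch == 4.0)
steps_per_epoch = anchor.step / anchor.epoch

# ASSUMPTION: gradient_accumulation_steps = 4, as listed in the hyperparameters.
micro_batches_per_epoch = round(steps_per_epoch * 4)

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss.step)
    print("highest Rouge1 at step", best_rouge1.step)
    print("optimizer steps per epoch:", steps_per_epoch)
    print("micro-batches per epoch:", micro_batches_per_epoch)
```

Validation loss decreases monotonically and is lowest at the final step (670), while Rouge1 peaks earlier at step 406, so the choice of best checkpoint depends on which metric matters for the intended use.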

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1