---
license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
metrics:
  - rouge
  - wer
model-index:
  - name: bart_billsum_abstractive_1024_1000
    results: []
---

bart_billsum_abstractive_1024_1000

This model is a fine-tuned version of facebook/bart-large; judging by the model name, it was trained for abstractive summarization on the BillSum dataset of US congressional bills. It achieves the following results on the evaluation set:

  • Loss: 1.0789
  • ROUGE-1: 0.6795
  • ROUGE-2: 0.4076
  • ROUGE-L: 0.6139
  • ROUGE-Lsum: 0.6139
  • WER: 0.4803
  • BLEURT: -0.0583
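
The ROUGE and WER numbers can be reproduced with the Hugging Face evaluate library. This is a sketch under the assumption that the card's scores come from evaluate's standard rouge and wer modules (BLEURT additionally requires the bleurt package and a checkpoint, so it is omitted here):

```python
# Sketch: reproduce ROUGE/WER scoring with the `evaluate` library.
# The card does not show the actual compute_metrics function, so the
# metric configuration here is an assumption.
import evaluate

rouge = evaluate.load("rouge")  # returns rouge1 / rouge2 / rougeL / rougeLsum
wer = evaluate.load("wer")      # word error rate

predictions = ["the bill amends the act to extend funding through 2025"]
references = ["this bill amends the act to extend funding through 2025"]

print(rouge.compute(predictions=predictions, references=references))
print("WER:", wer.compute(predictions=predictions, references=references))
```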

Model description

Judging by the name, this is facebook/bart-large fine-tuned for abstractive summarization of congressional bills (BillSum), with source documents truncated to 1024 tokens. No further details are provided.
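
As a minimal usage sketch (assuming the checkpoint is published under the repo id jordanfan/bart_billsum_abstractive_1024_1000, matching the name above):

```python
# Minimal inference sketch; the repo id is inferred from the card, not confirmed.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="jordanfan/bart_billsum_abstractive_1024_1000",
)

bill_text = "SECTION 1. SHORT TITLE. This Act may be cited as ..."
summary = summarizer(
    bill_text,
    max_length=256,   # generation length caps are illustrative choices
    min_length=64,
    truncation=True,  # BART-large accepts at most 1024 source tokens
)[0]["summary_text"]
print(summary)
```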

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a training-arguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 6
  • eval_batch_size: 6
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
  • mixed_precision_training: Native AMP
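
A minimal sketch of how these settings map onto Seq2SeqTrainingArguments; the output_dir and the evaluation/logging cadence are assumptions (the cadence is inferred from the results table below, which reports validation every 250 steps and training loss every 500), everything else mirrors the list above:

```python
# Sketch only: reconstructs the listed hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart_billsum_abstractive_1024_1000",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the optimizer defaults.
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="steps",  # inferred: the table evaluates every 250 steps
    eval_steps=250,
    logging_steps=500,            # inferred: training loss is logged every 500 steps
)
```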

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | WER    | BLEURT  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:------:|:-------:|
| No log        | 0.14  | 250  | 1.3122          | 0.6345  | 0.3515  | 0.5637  | 0.5638     | 0.5303 | -0.3533 |
| 2.3005        | 0.27  | 500  | 1.2468          | 0.6452  | 0.3662  | 0.5767  | 0.5767     | 0.5174 | -0.4992 |
| 2.3005        | 0.41  | 750  | 1.1909          | 0.6513  | 0.3745  | 0.5823  | 0.5823     | 0.5094 | -0.4679 |
| 1.3108        | 0.55  | 1000 | 1.1685          | 0.6605  | 0.3827  | 0.5928  | 0.5928     | 0.5037 | -0.1431 |
| 1.3108        | 0.68  | 1250 | 1.1505          | 0.6671  | 0.3894  | 0.5984  | 0.5984     | 0.4996 | -0.0701 |
| 1.2615        | 0.82  | 1500 | 1.1334          | 0.6616  | 0.3883  | 0.5949  | 0.5949     | 0.4953 | -0.3277 |
| 1.2615        | 0.96  | 1750 | 1.1226          | 0.6692  | 0.3948  | 0.6035  | 0.6035     | 0.492  | -0.0701 |
| 1.1939        | 1.09  | 2000 | 1.1148          | 0.6669  | 0.3942  | 0.6007  | 0.6007     | 0.4892 | -0.2128 |
| 1.1939        | 1.23  | 2250 | 1.1110          | 0.6741  | 0.4003  | 0.6072  | 0.6072     | 0.4884 | -0.3492 |
| 1.1268        | 1.36  | 2500 | 1.1111          | 0.6746  | 0.4018  | 0.6093  | 0.6094     | 0.4865 | -0.0701 |
| 1.1268        | 1.5   | 2750 | 1.0927          | 0.6717  | 0.4001  | 0.6054  | 0.6054     | 0.4837 | -0.467  |
| 1.0977        | 1.64  | 3000 | 1.0840          | 0.6756  | 0.4048  | 0.6099  | 0.61       | 0.4814 | -0.2661 |
| 1.0977        | 1.77  | 3250 | 1.0834          | 0.673   | 0.4034  | 0.6077  | 0.6077     | 0.4808 | -0.2082 |
| 1.079         | 1.91  | 3500 | 1.0789          | 0.6795  | 0.4076  | 0.6139  | 0.6139     | 0.4803 | -0.0583 |

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
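
A small sketch to sanity-check that a runtime matches these pinned versions before loading the model (the prefix-match granularity is an arbitrary choice, not something the card prescribes):

```python
# Sketch: compare installed versions against the ones listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    transformers: "4.38.2",
    torch: "2.2.1",  # card lists 2.2.1+cu121; the local build suffix may differ
    datasets: "2.18.0",
    tokenizers: "0.15.2",
}
for module, version in expected.items():
    installed = module.__version__
    if not installed.startswith(version):
        print(f"{module.__name__}: expected {version}, found {installed}")
```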