jordanfan/bart_billsum_abstractive_1024_1000

	---
	license: apache-2.0
	base_model: facebook/bart-large
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	- wer
	model-index:
	- name: bart_billsum_abstractive_1024_1000
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bart_billsum_abstractive_1024_1000

	This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unknown dataset.
	It achieves the following results on the evaluation set (these match the last logged evaluation, step 3500, in the training results table):
	- Loss: 1.0789
	- Rouge1: 0.6795
	- Rouge2: 0.4076
	- Rougel: 0.6139
	- Rougelsum: 0.6139
	- Wer: 0.4803
	- Bleurt: -0.0583
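The card does not say how `Wer` was computed. Assuming it is the usual word error rate (word-level edit distance divided by the number of reference words, lower is better), a minimal pure-Python sketch of that definition looks like the following. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the evaluation code used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace, with no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the bill amends the act", "the bill repeals the act"))
```

Under this definition, a value of 0.4803 would correspond to roughly 48 word edits per 100 reference words. Because insertions count as errors, the score can exceed 1.0 when the hypothesis is much longer than the reference.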

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
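Until this section is filled in, the following is a minimal sketch of how a BART checkpoint like this one could be loaded for summarization with the Transformers `pipeline` API. The 128-token summary limit is an assumption, not a documented setting, and the helper name `summarize` is hypothetical. Input truncation relies on the tokenizer's own maximum length.

```python
MODEL_ID = "jordanfan/bart_billsum_abstractive_1024_1000"


def summarize(text: str, max_summary_tokens: int = 128) -> str:
    """Return a generated summary for `text`.

    Assumption: 128 summary tokens is a reasonable limit; the card does not
    document the generation settings used during evaluation.
    """
    # Imported lazily so this module can be imported without transformers
    # installed; calling the function downloads the model weights.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the tokenizer's maximum length.
    result = summarizer(text, max_length=max_summary_tokens, truncation=True)
    return result[0]["summary_text"]
```

Building the pipeline once and reusing it would be preferable when summarizing many documents, since each call here reloads the model.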

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 6
	- eval_batch_size: 6
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 2
	- mixed_precision_training: Native AMP
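To make the `linear` scheduler concrete, the sketch below computes the learning rate at a given step for a linear decay from the base rate of 2e-05 down to zero. The card lists no warmup, so `warmup_steps` defaults to 0 here as an assumption. The total step count is also not stated: `ASSUMED_TOTAL_STEPS` is a rough estimate derived from the training log (step 3500 at epoch 1.91, scaled to 2 epochs) and may be off by a few steps.

```python
BASE_LR = 2e-5              # learning_rate from the list above
ASSUMED_TOTAL_STEPS = 3665  # estimate: 3500 steps / 1.91 epochs * 2 epochs


def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 1000, 2000, 3000, 3500):
        print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

With these assumptions the rate at the last logged evaluation (step 3500) is already close to zero, which is consistent with the small changes in validation loss over the final evaluations.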

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Wer | Bleurt |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:------:|:-------:|
	| No log | 0.14 | 250 | 1.3122 | 0.6345 | 0.3515 | 0.5637 | 0.5638 | 0.5303 | -0.3533 |
	| 2.3005 | 0.27 | 500 | 1.2468 | 0.6452 | 0.3662 | 0.5767 | 0.5767 | 0.5174 | -0.4992 |
	| 2.3005 | 0.41 | 750 | 1.1909 | 0.6513 | 0.3745 | 0.5823 | 0.5823 | 0.5094 | -0.4679 |
	| 1.3108 | 0.55 | 1000 | 1.1685 | 0.6605 | 0.3827 | 0.5928 | 0.5928 | 0.5037 | -0.1431 |
	| 1.3108 | 0.68 | 1250 | 1.1505 | 0.6671 | 0.3894 | 0.5984 | 0.5984 | 0.4996 | -0.0701 |
	| 1.2615 | 0.82 | 1500 | 1.1334 | 0.6616 | 0.3883 | 0.5949 | 0.5949 | 0.4953 | -0.3277 |
	| 1.2615 | 0.96 | 1750 | 1.1226 | 0.6692 | 0.3948 | 0.6035 | 0.6035 | 0.492 | -0.0701 |
	| 1.1939 | 1.09 | 2000 | 1.1148 | 0.6669 | 0.3942 | 0.6007 | 0.6007 | 0.4892 | -0.2128 |
	| 1.1939 | 1.23 | 2250 | 1.1110 | 0.6741 | 0.4003 | 0.6072 | 0.6072 | 0.4884 | -0.3492 |
	| 1.1268 | 1.36 | 2500 | 1.1111 | 0.6746 | 0.4018 | 0.6093 | 0.6094 | 0.4865 | -0.0701 |
	| 1.1268 | 1.5 | 2750 | 1.0927 | 0.6717 | 0.4001 | 0.6054 | 0.6054 | 0.4837 | -0.467 |
	| 1.0977 | 1.64 | 3000 | 1.0840 | 0.6756 | 0.4048 | 0.6099 | 0.61 | 0.4814 | -0.2661 |
	| 1.0977 | 1.77 | 3250 | 1.0834 | 0.673 | 0.4034 | 0.6077 | 0.6077 | 0.4808 | -0.2082 |
	| 1.079 | 1.91 | 3500 | 1.0789 | 0.6795 | 0.4076 | 0.6139 | 0.6139 | 0.4803 | -0.0583 |
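As a small illustration of how the table above can be read, the sketch below copies the step, validation loss, and Rouge1 columns into a list and picks out the best evaluation by each metric, plus any steps where validation loss rose relative to the previous evaluation. The variable names are illustrative, and only values already present in the table are used.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (250, 1.3122, 0.6345), (500, 1.2468, 0.6452), (750, 1.1909, 0.6513),
    (1000, 1.1685, 0.6605), (1250, 1.1505, 0.6671), (1500, 1.1334, 0.6616),
    (1750, 1.1226, 0.6692), (2000, 1.1148, 0.6669), (2250, 1.1110, 0.6741),
    (2500, 1.1111, 0.6746), (2750, 1.0927, 0.6717), (3000, 1.0840, 0.6756),
    (3250, 1.0834, 0.6730), (3500, 1.0789, 0.6795),
]

# Evaluation with the lowest validation loss and the one with the highest Rouge1.
best_loss_row = min(ROWS, key=lambda row: row[1])
best_rouge1_row = max(ROWS, key=lambda row: row[2])

# Steps at which validation loss increased compared with the previous evaluation.
loss_regressions = [
    curr[0] for prev, curr in zip(ROWS, ROWS[1:]) if curr[1] > prev[1]
]

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss_row[0])
    print("highest Rouge1 at step", best_rouge1_row[0])
    print("validation loss rose at steps", loss_regressions)
```

Both criteria select the last logged evaluation, which is why the headline results at the top of the card match the final table row.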


	### Framework versions

	- Transformers 4.38.2
	- Pytorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2