learn3r/longt5_xl_sfd_bp_15

	---
	license: apache-2.0
	base_model: google/long-t5-tglobal-xl
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: longt5_xl_sfd_bp_15
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# longt5_xl_sfd_bp_15

	This model is a fine-tuned version of [google/long-t5-tglobal-xl](https://huggingface.co/google/long-t5-tglobal-xl) on an unknown dataset.
	At the final checkpoint (step 210, epoch 14.61), it achieves the following results on the evaluation set:
	- Loss: 2.5840
	- Rouge1: 29.7482
	- Rouge2: 12.0072
	- Rougel: 21.348
	- Rougelsum: 28.5849
	- Gen Len: 503.5769
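Since the card gives no usage instructions, here is a minimal sketch of running the checkpoint with the `transformers` Auto classes. The repository id comes from this page and the base model id from the front matter. Everything else is an assumption: that the base model's tokenizer is the right one, that inputs may be truncated at 16384 tokens, and that 512 new tokens is a sensible cap (the reported Gen Len values of roughly 504 to 511 hint at a limit near that, but the card does not state it). The function name `generate_text` is hypothetical.

```python
BASE_MODEL_ID = "google/long-t5-tglobal-xl"   # from the card's base_model field
FINETUNED_ID = "learn3r/longt5_xl_sfd_bp_15"  # repository id of this card

# Assumed limits; neither value is stated in the card.
MAX_INPUT_TOKENS = 16384
MAX_NEW_TOKENS = 512


def generate_text(text: str) -> str:
    """Generate one output sequence for `text` with the fine-tuned model."""
    # Imported lazily so the constants above can be inspected without
    # transformers or torch being installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumption: fine-tuning did not change the vocabulary, so the base
    # model's tokenizer is compatible with the fine-tuned weights.
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(FINETUNED_ID)

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(**inputs, max_new_tokens=MAX_NEW_TOKENS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text(...)` downloads the XL-sized weights, so it is best run on a machine with enough memory for a model of that size.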

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 32
	- total_train_batch_size: 256
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: constant
	- num_epochs: 15.0
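As a rough sketch, the list above maps onto `Seq2SeqTrainingArguments` keyword names as follows. The sketch also shows how the reported total batch size falls out of the per-device batch size and gradient accumulation. The single-device count is an inference from 8 × 32 = 256, the `output_dir` value and `predict_with_generate=True` are assumptions, and `build_training_args` is a hypothetical helper, not the author's script.

```python
# Values copied from the hyperparameter list; keys use the Trainer's
# argument names (train_batch_size is reported per device).
hparams = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 32,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 15.0,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
}

# Assumption: one device, which is what 8 * 32 = 256 implies.
NUM_DEVICES = 1

effective_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
    * NUM_DEVICES
)


def build_training_args(output_dir: str = "longt5_xl_sfd_bp_15"):
    """Hypothetical helper that turns `hparams` into training arguments."""
    # Imported lazily so the arithmetic above runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,        # assumed name
        predict_with_generate=True,   # assumed; needed for ROUGE on generations
        **hparams,
    )
```

With these values `effective_batch_size` matches the `total_train_batch_size` of 256 reported above.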

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
	| 2.5763 | 0.97 | 14 | 2.5415 | 10.6052 | 1.4494 | 10.4593 | 10.4801 | 509.6479 |
	| 1.8998 | 1.95 | 28 | 1.7398 | 16.7989 | 4.1457 | 16.4049 | 15.1803 | 511.0 |
	| 1.6403 | 2.99 | 43 | 1.5457 | 18.4716 | 5.4633 | 17.1393 | 16.9242 | 511.0 |
	| 1.5012 | 3.97 | 57 | 1.5736 | 18.2259 | 5.3524 | 17.0162 | 16.7948 | 511.0 |
	| 1.248 | 4.94 | 71 | 1.5482 | 20.8275 | 6.7412 | 18.0859 | 19.3113 | 511.0 |
	| 1.0176 | 5.98 | 86 | 1.6254 | 21.1937 | 6.8813 | 18.411 | 19.8577 | 510.6775 |
	| 0.8472 | 6.96 | 100 | 1.6212 | 26.1873 | 9.1581 | 20.393 | 24.1393 | 479.9704 |
	| 0.7242 | 8.0 | 115 | 1.7231 | 23.5881 | 7.8961 | 18.7014 | 22.2999 | 506.9112 |
	| 0.5876 | 8.97 | 129 | 1.9401 | 32.1851 | 12.6426 | 22.8358 | 30.6718 | 451.6982 |
	| 0.4756 | 9.95 | 143 | 1.9001 | 31.353 | 12.994 | 23.1542 | 29.8375 | 455.5947 |
	| 0.4042 | 10.99 | 158 | 2.1295 | 28.6425 | 11.8399 | 21.3847 | 27.0508 | 497.5355 |
	| 0.3292 | 11.97 | 172 | 2.2441 | 31.8393 | 13.1308 | 22.135 | 30.5866 | 478.8107 |
	| 0.2812 | 12.94 | 186 | 2.3464 | 34.4102 | 14.3607 | 23.8634 | 32.9732 | 429.9911 |
	| 0.2443 | 13.98 | 201 | 2.2003 | 34.8239 | 14.8042 | 25.2438 | 33.0469 | 392.5385 |
	| 0.1958 | 14.61 | 210 | 2.5840 | 29.7482 | 12.0072 | 21.348 | 28.5849 | 503.5769 |
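The rows above can be compared programmatically to see which evaluation step scored best on each criterion. The sketch below copies a subset of the columns from the table and picks the step with the lowest validation loss and the step with the highest Rouge1. The `EvalRow` type is a hypothetical name introduced for illustration. Whether intermediate checkpoints were saved is not stated in the card, so this only ranks the logged evaluations.

```python
from collections import namedtuple

# Hypothetical record type; the values are copied from the table above.
EvalRow = namedtuple("EvalRow", "train_loss epoch step val_loss rouge1")

rows = [
    EvalRow(2.5763, 0.97, 14, 2.5415, 10.6052),
    EvalRow(1.8998, 1.95, 28, 1.7398, 16.7989),
    EvalRow(1.6403, 2.99, 43, 1.5457, 18.4716),
    EvalRow(1.5012, 3.97, 57, 1.5736, 18.2259),
    EvalRow(1.2480, 4.94, 71, 1.5482, 20.8275),
    EvalRow(1.0176, 5.98, 86, 1.6254, 21.1937),
    EvalRow(0.8472, 6.96, 100, 1.6212, 26.1873),
    EvalRow(0.7242, 8.00, 115, 1.7231, 23.5881),
    EvalRow(0.5876, 8.97, 129, 1.9401, 32.1851),
    EvalRow(0.4756, 9.95, 143, 1.9001, 31.3530),
    EvalRow(0.4042, 10.99, 158, 2.1295, 28.6425),
    EvalRow(0.3292, 11.97, 172, 2.2441, 31.8393),
    EvalRow(0.2812, 12.94, 186, 2.3464, 34.4102),
    EvalRow(0.2443, 13.98, 201, 2.2003, 34.8239),
    EvalRow(0.1958, 14.61, 210, 2.5840, 29.7482),
]

# Rank the logged evaluations by two different criteria.
best_by_val_loss = min(rows, key=lambda r: r.val_loss)
best_by_rouge1 = max(rows, key=lambda r: r.rouge1)
final = rows[-1]

print(f"lowest validation loss: step {best_by_val_loss.step}")
print(f"highest Rouge1:         step {best_by_rouge1.step}")
print(f"final evaluation:       step {final.step}")
```

The two criteria disagree: validation loss is lowest early in training and climbs afterwards while training loss keeps falling, whereas Rouge1 peaks at the second-to-last evaluation. The final step, which the headline results report, is best on neither.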


	### Framework versions

	- Transformers 4.38.1
	- Pytorch 2.2.1+cu121
	- Datasets 2.17.1
	- Tokenizers 0.15.2