	---
	license: apache-2.0
	tags:
	- simplification
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: flan-t5-base-clara-med
	  results: []
	---


	# flan-t5-base-clara-med

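	This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base). The training dataset was not recorded when this card was generated (the Trainer reported it as `None`).
	After the final training epoch (epoch 30), it achieves the following results on the evaluation set: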
	- Loss: 1.2682
	- Rouge1: 28.7943
	- Rouge2: 16.031
	- Rougel: 26.7637
	- Rougelsum: 26.8047
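
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-N F1 score can be computed from n-gram overlap. It is a simplification and not the scorer used to produce the numbers above: it assumes plain whitespace tokenization with no stemming, and it scales the result to 0-100 because the reported scores appear to use that range. The function name `rouge_n_f1` and the example sentences are illustrative only.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 on lowercase whitespace tokens, scaled to 0-100."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection clips each n-gram count to the smaller of the two sides.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical reference/candidate pair, not taken from the evaluation set.
reference = "the patient should take the tablet twice a day"
candidate = "the patient takes the tablet two times a day"
score_1 = rouge_n_f1(reference, candidate, n=1)  # unigram overlap
score_2 = rouge_n_f1(reference, candidate, n=2)  # bigram overlap
```

ROUGE-L and ROUGE-Lsum are based on the longest common subsequence instead of fixed-length n-grams and are not covered by this sketch.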

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5.6e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 30
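
To make the `linear` scheduler setting concrete, here is a minimal sketch of the learning rate it implies at a given optimizer step. The total of 11,400 steps is taken from the training results table (30 epochs at 380 steps each). The card does not list a warmup setting, so zero warmup steps is an assumption, and `linear_lr` is an illustrative helper, not a library function.

```python
BASE_LR = 5.6e-05      # learning_rate from the list above
TOTAL_STEPS = 11_400   # 30 epochs x 380 steps per epoch (from the results table)
WARMUP_STEPS = 0       # assumption: the card does not report warmup steps


def linear_lr(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


lr_start = linear_lr(0)          # full base learning rate
lr_middle = linear_lr(5_700)     # halfway through training (end of epoch 15)
lr_end = linear_lr(TOTAL_STEPS)  # decayed to zero
```

Under the same reading, 380 steps per epoch at a batch size of 8 suggests roughly 3,040 training examples, assuming a single device and no gradient accumulation, neither of which the card states.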

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
	| No log | 1.0 | 380 | 1.4589 | 27.2058 | 14.9978 | 25.5534 | 25.5731 |
	| No log | 2.0 | 760 | 1.3896 | 27.2408 | 14.7703 | 25.4948 | 25.5166 |
	| 1.6471 | 3.0 | 1140 | 1.3369 | 27.4133 | 14.8527 | 25.6991 | 25.6951 |
	| 1.6471 | 4.0 | 1520 | 1.3050 | 27.8128 | 15.1101 | 26.1084 | 26.1375 |
	| 1.3668 | 5.0 | 1900 | 1.2909 | 27.8076 | 15.3018 | 26.0053 | 26.0502 |
	| 1.3668 | 6.0 | 2280 | 1.2732 | 27.9007 | 15.2226 | 26.0983 | 26.1265 |
	| 1.3668 | 7.0 | 2660 | 1.2600 | 27.5606 | 14.8875 | 25.6058 | 25.6407 |
	| 1.2209 | 8.0 | 3040 | 1.2499 | 28.0251 | 15.3935 | 26.1269 | 26.1526 |
	| 1.2209 | 9.0 | 3420 | 1.2510 | 28.2472 | 15.5229 | 26.2721 | 26.2975 |
	| 1.1212 | 10.0 | 3800 | 1.2485 | 28.2394 | 15.4929 | 26.2322 | 26.2664 |
	| 1.1212 | 11.0 | 4180 | 1.2380 | 28.3943 | 15.4261 | 26.4591 | 26.5035 |
	| 1.1212 | 12.0 | 4560 | 1.2373 | 28.3341 | 15.5314 | 26.4204 | 26.4567 |
	| 1.0353 | 13.0 | 4940 | 1.2392 | 28.3379 | 15.7147 | 26.4372 | 26.4395 |
	| 1.0353 | 14.0 | 5320 | 1.2436 | 28.6789 | 15.7709 | 26.5923 | 26.6221 |
	| 0.9837 | 15.0 | 5700 | 1.2447 | 28.801 | 15.9612 | 26.7568 | 26.7808 |
	| 0.9837 | 16.0 | 6080 | 1.2406 | 28.3076 | 15.5614 | 26.3192 | 26.3439 |
	| 0.9837 | 17.0 | 6460 | 1.2450 | 28.4635 | 15.8162 | 26.5962 | 26.6047 |
	| 0.9314 | 18.0 | 6840 | 1.2481 | 28.3993 | 15.63 | 26.3544 | 26.4098 |
	| 0.9314 | 19.0 | 7220 | 1.2505 | 28.4367 | 15.8777 | 26.4985 | 26.5426 |
	| 0.8877 | 20.0 | 7600 | 1.2536 | 28.5426 | 15.7746 | 26.5987 | 26.6552 |
	| 0.8877 | 21.0 | 7980 | 1.2524 | 28.8175 | 16.1677 | 26.8577 | 26.9171 |
	| 0.8877 | 22.0 | 8360 | 1.2604 | 28.5719 | 15.9639 | 26.632 | 26.659 |
	| 0.8577 | 23.0 | 8740 | 1.2591 | 28.7079 | 15.878 | 26.7358 | 26.7978 |
	| 0.8577 | 24.0 | 9120 | 1.2606 | 28.6595 | 15.9726 | 26.6673 | 26.7347 |
	| 0.8337 | 25.0 | 9500 | 1.2686 | 28.6858 | 15.9056 | 26.6485 | 26.6785 |
	| 0.8337 | 26.0 | 9880 | 1.2654 | 28.6585 | 16.0482 | 26.688 | 26.7329 |
	| 0.8337 | 27.0 | 10260 | 1.2618 | 28.7773 | 15.9875 | 26.6868 | 26.7367 |
	| 0.8163 | 28.0 | 10640 | 1.2668 | 28.7499 | 16.0041 | 26.7845 | 26.8112 |
	| 0.8163 | 29.0 | 11020 | 1.2671 | 28.7373 | 15.9702 | 26.7276 | 26.763 |
	| 0.8087 | 30.0 | 11400 | 1.2682 | 28.7943 | 16.031 | 26.7637 | 26.8047 |
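
The table can be scanned programmatically to see where validation loss and Rouge1 peak. The sketch below copies the Validation Loss and Rouge1 columns from the table (epochs 1 to 30) and finds the best epoch for each. The variable names are illustrative. The card does not state which checkpoint the published weights correspond to, so this only describes the logged metrics.

```python
# Validation Loss column, epochs 1..30, copied from the table above.
val_loss = [
    1.4589, 1.3896, 1.3369, 1.3050, 1.2909, 1.2732, 1.2600, 1.2499, 1.2510, 1.2485,
    1.2380, 1.2373, 1.2392, 1.2436, 1.2447, 1.2406, 1.2450, 1.2481, 1.2505, 1.2536,
    1.2524, 1.2604, 1.2591, 1.2606, 1.2686, 1.2654, 1.2618, 1.2668, 1.2671, 1.2682,
]
# Rouge1 column, epochs 1..30, copied from the table above.
rouge1 = [
    27.2058, 27.2408, 27.4133, 27.8128, 27.8076, 27.9007, 27.5606, 28.0251, 28.2472, 28.2394,
    28.3943, 28.3341, 28.3379, 28.6789, 28.801, 28.3076, 28.4635, 28.3993, 28.4367, 28.5426,
    28.8175, 28.5719, 28.7079, 28.6595, 28.6858, 28.6585, 28.7773, 28.7499, 28.7373, 28.7943,
]

# Epochs are 1-based, list indices are 0-based.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_rouge1_epoch = max(range(len(rouge1)), key=rouge1.__getitem__) + 1

print(f"lowest validation loss: {val_loss[best_loss_epoch - 1]} at epoch {best_loss_epoch}")
print(f"highest Rouge1: {rouge1[best_rouge1_epoch - 1]} at epoch {best_rouge1_epoch}")
```

In these logs the validation loss bottoms out well before the final epoch while the training loss keeps falling, and the final-epoch Rouge1 is slightly below its peak, which is worth keeping in mind when comparing against the headline results.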


	### Framework versions

	- Transformers 4.25.1
	- PyTorch 1.13.0
	- Datasets 2.8.0
	- Tokenizers 0.12.1