ft5-bleu-durga-q1-clean

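This model is a fine-tuned version of google/flan-t5-base on a dataset whose name was not recorded (the training metadata lists it as None). It achieves the following results on the evaluation set after the final epoch (epoch 30, step 90):

Loss: 0.2757
Bleu: 0.1299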

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 24
eval_batch_size: 24
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 30
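
To make the linear schedule concrete, here is a minimal, standard-library sketch of how the learning rate would decay over the run. It assumes zero warmup steps (no warmup value is listed above), a single device, and no gradient accumulation. The names `linear_lr` and `train_set_size_bounds` are hypothetical helpers for illustration, not part of any library. The 3 steps per epoch are read off the results table, which logs steps 3, 6, ..., 90.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 24
NUM_EPOCHS = 30

# Read off the results table (steps 3, 6, ..., 90).
STEPS_PER_EPOCH = 3

# ASSUMPTION: the card lists no warmup, so zero warmup steps are used here.
WARMUP_STEPS = 0

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH,
                          batch_size=TRAIN_BATCH_SIZE):
    """Range of training-set sizes n with ceil(n / batch_size) == steps_per_epoch.

    ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    assert math.ceil(low / batch_size) == steps_per_epoch
    assert math.ceil(high / batch_size) == steps_per_epoch
    return low, high


if __name__ == "__main__":
    for epoch in (0, 10, 20, 30):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:2d}): lr = {linear_lr(step):.2e}")
    print("implied training-set size range:", train_set_size_bounds())
```

Under these assumptions the run takes only 90 optimizer steps in total, and the step count implies a training set of roughly 49 to 72 examples, which is worth bearing in mind when reading the BLEU scores.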

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu
3.0328	1.0	3	2.0694	0.0089
1.8606	2.0	6	1.8619	0.0161
2.8564	3.0	9	1.6705	0.0171
1.4505	4.0	12	1.5098	0.0110
1.5253	5.0	15	1.3533	0.0157
1.9042	6.0	18	1.2238	0.0269
1.7751	7.0	21	1.1186	0.0307
1.1806	8.0	24	1.0301	0.0299
1.0575	9.0	27	0.9449	0.0399
0.9038	10.0	30	0.8609	0.0441
1.4179	11.0	33	0.7875	0.0409
1.435	12.0	36	0.7099	0.0473
0.6608	13.0	39	0.6601	0.0613
0.4501	14.0	42	0.6165	0.0661
0.5584	15.0	45	0.5831	0.0654
1.1468	16.0	48	0.5449	0.0765
0.5354	17.0	51	0.5032	0.0657
0.8386	18.0	54	0.4703	0.0599
0.7596	19.0	57	0.4393	0.0768
0.9838	20.0	60	0.4084	0.0894
1.0146	21.0	63	0.3784	0.0832
0.6939	22.0	66	0.3560	0.0947
0.7661	23.0	69	0.3385	0.0965
0.9235	24.0	72	0.3231	0.1130
0.5005	25.0	75	0.3099	0.1199
0.382	26.0	78	0.2982	0.1248
0.5445	27.0	81	0.2882	0.1305
0.5974	28.0	84	0.2816	0.1305
0.6571	29.0	87	0.2775	0.1299
0.5589	30.0	90	0.2757	0.1299
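
The table shows validation loss falling at every epoch while BLEU levels off near 0.13 over the last four epochs. The following sketch, using only the standard library, checks both observations against the logged values. The rows are copied from the table; `best_by_bleu` and `loss_strictly_decreasing` are hypothetical helper names.

```python
# (epoch, validation_loss, bleu) rows copied from the training results table.
RESULTS = [
    (1, 2.0694, 0.0089), (2, 1.8619, 0.0161), (3, 1.6705, 0.0171),
    (4, 1.5098, 0.0110), (5, 1.3533, 0.0157), (6, 1.2238, 0.0269),
    (7, 1.1186, 0.0307), (8, 1.0301, 0.0299), (9, 0.9449, 0.0399),
    (10, 0.8609, 0.0441), (11, 0.7875, 0.0409), (12, 0.7099, 0.0473),
    (13, 0.6601, 0.0613), (14, 0.6165, 0.0661), (15, 0.5831, 0.0654),
    (16, 0.5449, 0.0765), (17, 0.5032, 0.0657), (18, 0.4703, 0.0599),
    (19, 0.4393, 0.0768), (20, 0.4084, 0.0894), (21, 0.3784, 0.0832),
    (22, 0.3560, 0.0947), (23, 0.3385, 0.0965), (24, 0.3231, 0.1130),
    (25, 0.3099, 0.1199), (26, 0.2982, 0.1248), (27, 0.2882, 0.1305),
    (28, 0.2816, 0.1305), (29, 0.2775, 0.1299), (30, 0.2757, 0.1299),
]


def best_by_bleu(rows):
    """Return the earliest row with the highest BLEU (max keeps the first tie)."""
    return max(rows, key=lambda row: row[2])


def loss_strictly_decreasing(rows):
    """True if validation loss drops at every consecutive epoch."""
    losses = [row[1] for row in rows]
    return all(prev > cur for prev, cur in zip(losses, losses[1:]))


if __name__ == "__main__":
    epoch, loss, bleu = best_by_bleu(RESULTS)
    print(f"best BLEU {bleu:.4f} first reached at epoch {epoch} (val loss {loss:.4f})")
    print("validation loss strictly decreasing:", loss_strictly_decreasing(RESULTS))
```

The highest BLEU (0.1305) is logged at epochs 27 and 28, slightly above the final-epoch value of 0.1299. The card does not state whether the published weights come from the final epoch or from the best-BLEU checkpoint.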