metadata

license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: t5-small-entailement-Writer-T5-small
    results: []

t5-small-entailement-Writer-T5-small

This model is a fine-tuned version of t5-small on an unspecified dataset (no dataset name was recorded). It achieves the following results on the evaluation set:

Loss: 0.5628

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
mixed_precision_training: Native AMP
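
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup value is listed above, so this is an assumption) and takes the total step count from the training results table (83 steps per epoch over 100 epochs). The function name `linear_lr` is hypothetical and not part of any library.

```python
BASE_LR = 2e-05        # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 83   # step count logged at epoch 1.0 in the training results
NUM_EPOCHS = 100       # num_epochs from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 8300 optimizer steps


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps defaults to 0, which is an assumption: the card lists no warmup.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Decay linearly from BASE_LR to 0 over the remaining steps.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few epoch boundaries.
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate starts at 2e-05, is roughly half that at epoch 50, and reaches zero at step 8300.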

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	83	1.2943
No log	2.0	166	0.9323
No log	3.0	249	0.8443
No log	4.0	332	0.7884
No log	5.0	415	0.7582
No log	6.0	498	0.7355
1.2761	7.0	581	0.7178
1.2761	8.0	664	0.7105
1.2761	9.0	747	0.6972
1.2761	10.0	830	0.6847
1.2761	11.0	913	0.6774
1.2761	12.0	996	0.6708
0.7765	13.0	1079	0.6609
0.7765	14.0	1162	0.6566
0.7765	15.0	1245	0.6507
0.7765	16.0	1328	0.6454
0.7765	17.0	1411	0.6438
0.7765	18.0	1494	0.6384
0.693	19.0	1577	0.6347
0.693	20.0	1660	0.6321
0.693	21.0	1743	0.6254
0.693	22.0	1826	0.6237
0.693	23.0	1909	0.6215
0.693	24.0	1992	0.6167
0.6504	25.0	2075	0.6167
0.6504	26.0	2158	0.6131
0.6504	27.0	2241	0.6120
0.6504	28.0	2324	0.6091
0.6504	29.0	2407	0.6076
0.6504	30.0	2490	0.6058
0.615	31.0	2573	0.6031
0.615	32.0	2656	0.6015
0.615	33.0	2739	0.6015
0.615	34.0	2822	0.6000
0.615	35.0	2905	0.5998
0.615	36.0	2988	0.5969
0.586	37.0	3071	0.5959
0.586	38.0	3154	0.5941
0.586	39.0	3237	0.5923
0.586	40.0	3320	0.5936
0.586	41.0	3403	0.5929
0.586	42.0	3486	0.5922
0.5618	43.0	3569	0.5910
0.5618	44.0	3652	0.5885
0.5618	45.0	3735	0.5879
0.5618	46.0	3818	0.5873
0.5618	47.0	3901	0.5877
0.5618	48.0	3984	0.5878
0.5418	49.0	4067	0.5881
0.5418	50.0	4150	0.5858
0.5418	51.0	4233	0.5847
0.5418	52.0	4316	0.5839
0.5418	53.0	4399	0.5843
0.5418	54.0	4482	0.5826
0.5283	55.0	4565	0.5843
0.5283	56.0	4648	0.5833
0.5283	57.0	4731	0.5825
0.5283	58.0	4814	0.5827
0.5283	59.0	4897	0.5830
0.5283	60.0	4980	0.5806
0.5135	61.0	5063	0.5808
0.5135	62.0	5146	0.5806
0.5135	63.0	5229	0.5807
0.5135	64.0	5312	0.5823
0.5135	65.0	5395	0.5801
0.5135	66.0	5478	0.5799
0.5053	67.0	5561	0.5808
0.5053	68.0	5644	0.5796
0.5053	69.0	5727	0.5793
0.5053	70.0	5810	0.5785
0.5053	71.0	5893	0.5790
0.5053	72.0	5976	0.5775
0.4985	73.0	6059	0.5770
0.4985	74.0	6142	0.5777
0.4985	75.0	6225	0.5780
0.4985	76.0	6308	0.5779
0.4985	77.0	6391	0.5782
0.4985	78.0	6474	0.5773
0.4889	79.0	6557	0.5787
0.4889	80.0	6640	0.5787
0.4889	81.0	6723	0.5773
0.4889	82.0	6806	0.5777
0.4889	83.0	6889	0.5759
0.4889	84.0	6972	0.5765
0.4806	85.0	7055	0.5758
0.4806	86.0	7138	0.5760
0.4806	87.0	7221	0.5758
0.4806	88.0	7304	0.5760
0.4806	89.0	7387	0.5759
0.4806	90.0	7470	0.5758
0.4817	91.0	7553	0.5753
0.4817	92.0	7636	0.5757
0.4817	93.0	7719	0.5754
0.4817	94.0	7802	0.5750
0.4817	95.0	7885	0.5753
0.4817	96.0	7968	0.5752
0.4767	97.0	8051	0.5754
0.4767	98.0	8134	0.5756
0.4767	99.0	8217	0.5755
0.4767	100.0	8300	0.5755
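
The table can be read programmatically to pick the checkpoint with the lowest validation loss. The sketch below is a minimal example that copies only the last ten rows (epochs 91 to 100); every earlier row in the table has a validation loss of 0.5758 or higher. It also estimates a range for the training set size from the step count, assuming a single device, no gradient accumulation, and a final partial batch that is kept. None of these settings are stated in the card, and the helper names are hypothetical.

```python
import math

# (epoch, step, validation_loss) for the last ten rows of the table above.
ROWS = [
    (91.0, 7553, 0.5753),
    (92.0, 7636, 0.5757),
    (93.0, 7719, 0.5754),
    (94.0, 7802, 0.5750),
    (95.0, 7885, 0.5753),
    (96.0, 7968, 0.5752),
    (97.0, 8051, 0.5754),
    (98.0, 8134, 0.5756),
    (99.0, 8217, 0.5755),
    (100.0, 8300, 0.5755),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def train_set_size_bounds(steps_per_epoch: int, batch_size: int):
    """Smallest and largest example counts that give `steps_per_epoch` batches.

    Assumes one device, no gradient accumulation, and that the last
    (possibly partial) batch is not dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


if __name__ == "__main__":
    epoch, step, loss = best_row(ROWS)
    print(f"lowest validation loss {loss:.4f} at epoch {epoch:.0f} (step {step})")

    steps_per_epoch = 8300 // 100  # final step divided by num_epochs
    low, high = train_set_size_bounds(steps_per_epoch, batch_size=16)
    assert math.ceil(low / 16) == math.ceil(high / 16) == steps_per_epoch
    print(f"training set size between {low} and {high} examples")
```

The validation loss changes by less than 0.001 over these ten epochs, so the choice of checkpoint within this range matters little.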

Framework versions

Transformers 4.24.0
PyTorch 1.12.1+cu113
Datasets 2.7.1
Tokenizers 0.13.2