---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: t5-small-entailement-Writer-T5-base
  results: []
---

# t5-small-entailement-Writer-T5-base

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.5697
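
Since the card does not document usage, here is a minimal loading-and-generation sketch using the Transformers API. The repository id is inferred from the author and model name, and the example input format is a guess; neither is confirmed by this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repo id (author + card name); adjust if the model lives elsewhere.
model_id = "Sandipan1994/t5-small-entailement-Writer-T5-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical prompt: the exact input format used during fine-tuning
# is not documented in this card.
text = "premise: A man is playing a guitar. hypothesis: A man is performing music."
inputs = tokenizer(text, return_tensors="pt")

# Greedy decoding; tune generation parameters for your task.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```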

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch in code follows the list):

- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 250
- mixed_precision_training: Native AMP
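
As a rough guide to reproducing this configuration, the sketch below maps the listed values onto `Seq2SeqTrainingArguments` from Transformers 4.25. The output directory and the per-epoch evaluation strategy are assumptions, not values stated in this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported configuration. The output_dir is an assumed
# path; fp16=True ("Native AMP") requires a CUDA device, matching the
# torch 1.13.0+cu116 build listed under "Framework versions".
training_args = Seq2SeqTrainingArguments(
    output_dir="./t5-small-entailement-Writer-T5-base",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=250,
    fp16=True,
    evaluation_strategy="epoch",  # assumption: the table logs one eval per epoch
)
```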

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 1.0   | 42    | 1.8185          |
| No log        | 2.0   | 84    | 1.1957          |
| No log        | 3.0   | 126   | 0.9771          |
| No log        | 4.0   | 168   | 0.8964          |
| No log        | 5.0   | 210   | 0.8380          |
| No log        | 6.0   | 252   | 0.8109          |
| No log        | 7.0   | 294   | 0.7886          |
| No log        | 8.0   | 336   | 0.7760          |
| No log        | 9.0   | 378   | 0.7577          |
| No log        | 10.0  | 420   | 0.7483          |
| No log        | 11.0  | 462   | 0.7364          |
| 1.2044        | 12.0  | 504   | 0.7267          |
| 1.2044        | 13.0  | 546   | 0.7205          |
| 1.2044        | 14.0  | 588   | 0.7102          |
| 1.2044        | 15.0  | 630   | 0.7048          |
| 1.2044        | 16.0  | 672   | 0.7015          |
| 1.2044        | 17.0  | 714   | 0.6958          |
| 1.2044        | 18.0  | 756   | 0.6892          |
| 1.2044        | 19.0  | 798   | 0.6877          |
| 1.2044        | 20.0  | 840   | 0.6825          |
| 1.2044        | 21.0  | 882   | 0.6790          |
| 1.2044        | 22.0  | 924   | 0.6732          |
| 1.2044        | 23.0  | 966   | 0.6676          |
| 0.736         | 24.0  | 1008  | 0.6640          |
| 0.736         | 25.0  | 1050  | 0.6631          |
| 0.736         | 26.0  | 1092  | 0.6617          |
| 0.736         | 27.0  | 1134  | 0.6556          |
| 0.736         | 28.0  | 1176  | 0.6551          |
| 0.736         | 29.0  | 1218  | 0.6545          |
| 0.736         | 30.0  | 1260  | 0.6483          |
| 0.736         | 31.0  | 1302  | 0.6493          |
| 0.736         | 32.0  | 1344  | 0.6488          |
| 0.736         | 33.0  | 1386  | 0.6434          |
| 0.736         | 34.0  | 1428  | 0.6427          |
| 0.736         | 35.0  | 1470  | 0.6403          |
| 0.6568        | 36.0  | 1512  | 0.6364          |
| 0.6568        | 37.0  | 1554  | 0.6342          |
| 0.6568        | 38.0  | 1596  | 0.6325          |
| 0.6568        | 39.0  | 1638  | 0.6300          |
| 0.6568        | 40.0  | 1680  | 0.6302          |
| 0.6568        | 41.0  | 1722  | 0.6292          |
| 0.6568        | 42.0  | 1764  | 0.6264          |
| 0.6568        | 43.0  | 1806  | 0.6272          |
| 0.6568        | 44.0  | 1848  | 0.6252          |
| 0.6568        | 45.0  | 1890  | 0.6229          |
| 0.6568        | 46.0  | 1932  | 0.6221          |
| 0.6568        | 47.0  | 1974  | 0.6202          |
| 0.602         | 48.0  | 2016  | 0.6193          |
| 0.602         | 49.0  | 2058  | 0.6196          |
| 0.602         | 50.0  | 2100  | 0.6174          |
| 0.602         | 51.0  | 2142  | 0.6175          |
| 0.602         | 52.0  | 2184  | 0.6162          |
| 0.602         | 53.0  | 2226  | 0.6155          |
| 0.602         | 54.0  | 2268  | 0.6129          |
| 0.602         | 55.0  | 2310  | 0.6139          |
| 0.602         | 56.0  | 2352  | 0.6124          |
| 0.602         | 57.0  | 2394  | 0.6128          |
| 0.602         | 58.0  | 2436  | 0.6109          |
| 0.602         | 59.0  | 2478  | 0.6111          |
| 0.5653        | 60.0  | 2520  | 0.6097          |
| 0.5653        | 61.0  | 2562  | 0.6086          |
| 0.5653        | 62.0  | 2604  | 0.6083          |
| 0.5653        | 63.0  | 2646  | 0.6086          |
| 0.5653        | 64.0  | 2688  | 0.6090          |
| 0.5653        | 65.0  | 2730  | 0.6074          |
| 0.5653        | 66.0  | 2772  | 0.6064          |
| 0.5653        | 67.0  | 2814  | 0.6056          |
| 0.5653        | 68.0  | 2856  | 0.6039          |
| 0.5653        | 69.0  | 2898  | 0.6051          |
| 0.5653        | 70.0  | 2940  | 0.6043          |
| 0.5653        | 71.0  | 2982  | 0.6034          |
| 0.5368        | 72.0  | 3024  | 0.6020          |
| 0.5368        | 73.0  | 3066  | 0.6047          |
| 0.5368        | 74.0  | 3108  | 0.6031          |
| 0.5368        | 75.0  | 3150  | 0.6011          |
| 0.5368        | 76.0  | 3192  | 0.6027          |
| 0.5368        | 77.0  | 3234  | 0.6009          |
| 0.5368        | 78.0  | 3276  | 0.6003          |
| 0.5368        | 79.0  | 3318  | 0.6001          |
| 0.5368        | 80.0  | 3360  | 0.6008          |
| 0.5368        | 81.0  | 3402  | 0.6005          |
| 0.5368        | 82.0  | 3444  | 0.6007          |
| 0.5368        | 83.0  | 3486  | 0.5988          |
| 0.5055        | 84.0  | 3528  | 0.5991          |
| 0.5055        | 85.0  | 3570  | 0.6004          |
| 0.5055        | 86.0  | 3612  | 0.5989          |
| 0.5055        | 87.0  | 3654  | 0.5975          |
| 0.5055        | 88.0  | 3696  | 0.5977          |
| 0.5055        | 89.0  | 3738  | 0.5982          |
| 0.5055        | 90.0  | 3780  | 0.5964          |
| 0.5055        | 91.0  | 3822  | 0.5979          |
| 0.5055        | 92.0  | 3864  | 0.5996          |
| 0.5055        | 93.0  | 3906  | 0.5936          |
| 0.5055        | 94.0  | 3948  | 0.5956          |
| 0.5055        | 95.0  | 3990  | 0.5940          |
| 0.4866        | 96.0  | 4032  | 0.5961          |
| 0.4866        | 97.0  | 4074  | 0.5955          |
| 0.4866        | 98.0  | 4116  | 0.5949          |
| 0.4866        | 99.0  | 4158  | 0.5971          |
| 0.4866        | 100.0 | 4200  | 0.5958          |
| 0.4866        | 101.0 | 4242  | 0.5978          |
| 0.4866        | 102.0 | 4284  | 0.5971          |
| 0.4866        | 103.0 | 4326  | 0.5954          |
| 0.4866        | 104.0 | 4368  | 0.5933          |
| 0.4866        | 105.0 | 4410  | 0.5944          |
| 0.4866        | 106.0 | 4452  | 0.5952          |
| 0.4866        | 107.0 | 4494  | 0.5948          |
| 0.4657        | 108.0 | 4536  | 0.5951          |
| 0.4657        | 109.0 | 4578  | 0.5948          |
| 0.4657        | 110.0 | 4620  | 0.5948          |
| 0.4657        | 111.0 | 4662  | 0.5927          |
| 0.4657        | 112.0 | 4704  | 0.5931          |
| 0.4657        | 113.0 | 4746  | 0.5919          |
| 0.4657        | 114.0 | 4788  | 0.5939          |
| 0.4657        | 115.0 | 4830  | 0.5922          |
| 0.4657        | 116.0 | 4872  | 0.5921          |
| 0.4657        | 117.0 | 4914  | 0.5917          |
| 0.4657        | 118.0 | 4956  | 0.5913          |
| 0.4657        | 119.0 | 4998  | 0.5908          |
| 0.4468        | 120.0 | 5040  | 0.5929          |
| 0.4468        | 121.0 | 5082  | 0.5915          |
| 0.4468        | 122.0 | 5124  | 0.5926          |
| 0.4468        | 123.0 | 5166  | 0.5929          |
| 0.4468        | 124.0 | 5208  | 0.5911          |
| 0.4468        | 125.0 | 5250  | 0.5907          |
| 0.4468        | 126.0 | 5292  | 0.5921          |
| 0.4468        | 127.0 | 5334  | 0.5917          |
| 0.4468        | 128.0 | 5376  | 0.5923          |
| 0.4468        | 129.0 | 5418  | 0.5912          |
| 0.4468        | 130.0 | 5460  | 0.5930          |
| 0.4346        | 131.0 | 5502  | 0.5924          |
| 0.4346        | 132.0 | 5544  | 0.5933          |
| 0.4346        | 133.0 | 5586  | 0.5920          |
| 0.4346        | 134.0 | 5628  | 0.5937          |
| 0.4346        | 135.0 | 5670  | 0.5930          |
| 0.4346        | 136.0 | 5712  | 0.5930          |
| 0.4346        | 137.0 | 5754  | 0.5929          |
| 0.4346        | 138.0 | 5796  | 0.5916          |
| 0.4346        | 139.0 | 5838  | 0.5935          |
| 0.4346        | 140.0 | 5880  | 0.5947          |
| 0.4346        | 141.0 | 5922  | 0.5926          |
| 0.4346        | 142.0 | 5964  | 0.5930          |
| 0.4247        | 143.0 | 6006  | 0.5911          |
| 0.4247        | 144.0 | 6048  | 0.5916          |
| 0.4247        | 145.0 | 6090  | 0.5929          |
| 0.4247        | 146.0 | 6132  | 0.5926          |
| 0.4247        | 147.0 | 6174  | 0.5917          |
| 0.4247        | 148.0 | 6216  | 0.5913          |
| 0.4247        | 149.0 | 6258  | 0.5907          |
| 0.4247        | 150.0 | 6300  | 0.5930          |
| 0.4247        | 151.0 | 6342  | 0.5928          |
| 0.4247        | 152.0 | 6384  | 0.5922          |
| 0.4247        | 153.0 | 6426  | 0.5921          |
| 0.4247        | 154.0 | 6468  | 0.5925          |
| 0.4139        | 155.0 | 6510  | 0.5923          |
| 0.4139        | 156.0 | 6552  | 0.5919          |
| 0.4139        | 157.0 | 6594  | 0.5920          |
| 0.4139        | 158.0 | 6636  | 0.5935          |
| 0.4139        | 159.0 | 6678  | 0.5926          |
| 0.4139        | 160.0 | 6720  | 0.5926          |
| 0.4139        | 161.0 | 6762  | 0.5925          |
| 0.4139        | 162.0 | 6804  | 0.5927          |
| 0.4139        | 163.0 | 6846  | 0.5918          |
| 0.4139        | 164.0 | 6888  | 0.5925          |
| 0.4139        | 165.0 | 6930  | 0.5935          |
| 0.4139        | 166.0 | 6972  | 0.5926          |
| 0.4049        | 167.0 | 7014  | 0.5919          |
| 0.4049        | 168.0 | 7056  | 0.5917          |
| 0.4049        | 169.0 | 7098  | 0.5916          |
| 0.4049        | 170.0 | 7140  | 0.5925          |
| 0.4049        | 171.0 | 7182  | 0.5931          |
| 0.4049        | 172.0 | 7224  | 0.5938          |
| 0.4049        | 173.0 | 7266  | 0.5932          |
| 0.4049        | 174.0 | 7308  | 0.5927          |
| 0.4049        | 175.0 | 7350  | 0.5934          |
| 0.4049        | 176.0 | 7392  | 0.5931          |
| 0.4049        | 177.0 | 7434  | 0.5937          |
| 0.4049        | 178.0 | 7476  | 0.5939          |
| 0.397         | 179.0 | 7518  | 0.5939          |
| 0.397         | 180.0 | 7560  | 0.5932          |
| 0.397         | 181.0 | 7602  | 0.5935          |
| 0.397         | 182.0 | 7644  | 0.5939          |
| 0.397         | 183.0 | 7686  | 0.5935          |
| 0.397         | 184.0 | 7728  | 0.5945          |
| 0.397         | 185.0 | 7770  | 0.5932          |
| 0.397         | 186.0 | 7812  | 0.5931          |
| 0.397         | 187.0 | 7854  | 0.5925          |
| 0.397         | 188.0 | 7896  | 0.5934          |
| 0.397         | 189.0 | 7938  | 0.5941          |
| 0.397         | 190.0 | 7980  | 0.5939          |
| 0.3891        | 191.0 | 8022  | 0.5933          |
| 0.3891        | 192.0 | 8064  | 0.5934          |
| 0.3891        | 193.0 | 8106  | 0.5938          |
| 0.3891        | 194.0 | 8148  | 0.5944          |
| 0.3891        | 195.0 | 8190  | 0.5937          |
| 0.3891        | 196.0 | 8232  | 0.5939          |
| 0.3891        | 197.0 | 8274  | 0.5937          |
| 0.3891        | 198.0 | 8316  | 0.5947          |
| 0.3891        | 199.0 | 8358  | 0.5945          |
| 0.3891        | 200.0 | 8400  | 0.5946          |
| 0.3891        | 201.0 | 8442  | 0.5945          |
| 0.3891        | 202.0 | 8484  | 0.5938          |
| 0.3842        | 203.0 | 8526  | 0.5947          |
| 0.3842        | 204.0 | 8568  | 0.5945          |
| 0.3842        | 205.0 | 8610  | 0.5935          |
| 0.3842        | 206.0 | 8652  | 0.5935          |
| 0.3842        | 207.0 | 8694  | 0.5939          |
| 0.3842        | 208.0 | 8736  | 0.5938          |
| 0.3842        | 209.0 | 8778  | 0.5939          |
| 0.3842        | 210.0 | 8820  | 0.5940          |
| 0.3842        | 211.0 | 8862  | 0.5943          |
| 0.3842        | 212.0 | 8904  | 0.5943          |
| 0.3842        | 213.0 | 8946  | 0.5946          |
| 0.3842        | 214.0 | 8988  | 0.5946          |
| 0.3802        | 215.0 | 9030  | 0.5947          |
| 0.3802        | 216.0 | 9072  | 0.5949          |
| 0.3802        | 217.0 | 9114  | 0.5944          |
| 0.3802        | 218.0 | 9156  | 0.5946          |
| 0.3802        | 219.0 | 9198  | 0.5950          |
| 0.3802        | 220.0 | 9240  | 0.5950          |
| 0.3802        | 221.0 | 9282  | 0.5953          |
| 0.3802        | 222.0 | 9324  | 0.5951          |
| 0.3802        | 223.0 | 9366  | 0.5956          |
| 0.3802        | 224.0 | 9408  | 0.5952          |
| 0.3802        | 225.0 | 9450  | 0.5955          |
| 0.3802        | 226.0 | 9492  | 0.5958          |
| 0.3791        | 227.0 | 9534  | 0.5954          |
| 0.3791        | 228.0 | 9576  | 0.5953          |
| 0.3791        | 229.0 | 9618  | 0.5959          |
| 0.3791        | 230.0 | 9660  | 0.5959          |
| 0.3791        | 231.0 | 9702  | 0.5957          |
| 0.3791        | 232.0 | 9744  | 0.5957          |
| 0.3791        | 233.0 | 9786  | 0.5956          |
| 0.3791        | 234.0 | 9828  | 0.5956          |
| 0.3791        | 235.0 | 9870  | 0.5956          |
| 0.3791        | 236.0 | 9912  | 0.5956          |
| 0.3791        | 237.0 | 9954  | 0.5957          |
| 0.3791        | 238.0 | 9996  | 0.5960          |
| 0.3764        | 239.0 | 10038 | 0.5956          |
| 0.3764        | 240.0 | 10080 | 0.5956          |
| 0.3764        | 241.0 | 10122 | 0.5955          |
| 0.3764        | 242.0 | 10164 | 0.5956          |
| 0.3764        | 243.0 | 10206 | 0.5955          |
| 0.3764        | 244.0 | 10248 | 0.5957          |
| 0.3764        | 245.0 | 10290 | 0.5956          |
| 0.3764        | 246.0 | 10332 | 0.5955          |
| 0.3764        | 247.0 | 10374 | 0.5954          |
| 0.3764        | 248.0 | 10416 | 0.5955          |
| 0.3764        | 249.0 | 10458 | 0.5954          |
| 0.3763        | 250.0 | 10500 | 0.5954          |

### Framework versions

- Transformers 4.25.1
- PyTorch 1.13.0+cu116
- Datasets 2.7.1
- Tokenizers 0.13.2