mt5-semantic

This model is a fine-tuned version of google/mt5-base; the fine-tuning dataset is not specified. At the final epoch (90), it achieves the following results on the evaluation set:

  • Loss: 0.0000
  • Rouge1: 1.0
  • Rouge2: 1.0
  • Rougel: 1.0
  • Rougelsum: 1.0
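
For readers unfamiliar with the ROUGE-N scores reported above, the following is a minimal sketch of how an n-gram overlap F1 can be computed. It assumes whitespace tokenization and no stemming, so it is a simplification and not necessarily the metric implementation used to produce this card's numbers (the card does not name one). The function names `ngrams` and `rouge_n_f1` are illustrative. ROUGE-L, which is based on the longest common subsequence, is not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1: clipped n-gram overlap between two strings."""
    pred = ngrams(prediction.split(), n)
    ref = ngrams(reference.split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # An exact match scores 1.0 for both unigrams and bigrams.
    print(rouge_n_f1("a b c", "a b c", n=1))
    print(rouge_n_f1("a b c", "a b c", n=2))
```

Under this definition, a score of 1.0 for both ROUGE-1 and ROUGE-2 means the generated text reproduces the reference n-grams exactly.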

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 90
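
To make the `linear` scheduler setting concrete, here is a small sketch of a linear learning-rate decay using the values listed above. It assumes zero warmup steps (the card lists none) and 90 total optimizer steps, which matches the one-step-per-epoch pattern in the training log. The function name `linear_lr` and its parameters are illustrative, not part of any library API.

```python
def linear_lr(step, base_lr=3e-4, total_steps=90, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay.

    Assumptions: warmup_steps=0 (not stated in the card) and
    total_steps=90 (one optimizer step per epoch for 90 epochs).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 45, 90):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0003, is halved by step 45, and reaches zero at step 90.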

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 1 | 7.1720 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 2.0 | 2 | 6.2006 | 0.1333 | 0.0 | 0.1333 | 0.1333 |
| No log | 3.0 | 3 | 5.7428 | 0.1333 | 0.0 | 0.1333 | 0.1333 |
| No log | 4.0 | 4 | 3.8803 | 0.1333 | 0.0 | 0.1333 | 0.1333 |
| No log | 5.0 | 5 | 5.0966 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 6.0 | 6 | 5.3543 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 7.0 | 7 | 5.1282 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 8.0 | 8 | 4.5391 | 0.2 | 0.0 | 0.2 | 0.2 |
| No log | 9.0 | 9 | 3.3062 | 0.25 | 0.0 | 0.25 | 0.25 |
| No log | 10.0 | 10 | 2.5865 | 0.25 | 0.0 | 0.25 | 0.25 |
| No log | 11.0 | 11 | 2.4655 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 12.0 | 12 | 2.0948 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 13.0 | 13 | 1.8355 | 0.25 | 0.1429 | 0.25 | 0.25 |
| No log | 14.0 | 14 | 1.5715 | 0.25 | 0.1429 | 0.25 | 0.25 |
| No log | 15.0 | 15 | 1.3096 | 0.3333 | 0.25 | 0.3333 | 0.3333 |
| No log | 16.0 | 16 | 1.1770 | 0.2105 | 0.1176 | 0.2105 | 0.2105 |
| No log | 17.0 | 17 | 1.0978 | 0.3158 | 0.2353 | 0.3158 | 0.3158 |
| No log | 18.0 | 18 | 1.2578 | 0.3158 | 0.2353 | 0.3158 | 0.3158 |
| No log | 19.0 | 19 | 1.0386 | 0.3158 | 0.2353 | 0.3158 | 0.3158 |
| No log | 20.0 | 20 | 0.7758 | 0.25 | 0.0 | 0.25 | 0.25 |
| No log | 21.0 | 21 | 0.6506 | 0.2222 | 0.125 | 0.2222 | 0.2222 |
| No log | 22.0 | 22 | 0.7023 | 0.2222 | 0.125 | 0.2222 | 0.2222 |
| No log | 23.0 | 23 | 0.6491 | 0.2222 | 0.125 | 0.2222 | 0.2222 |
| No log | 24.0 | 24 | 0.6289 | 0.2105 | 0.1176 | 0.2105 | 0.2105 |
| No log | 25.0 | 25 | 0.6138 | 0.4444 | 0.2857 | 0.4444 | 0.4444 |
| No log | 26.0 | 26 | 0.6190 | 0.4 | 0.25 | 0.4 | 0.4 |
| No log | 27.0 | 27 | 0.4749 | 0.2105 | 0.1176 | 0.2105 | 0.2105 |
| No log | 28.0 | 28 | 0.3589 | 0.25 | 0.0 | 0.25 | 0.25 |
| No log | 29.0 | 29 | 0.4597 | 0.3077 | 0.0 | 0.3077 | 0.3077 |
| No log | 30.0 | 30 | 0.3665 | 0.3077 | 0.0 | 0.3077 | 0.3077 |
| No log | 31.0 | 31 | 0.2642 | 0.3077 | 0.0 | 0.3077 | 0.3077 |
| No log | 32.0 | 32 | 0.1914 | 0.3077 | 0.0 | 0.3077 | 0.3077 |
| No log | 33.0 | 33 | 0.1114 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 34.0 | 34 | 0.0375 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 35.0 | 35 | 0.0187 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 36.0 | 36 | 0.0111 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 37.0 | 37 | 0.0046 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 38.0 | 38 | 0.0021 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 39.0 | 39 | 0.0012 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 40.0 | 40 | 0.0007 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 41.0 | 41 | 0.0004 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 42.0 | 42 | 0.0005 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 43.0 | 43 | 0.0007 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 44.0 | 44 | 0.0009 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 45.0 | 45 | 0.0016 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 46.0 | 46 | 0.0008 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 47.0 | 47 | 0.0002 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 48.0 | 48 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 49.0 | 49 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 50.0 | 50 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 51.0 | 51 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 52.0 | 52 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 53.0 | 53 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 54.0 | 54 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 55.0 | 55 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 56.0 | 56 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 57.0 | 57 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 58.0 | 58 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 59.0 | 59 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 60.0 | 60 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 61.0 | 61 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 62.0 | 62 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 63.0 | 63 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 64.0 | 64 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 65.0 | 65 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 66.0 | 66 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 67.0 | 67 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 68.0 | 68 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 69.0 | 69 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 70.0 | 70 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 71.0 | 71 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 72.0 | 72 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 73.0 | 73 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 74.0 | 74 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 75.0 | 75 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 76.0 | 76 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 77.0 | 77 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 78.0 | 78 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 79.0 | 79 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 80.0 | 80 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 81.0 | 81 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 82.0 | 82 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 83.0 | 83 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 84.0 | 84 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 85.0 | 85 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 86.0 | 86 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 87.0 | 87 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 88.0 | 88 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 89.0 | 89 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
| No log | 90.0 | 90 | 0.0000 | 1.0 | 1.0 | 1.0 | 1.0 |
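
The Step column in the log above advances by exactly one per epoch. The sketch below shows the arithmetic that relates optimizer steps per epoch to dataset size. It assumes a single device, no gradient accumulation, and a data loader that keeps the final partial batch; none of these are stated in the card, and the dataset size itself is not reported. The helper name `steps_per_epoch` is illustrative.

```python
import math


def steps_per_epoch(num_examples, batch_size=8, grad_accum=1, num_devices=1):
    """Optimizer steps needed for one pass over the training data.

    Assumptions (not stated in the card): grad_accum=1, num_devices=1,
    and the last partial batch is kept rather than dropped.
    """
    effective_batch = batch_size * grad_accum * num_devices
    return math.ceil(num_examples / effective_batch)


# Largest training-set size that still yields one step per epoch
# with train_batch_size=8 under the assumptions above.
max_examples = max(n for n in range(1, 1000) if steps_per_epoch(n) == 1)

if __name__ == "__main__":
    print(max_examples)
```

If those assumptions hold, one step per epoch is consistent with a training set of at most 8 examples, which is useful context when reading the perfect ROUGE scores from epoch 33 onward.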

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 582M params (Safetensors, tensor type F32)