metadata
license: mit
base_model: facebook/xlm-v-base
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
model-index:
  - name: scenario-TCR-XLMV-XCOPA-4_data-xcopa_all
    results: []

scenario-TCR-XLMV-XCOPA-4_data-xcopa_all

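This model is a fine-tuned version of facebook/xlm-v-base. The training dataset was not recorded when the card was generated (it is listed as None); the model name suggests XCOPA, but the card does not confirm this. The model achieves the following results on the evaluation set: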

  • Loss: 0.6931
  • Accuracy: 0.5008
  • F1: 0.4628
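It helps to read the reported loss against a chance-level baseline. The following is a minimal sketch, assuming the task is scored as a two-way choice with cross-entropy loss; the card does not state the loss function, and the two-way assumption is drawn from XCOPA appearing in the model name. Under that assumption, a classifier that assigns probability 0.5 to each option has a loss of ln 2, which is the value the card reports. The helper name `uniform_cross_entropy` is hypothetical.

```python
import math


def uniform_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (in nats) of a uniform prediction over num_classes options."""
    return -math.log(1.0 / num_classes)


# Assumption: two answer options per example, as in XCOPA.
chance_loss = uniform_cross_entropy(2)
reported_loss = 0.6931  # evaluation loss reported in this card

print(f"chance-level loss: {chance_loss:.4f}")
print(f"gap to reported:   {abs(chance_loss - reported_loss):.4f}")
```

If the assumption holds, the reported loss and the accuracy of 0.5008 are both consistent with chance-level predictions on a two-way task.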

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 4824
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 500
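To make the linear scheduler concrete, here is a rough sketch of how the learning rate would decay under these settings. It assumes no warmup and decay to zero over the full planned run, neither of which the card states, and it takes 13 optimizer steps per epoch from the training results log in this card (step 65 is logged at epoch 5.0). The function `linear_lr` is a hypothetical helper written for illustration, not the Trainer's own code.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05) -> float:
    """Learning rate after `step` updates, decaying linearly to zero.

    Assumption: no warmup phase (the card does not list warmup steps).
    """
    remaining = max(0.0, (total_steps - step) / total_steps)
    return base_lr * remaining


STEPS_PER_EPOCH = 13  # inferred from the results log: step 65 is epoch 5.0
NUM_EPOCHS = 500      # num_epochs from the hyperparameter list
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

# Step 250 is the last step that appears in the results log.
for step in (0, 250, total_steps):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the learning rate at the last logged step is still close to its initial value, because the schedule is planned over 500 epochs while the log ends near epoch 19.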

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| No log | 0.38 | 5 | 0.6932 | 0.4642 | 0.4462 |
| No log | 0.77 | 10 | 0.6932 | 0.485 | 0.4560 |
| No log | 1.15 | 15 | 0.6932 | 0.4992 | 0.4551 |
| No log | 1.54 | 20 | 0.6932 | 0.4983 | 0.4467 |
| No log | 1.92 | 25 | 0.6932 | 0.4958 | 0.4444 |
| No log | 2.31 | 30 | 0.6932 | 0.5075 | 0.4661 |
| No log | 2.69 | 35 | 0.6931 | 0.5042 | 0.4625 |
| No log | 3.08 | 40 | 0.6931 | 0.4967 | 0.4597 |
| No log | 3.46 | 45 | 0.6931 | 0.4908 | 0.4636 |
| No log | 3.85 | 50 | 0.6932 | 0.5008 | 0.4647 |
| No log | 4.23 | 55 | 0.6931 | 0.5067 | 0.4752 |
| No log | 4.62 | 60 | 0.6931 | 0.5033 | 0.4688 |
| No log | 5.0 | 65 | 0.6932 | 0.4875 | 0.4524 |
| No log | 5.38 | 70 | 0.6932 | 0.4517 | 0.4156 |
| No log | 5.77 | 75 | 0.6932 | 0.4667 | 0.4286 |
| No log | 6.15 | 80 | 0.6932 | 0.4683 | 0.4334 |
| No log | 6.54 | 85 | 0.6932 | 0.47 | 0.4382 |
| No log | 6.92 | 90 | 0.6932 | 0.4692 | 0.4437 |
| No log | 7.31 | 95 | 0.6931 | 0.4967 | 0.4766 |
| No log | 7.69 | 100 | 0.6931 | 0.53 | 0.5138 |
| No log | 8.08 | 105 | 0.6931 | 0.4858 | 0.4686 |
| No log | 8.46 | 110 | 0.6932 | 0.4767 | 0.4452 |
| No log | 8.85 | 115 | 0.6931 | 0.4617 | 0.4353 |
| No log | 9.23 | 120 | 0.6931 | 0.4683 | 0.4433 |
| No log | 9.62 | 125 | 0.6931 | 0.4717 | 0.4429 |
| No log | 10.0 | 130 | 0.6931 | 0.4858 | 0.4630 |
| No log | 10.38 | 135 | 0.6931 | 0.4983 | 0.4872 |
| No log | 10.77 | 140 | 0.6931 | 0.4958 | 0.4771 |
| No log | 11.15 | 145 | 0.6931 | 0.5108 | 0.4846 |
| No log | 11.54 | 150 | 0.6931 | 0.5075 | 0.4784 |
| No log | 11.92 | 155 | 0.6931 | 0.5267 | 0.4883 |
| No log | 12.31 | 160 | 0.6931 | 0.5142 | 0.4809 |
| No log | 12.69 | 165 | 0.6931 | 0.5108 | 0.4873 |
| No log | 13.08 | 170 | 0.6931 | 0.5075 | 0.4829 |
| No log | 13.46 | 175 | 0.6931 | 0.5042 | 0.4758 |
| No log | 13.85 | 180 | 0.6931 | 0.4825 | 0.4567 |
| No log | 14.23 | 185 | 0.6931 | 0.4625 | 0.4337 |
| No log | 14.62 | 190 | 0.6931 | 0.4783 | 0.4528 |
| No log | 15.0 | 195 | 0.6931 | 0.4767 | 0.4510 |
| No log | 15.38 | 200 | 0.6931 | 0.4675 | 0.4370 |
| No log | 15.77 | 205 | 0.6931 | 0.4675 | 0.4370 |
| No log | 16.15 | 210 | 0.6931 | 0.465 | 0.4368 |
| No log | 16.54 | 215 | 0.6931 | 0.4792 | 0.4522 |
| No log | 16.92 | 220 | 0.6931 | 0.4875 | 0.4591 |
| No log | 17.31 | 225 | 0.6931 | 0.4892 | 0.4628 |
| No log | 17.69 | 230 | 0.6931 | 0.4933 | 0.4591 |
| No log | 18.08 | 235 | 0.6931 | 0.5325 | 0.4978 |
| No log | 18.46 | 240 | 0.6931 | 0.5192 | 0.4750 |
| No log | 18.85 | 245 | 0.6931 | 0.5142 | 0.4836 |
| No log | 19.23 | 250 | 0.6931 | 0.5008 | 0.4628 |
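As a reading aid for the log above, the sketch below compares a handful of rows copied from it: the final evaluation and the rows with the highest accuracy and F1. The tuple layout and variable names are illustrative choices, and only the excerpted rows are compared, not the full log.

```python
# (epoch, step, validation_loss, accuracy, f1), copied from the log above.
rows = [
    (7.69, 100, 0.6931, 0.53, 0.5138),
    (11.92, 155, 0.6931, 0.5267, 0.4883),
    (18.08, 235, 0.6931, 0.5325, 0.4978),
    (18.46, 240, 0.6931, 0.5192, 0.4750),
    (19.23, 250, 0.6931, 0.5008, 0.4628),
]

best_acc = max(rows, key=lambda r: r[3])  # row with the highest accuracy
best_f1 = max(rows, key=lambda r: r[4])   # row with the highest F1
final = rows[-1]                          # last logged evaluation

print("best accuracy:", best_acc[3], "at step", best_acc[1])
print("best F1:      ", best_f1[4], "at step", best_f1[1])
print("final step:   ", final[1], "accuracy", final[3], "F1", final[4])
```

The final row (step 250) carries the same loss, accuracy, and F1 as the headline results at the top of the card, while the validation loss stays at 0.6931 or 0.6932 throughout.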

Framework versions

  • Transformers 4.33.3
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3