metadata

language:
  - vi
license: apache-2.0
base_model: openai/whisper-base
tags:
  - whisper-event
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_16_0
metrics:
  - wer
model-index:
  - name: Whisper Base Vietnamese
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: mozilla-foundation/common_voice_16_0 vi
          type: mozilla-foundation/common_voice_16_0
          config: vi
          split: test
          args: vi
        metrics:
          - name: Wer
            type: wer
            value: 37.80239886155723

Whisper Base Vietnamese

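This model is a fine-tuned version of openai/whisper-base on the Vietnamese (vi) configuration of the mozilla-foundation/common_voice_16_0 dataset. It achieves the following results on the evaluation set (the test split), which match the step-1500 checkpoint in the training results table:

Loss: 0.7770
WER: 37.8024 (percent)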
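
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate can be computed for a single reference/hypothesis pair: the word-level edit distance divided by the number of reference words, expressed in percent. The function name `word_error_rate` and the example sentences are illustrative assumptions, not part of this model's evaluation code. The text normalization used for the reported score is not stated in this card, and corpus-level scores are usually obtained by summing errors over all utterances before dividing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one utterance (hypothetical helper).

    WER = (substitutions + deletions + insertions) / reference word count.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Illustrative sentences (assumed, not taken from the dataset):
# one substituted word out of four reference words.
print(word_error_rate("xin chào các bạn", "xin chao các bạn"))
```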

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-07
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 10000
mixed_precision_training: Native AMP
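
As a rough illustration of how these values fit together, the sketch below derives the effective batch size and the learning rate at a given step under a linear schedule with warmup. It assumes a single training device (consistent with 32 × 2 = 64) and the usual shape of a linear warmup/decay schedule; the function name `linear_schedule_lr` is hypothetical and this is not the training script itself.

```python
PEAK_LR = 5e-07       # learning_rate
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 10000   # training_steps

train_batch_size = 32
gradient_accumulation_steps = 2
# Assumption: one device, so no extra multiplier for device count.
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup to PEAK_LR, then linear decay to 0."""
    if step < WARMUP_STEPS:
        factor = step / WARMUP_STEPS
    else:
        factor = max(0, TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * factor


print(total_train_batch_size)
for step in (0, 250, 500, 5250, 10000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at step 500 and falls to zero at step 10000.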

Training results

Training Loss	Epoch	Step	Validation Loss	WER (%)
0.6043	33.0	500	0.9039	42.6408
0.2836	66.0	1000	0.7761	38.3106
0.1593	99.0	1500	0.7770	37.8024
0.0835	133.0	2000	0.8019	37.8634
0.0395	166.0	2500	0.8317	38.1582
0.0217	199.0	3000	0.8563	38.2395
0.0146	233.0	3500	0.8744	38.2801
0.0107	266.0	4000	0.8893	38.4733
0.0082	299.0	4500	0.9031	38.3310
0.0065	333.0	5000	0.9155	38.4326
0.0053	366.0	5500	0.9267	38.6156
0.0044	399.0	6000	0.9381	38.7579
0.0037	433.0	6500	0.9486	38.7782
0.0032	466.0	7000	0.9580	39.0120
0.0028	499.0	7500	0.9669	39.1441
0.0025	533.0	8000	0.9747	39.1746
0.0022	566.0	8500	0.9810	39.2864
0.0021	599.0	9000	0.9866	39.2763
0.0020	633.0	9500	0.9899	39.3271
0.0019	666.0	10000	0.9911	39.3271
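
The table can be scanned programmatically to find the best checkpoint. The sketch below copies the step, validation loss, and WER columns from the rows above and picks the minimum of each; the variable names are illustrative. It suggests that validation loss is lowest at step 1000 and WER is lowest at step 1500, after which both rise while training loss keeps falling, a pattern that is consistent with overfitting.

```python
# (step, validation_loss, wer) copied from the training results table above.
results = [
    (500, 0.9039, 42.6408),
    (1000, 0.7761, 38.3106),
    (1500, 0.7770, 37.8024),
    (2000, 0.8019, 37.8634),
    (2500, 0.8317, 38.1582),
    (3000, 0.8563, 38.2395),
    (3500, 0.8744, 38.2801),
    (4000, 0.8893, 38.4733),
    (4500, 0.9031, 38.3310),
    (5000, 0.9155, 38.4326),
    (5500, 0.9267, 38.6156),
    (6000, 0.9381, 38.7579),
    (6500, 0.9486, 38.7782),
    (7000, 0.9580, 39.0120),
    (7500, 0.9669, 39.1441),
    (8000, 0.9747, 39.1746),
    (8500, 0.9810, 39.2864),
    (9000, 0.9866, 39.2763),
    (9500, 0.9899, 39.3271),
    (10000, 0.9911, 39.3271),
]

# Checkpoint with the lowest word error rate.
best_by_wer = min(results, key=lambda row: row[2])
# Checkpoint with the lowest validation loss.
best_by_loss = min(results, key=lambda row: row[1])
# Last checkpoint, for comparison with the best one.
final = results[-1]

print("best WER:", best_by_wer)
print("best validation loss:", best_by_loss)
print("final checkpoint:", final)
```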

Framework versions

Transformers 4.37.0.dev0
PyTorch 2.1.2+cu121
Datasets 2.16.2.dev0
Tokenizers 0.15.0