metadata
base_model: openai/whisper-large-v3
library_name: transformers
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: whisper-large-v3-genbed-f
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: BembaSpeech
          type: BembaSpeech
          config: en
          split: test
        metrics:
          - type: wer
            value: 21.76
            name: WER

whisper-large-v3-genbed-f

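This model is a fine-tuned version of openai/whisper-large-v3. The auto-generated card does not record the training dataset; the metadata names BembaSpeech (test split) as the dataset for its reported result. At the final training step (2500), the model achieves the following results on the evaluation set:

  • Loss: 0.4613
  • WER: 28.2294

These evaluation-set figures differ from the WER of 21.76 that the metadata reports for the BembaSpeech test split.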
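The WER values in this card appear to be percentages. The card does not say which implementation produced them, so the following is only a minimal, self-contained sketch of how word error rate is conventionally defined: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the placeholder strings are illustrative and not taken from the training script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Placeholder strings: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Corpus-level tools usually sum the errors over all utterances before dividing by the total reference length, so a per-utterance average of this function would not necessarily match a reported corpus figure.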

Model description

More information needed

Intended uses & limitations

More information needed
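Until this section is filled in, the following is a hedged sketch of how a Whisper checkpoint like this one is typically loaded for transcription with the Transformers `pipeline` API. The repository id `csikasote/whisper-large-v3-genbed-f` is an assumption pieced together from the model name and the uploader shown on the model page; the helper name `transcribe` is hypothetical.

```python
# Assumed repository id; the card itself does not state it.
MODEL_ID = "csikasote/whisper-large-v3-genbed-f"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint (sketch only)."""
    # Imported lazily so that defining the helper does not require a GPU stack.
    import torch
    from transformers import pipeline

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        # Half precision on GPU to reduce memory; full precision on CPU.
        torch_dtype=torch.float16 if device != "cpu" else torch.float32,
        device=device,
    )
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")` downloads the weights on first use; `sample.wav` is a placeholder path.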

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.75e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 2500
  • mixed_precision_training: Native AMP
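To make the listed values concrete, here is a small sketch of the arithmetic behind them: how the total train batch size follows from the per-device batch size and gradient accumulation, and what a linear schedule with warmup yields at a given step. It assumes a single device (consistent with 4 × 2 = 8) and the usual warmup-then-linear-decay shape; the function names are illustrative and not taken from the training script.

```python
LEARNING_RATE = 1.75e-05
WARMUP_STEPS = 500
TRAINING_STEPS = 2500
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum * num_devices


def linear_lr(step):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to the peak learning rate over the warmup steps.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Decay linearly from the peak at the end of warmup to 0 at the last step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


print(effective_batch_size())
for step in (250, 500, 1500, 2500):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate of 1.75e-05 is reached at step 500, and the rate is back to half of the peak by step 1500.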

Training results

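| Training Loss | Epoch  | Step | Validation Loss | WER     |
|---------------|--------|------|-----------------|---------|
| 0.4575        | 0.6605 | 250  | 0.5118          | 48.6061 |
| 0.3575        | 1.3210 | 500  | 0.4580          | 41.5408 |
| 0.3229        | 1.9815 | 750  | 0.3920          | 34.9542 |
| 0.1937        | 2.6420 | 1000 | 0.4103          | 33.1986 |
| 0.0955        | 3.3025 | 1250 | 0.4218          | 32.8368 |
| 0.0943        | 3.9630 | 1500 | 0.4120          | 31.6982 |
| 0.0346        | 4.6235 | 1750 | 0.4397          | 30.2724 |
| 0.0123        | 5.2840 | 2000 | 0.4604          | 28.8891 |
| 0.0132        | 5.9445 | 2250 | 0.4485          | 29.1658 |
| 0.0025        | 6.6050 | 2500 | 0.4613          | 28.2294 |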
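The logged results can be inspected with a few lines of Python. The sketch below copies the rows as logged and finds the checkpoint with the lowest WER and the one with the lowest validation loss. It also gives a rough estimate of the training-set size from the epoch counter, assuming every optimizer step consumed a full batch of 8 examples. The variable names are illustrative.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the logged results.
RESULTS = [
    (0.4575, 0.6605, 250, 0.5118, 48.6061),
    (0.3575, 1.3210, 500, 0.4580, 41.5408),
    (0.3229, 1.9815, 750, 0.3920, 34.9542),
    (0.1937, 2.6420, 1000, 0.4103, 33.1986),
    (0.0955, 3.3025, 1250, 0.4218, 32.8368),
    (0.0943, 3.9630, 1500, 0.4120, 31.6982),
    (0.0346, 4.6235, 1750, 0.4397, 30.2724),
    (0.0123, 5.2840, 2000, 0.4604, 28.8891),
    (0.0132, 5.9445, 2250, 0.4485, 29.1658),
    (0.0025, 6.6050, 2500, 0.4613, 28.2294),
]
TOTAL_TRAIN_BATCH_SIZE = 8

best_by_wer = min(RESULTS, key=lambda row: row[4])
best_by_val_loss = min(RESULTS, key=lambda row: row[3])

# Steps per epoch from the final row, then examples per epoch (approximate).
steps_per_epoch = RESULTS[-1][2] / RESULTS[-1][1]
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print("lowest WER at step", best_by_wer[2])
print("lowest validation loss at step", best_by_val_loss[2])
print("approximate training examples:", round(approx_train_examples))
```

The two criteria disagree: validation loss is lowest at step 750, while WER keeps improving through step 2500 as training loss falls toward zero. Which checkpoint counts as best therefore depends on the selection metric.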

Framework versions

  • Transformers 4.45.0.dev0
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
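When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The sketch below uses only the standard library; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are an assumption about how the packages were installed. The helper name `compare_versions` is illustrative.

```python
from importlib import metadata

# Versions as listed in this card.
CARD_VERSIONS = {
    "transformers": "4.45.0.dev0",
    "torch": "2.4.0+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=CARD_VERSIONS):
    """Map each package to (card version, installed version or None, exact match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    print(f"{package}: card={wanted} installed={installed} match={matches}")
```

The `.dev0` suffix on the Transformers version indicates a development build, so an exact match is unlikely with a released package; a nearby release may be the practical substitute.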