shaojieli's picture
Update README.md
91ee85c
metadata
license: apache-2.0
datasets:
  - mozilla-foundation/common_voice_12_0
language:
  - fr
metrics:
  - wer
pipeline_tag: automatic-speech-recognition

training on full commonvoice

The WERs are:

decoding method chunk size test comment decoding mode
greedy search 640ms 10.90 --epoch 30 --avg 9 simulated streaming
modified beam search 640ms 10.55 --epoch 30 --avg 9 simulated streaming
fast beam search 640ms 10.75 --epoch 30 --avg 9 simulated streaming

training on full librispeech then finetune on full commonvoice

The WERs are:

decoding method chunk size test comment decoding mode
greedy search 640ms 10.57 --epoch 29 --avg 9 simulated streaming
modified beam search 640ms 10.19 --epoch 29 --avg 9 simulated streaming
fast beam search 640ms 10.25 --epoch 29 --avg 9 simulated streaming

training on full librispeech and gigaspeech then finetune on full commonvoice

The WERs are:

decoding method chunk size test comment decoding mode
greedy search 640ms 9.95 --epoch 30 --avg 9 simulated streaming
modified beam search 640ms 9.57 --epoch 30 --avg 9 simulated streaming
fast beam search 640ms 9.67 --epoch 30 --avg 9 simulated streaming