---
license: apache-2.0
datasets:
- mozilla-foundation/common_voice_12_0
language:
- fr
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---

Trained on the full Common Voice dataset.

The WERs are:

| decoding method      | chunk size | test WER |        comment       |     decoding mode    |
| -------------------- | ---------- | ------ | -------------------- | -------------------- |
| greedy search        | 640ms      | 10.90  | --epoch 30 --avg 9   | simulated streaming  |
| modified beam search | 640ms      | 10.55  | --epoch 30 --avg 9   | simulated streaming  |
| fast beam search     | 640ms      | 10.75  | --epoch 30 --avg 9   | simulated streaming  |
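
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the scoring script used to produce these tables; the function name `word_error_rate` and the example sentences are hypothetical, and it assumes the scores in the tables are percentages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent between two whitespace-tokenized transcripts.

    Illustrative helper only (hypothetical name); real scoring pipelines
    usually also normalize case and punctuation before comparing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substituted word out of three reference words.
example_wer = word_error_rate("le chat dort", "le chien dort")
print(f"{example_wer:.2f}")
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.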

Trained on the full LibriSpeech dataset, then fine-tuned on the full Common Voice dataset.

The WERs are:

| decoding method      | chunk size | test WER |        comment       |     decoding mode    |
| -------------------- | ---------- | ------ | -------------------- | -------------------- |
| greedy search        | 640ms      | 10.57  | --epoch 29 --avg 9   | simulated streaming  |
| modified beam search | 640ms      | 10.19  | --epoch 29 --avg 9   | simulated streaming  |
| fast beam search     | 640ms      | 10.25  | --epoch 29 --avg 9   | simulated streaming  |

Trained on the full LibriSpeech and GigaSpeech datasets, then fine-tuned on the full Common Voice dataset.

The WERs are:

| decoding method      | chunk size | test WER |        comment       |     decoding mode    |
| -------------------- | ---------- | ------ | -------------------- | -------------------- |
| greedy search        | 640ms      | 9.95   | --epoch 30 --avg 9   | simulated streaming  |
| modified beam search | 640ms      | 9.57   | --epoch 30 --avg 9   | simulated streaming  |
| fast beam search     | 640ms      | 9.67   | --epoch 30 --avg 9   | simulated streaming  |
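
To compare the three training setups, it can help to express the gain from pretraining as a relative WER reduction. The sketch below does this for the Common Voice only model versus the LibriSpeech and GigaSpeech pretrained model, using the test scores copied from the tables above; the helper name `relative_reduction` is hypothetical, and the scores are assumed to be percentages.

```python
# Test WERs copied from the tables above (assumed to be percentages).
common_voice_only = {
    "greedy search": 10.90,
    "modified beam search": 10.55,
    "fast beam search": 10.75,
}
libri_giga_then_common_voice = {
    "greedy search": 9.95,
    "modified beam search": 9.57,
    "fast beam search": 9.67,
}


def relative_reduction(before: float, after: float) -> float:
    """Relative WER reduction in percent when going from `before` to `after`."""
    return 100.0 * (before - after) / before


reductions = {
    method: relative_reduction(
        common_voice_only[method], libri_giga_then_common_voice[method]
    )
    for method in common_voice_only
}

for method, reduction in reductions.items():
    print(f"{method}: {reduction:.2f}% relative WER reduction")
```

With these numbers, the relative reduction is roughly 9 to 10 percent for every decoding method.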