Volodymyr Kyrylov committed
Commit 0433008
1 Parent(s): 1c65e37

initial import
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ language:
+ - uk
+ tags:
+ - automatic-speech-recognition
+ - audio
+ license: cc-by-nc-sa-4.0
+ datasets:
+ - https://github.com/egorsmkv/speech-recognition-uk
+ - mozilla-foundation/common_voice_10_0
+ metrics:
+ - wer
+ model-index:
+ - name: Ukrainian causal pruned_transducer_stateless5 v1.0.0
+ results:
+ - task:
+ name: Speech Recognition
+ type: automatic-speech-recognition
+ dataset:
+ name: Common Voice uk
+ type: mozilla-foundation/common_voice_10_0
+ split: validation
+ args: uk
+ metrics:
+ - name: Validation WER
+ type: wer
+ value: 17.26
+ ---
+
+ Online variant of `pruned_transducer_stateless5` for Ukrainian: https://github.com/proger/icefall/tree/uk
+
+ Decoding demo using [Sherpa](https://k2-fsa.github.io/sherpa/): [https://twitter.com/darkproger/status/1570733844114046976](https://twitter.com/darkproger/status/1570733844114046976)
+
+ Trained on pseudolabels generated by [darkproger/pruned-transducer-stateless5-ukrainian-1](https://huggingface.co/darkproger/pruned-transducer-stateless5-ukrainian-1) on the training dataset.
+
+ [Tensorboard run](https://tensorboard.dev/experiment/uMmMmZvwS2euyCrj7BlPOQ/)
+
+
+ ```
+ ./pruned_transducer_stateless5/train.py \
+   --world-size 2 \
+   --num-epochs 31 \
+   --start-epoch 1 \
+   --full-libri 1 \
+   --exp-dir pruned_transducer_stateless5/exp-uk-filtered2 \
+   --max-duration 600 \
+   --use-fp16 1 \
+   --num-encoder-layers 18 \
+   --dim-feedforward 1024 \
+   --nhead 4 \
+   --encoder-dim 256 \
+   --decoder-dim 512 \
+   --joiner-dim 512 \
+   --bpe-model uk/data/lang_bpe_250/bpe.model \
+   --causal-convolution True \
+   --dynamic-chunk-training True
+ ```
data/lang_bpe_250/L.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fca162b9c0a7d2263b3be95607063f6ae70ee357b6d7493d8f49a7ea32fbb484
+ size 11433831
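The binary artifacts in this repo (`L.pt`, `bpe.model`, the checkpoints) are stored as Git LFS pointers like the one above: only the spec version, a sha256 oid, and the byte size live in git. A minimal sketch of reading such a pointer (`parse_lfs_pointer` is a hypothetical helper, not part of icefall):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict with keys version, oid, size."""
    # Each pointer line is "key value"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size is the byte count of the real file
    return fields

# The L.pt pointer contents shown above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:fca162b9c0a7d2263b3be95607063f6ae70ee357b6d7493d8f49a7ea32fbb484
size 11433831"""

info = parse_lfs_pointer(pointer)
```

`git lfs pull` (or `huggingface_hub`) resolves the oid to the actual 11 MB lexicon FST before the model can be used.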
data/lang_bpe_250/bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1a465ab230e64d5f2a5cd07ac446f4ce0288925c6019683ee889396094bddfb
+ size 241481
data/lang_bpe_250/tokens.txt ADDED
@@ -0,0 +1,252 @@
+ <blk> 0
+ <sos/eos> 1
+ <unk> 2
+ ▁ 3
+ в 4
+ н 5
+ ти 6
+ т 7
+ й 8
+ е 9
+ с 10
+ м 11
+ ▁на 12
+ к 13
+ р 14
+ у 15
+ д 16
+ ▁з 17
+ ▁с 18
+ о 19
+ ▁в 20
+ ▁по 21
+ х 22
+ ли 23
+ ва 24
+ но 25
+ ро 26
+ я 27
+ і 28
+ ні 29
+ з 30
+ а 31
+ и 32
+ ю 33
+ ра 34
+ ка 35
+ ▁за 36
+ ▁не 37
+ є 38
+ ▁і 39
+ ла 40
+ ▁що 41
+ на 42
+ ки 43
+ ▁до 44
+ ї 45
+ ш 46
+ ж 47
+ ві 48
+ ч 49
+ во 50
+ сь 51
+ ко 52
+ ни 53
+ ло 54
+ ри 55
+ лі 56
+ ль 57
+ ви 58
+ да 59
+ ▁у 60
+ б 61
+ ку 62
+ ▁про 63
+ ці 64
+ ▁ви 65
+ ▁а 66
+ ▁це 67
+ ди 68
+ ▁о 69
+ то 70
+ рі 71
+ п 72
+ ся 73
+ мо 74
+ ми 75
+ ть 76
+ ма 77
+ ц 78
+ г 79
+ ну 80
+ ▁як 81
+ ▁я 82
+ ого 83
+ ▁від 84
+ ре 85
+ ▁та 86
+ ме 87
+ та 88
+ ▁п 89
+ ті 90
+ ле 91
+ те 92
+ ру 93
+ чи 94
+ га 95
+ ▁ко 96
+ ст 97
+ ту 98
+ кі 99
+ по 100
+ ▁ма 101
+ ' 102
+ ▁ми 103
+ ді 104
+ ▁так 105
+ сі 106
+ мі 107
+ бу 108
+ ля 109
+ ▁мі 110
+ не 111
+ ▁при 112
+ ння 113
+ ▁мо 114
+ же 115
+ ду 116
+ щ 117
+ ча 118
+ ▁де 119
+ до 120
+ ому 121
+ ▁г 122
+ ▁к 123
+ нь 124
+ би 125
+ сто 126
+ ▁д 127
+ бі 128
+ ▁го 129
+ ь 130
+ ▁то 131
+ ▁те 132
+ лю 133
+ ють 134
+ че 135
+ го 136
+ де 137
+ бо 138
+ си 139
+ за 140
+ ер 141
+ них 142
+ ▁але 143
+ ста 144
+ ▁роз 145
+ хо 146
+ пи 147
+ пі 148
+ ▁він 149
+ ний 150
+ му 151
+ ▁для 152
+ пе 153
+ ф 154
+ ши 155
+ ▁б 156
+ ▁ш 157
+ л 158
+ ▁україн 159
+ ▁під 160
+ ▁пере 161
+ ▁од 162
+ ше 163
+ ня 164
+ со 165
+ па 166
+ жи 167
+ ▁па 168
+ ▁ба 169
+ ▁ка 170
+ ▁зна 171
+ ять 172
+ ▁ф 173
+ рів 174
+ ▁час 175
+ ▁ре 176
+ ного 177
+ ▁ста 178
+ лу 179
+ ▁його 180
+ ▁ні 181
+ ▁тому 182
+ ба 183
+ ▁сам 184
+ ▁буде 185
+ сті 186
+ ця 187
+ ▁вони 188
+ ▁дуже 189
+ ▁пра 190
+ ха 191
+ ▁нас 192
+ ▁хо 193
+ ться 194
+ ість 195
+ ▁со 196
+ ▁чи 197
+ ▁ді 198
+ ▁коли 199
+ жу 200
+ ▁об 201
+ ▁бо 202
+ чу 203
+ ▁які 204
+ вер 205
+ ▁якщо 206
+ ▁три 207
+ ▁вже 208
+ чі 209
+ жа 210
+ ▁все 211
+ ▁було 212
+ ▁може 213
+ ▁буд 214
+ ▁вона 215
+ ▁два 216
+ гу 217
+ ▁тут 218
+ гі 219
+ увати 220
+ ення 221
+ ▁роб 222
+ ▁зараз 223
+ ▁того 224
+ ▁більш 225
+ ▁тисяч 226
+ ▁один 227
+ ▁перш 228
+ ▁можна 229
+ ▁люди 230
+ ▁цього 231
+ ▁їх 232
+ ▁село 233
+ ▁мене 234
+ ▁раз 235
+ ▁двадцять 236
+ ▁треба 237
+ аємо 238
+ ▁навіть 239
+ ▁рад 240
+ ▁був 241
+ ▁сьогодні 242
+ ▁без 243
+ ▁тільки 244
+ ▁провулок 245
+ ▁сім 246
+ ається 247
+ ▁свої 248
+ ґ 249
+ #0 250
+ #1 251
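`tokens.txt` maps each SentencePiece BPE piece to its integer id: `▁` marks a word boundary, `<blk>` (id 0) is the transducer blank, and `#0`/`#1` are disambiguation symbols for lattice construction. Turning decoded pieces back into text is just concatenation with `▁` restored to spaces — a minimal sketch (the token choices below are illustrative, taken from the table above):

```python
def detokenize(pieces):
    """Join BPE pieces into text, restoring spaces at ▁ word-boundary markers."""
    specials = {"<blk>", "<sos/eos>", "<unk>", "#0", "#1"}
    # Drop non-lexical symbols, concatenate, then turn ▁ into spaces.
    text = "".join(p for p in pieces if p not in specials)
    return text.replace("▁", " ").strip()

# ▁мо (id 114) + же (id 115) from the table above reassemble into "може".
word = detokenize(["▁мо", "же"])
```

In the real decoder this mapping is handled by the `bpe.model` SentencePiece model; the sketch only shows why the `▁` convention makes the inverse mapping lossless.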
data/lang_bpe_250/words.txt ADDED
The diff for this file is too large to render.
exp/cpu_jit.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b5a5ab6a26d88658922b6bcac970794547da8cc7840a11e31da156b0cf91288
+ size 130759206
exp/pretrained.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:616abfdd0656aad03fa79bd5cf7f65cc92333737da49e7b571f8bcb0371e2896
+ size 120505891
log/errs-test-other-beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01-epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model.txt ADDED
The diff for this file is too large to render.
log/log-decode-epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model-2022-09-19-22-54-24 ADDED
@@ -0,0 +1,22 @@
+ 2022-09-19 22:54:24,045 INFO [decode.py:698] Decoding started
+ 2022-09-19 22:54:24,045 INFO [decode.py:704] Device: cpu
+ 2022-09-19 22:54:24,050 INFO [decode.py:719] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'model_warm_step': 3000, 'env_info': {'k2-version': '1.19', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '125d34703f898b5ca54f6f4a925f2bc2d7a5ba98', 'k2-git-date': 'Wed Aug 31 04:50:54 2022', 'lhotse-version': '1.6.0', 'torch-version': '1.12.1+cu113', 'torch-cuda-available': False, 'torch-cuda-version': '11.3', 'python-version': '3.8', 'icefall-git-branch': 'uk', 'icefall-git-sha1': '42c4476-dirty', 'icefall-git-date': 'Thu Sep 15 16:29:29 2022', 'icefall-path': '/home/proger/icefall', 'k2-path': '/home/proger/.local/lib/python3.8/site-packages/k2-1.19.dev20220916+cuda11.3.torch1.12.1-py3.8-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/home/proger/.local/lib/python3.8/site-packages/lhotse/__init__.py', 'hostname': 'rt', 'IP address': '127.0.1.1'}, 'epoch': 31, 'iter': 0, 'avg': 12, 'use_averaged_model': True, 'exp_dir': PosixPath('pruned_transducer_stateless5/exp-uk-filtered2'), 'bpe_model': 'uk/data/lang_bpe_250/bpe.model', 'lang_dir': PosixPath('uk/data/lang_bpe_250'), 'decoding_method': 'fast_beam_search_nbest_LG', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'simulate_streaming': False, 'decode_chunk_size': 16, 'left_context': 64, 'num_encoder_layers': 18, 'dim_feedforward': 1024, 'nhead': 4, 'encoder_dim': 256, 'decoder_dim': 512, 'joiner_dim': 512, 'dynamic_chunk_training': True, 'causal_convolution': True, 'short_chunk_size': 25, 'num_left_chunks': 4, 'full_libri': True, 'manifest_dir': PosixPath('uk/data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('pruned_transducer_stateless5/exp-uk-filtered2/fast_beam_search_nbest_LG'), 'suffix': 'epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 250}
+ 2022-09-19 22:54:24,050 INFO [decode.py:721] About to create model
+ 2022-09-19 22:54:24,162 INFO [decode.py:788] Calculating the averaged model over epoch range from 19 (excluded) to 31
+ 2022-09-19 22:54:25,982 WARNING [decode.py:816] No uk/data/lang_bpe_250/LG.pt - using a trivial graph without a word table
+ 2022-09-19 22:54:25,989 INFO [decode.py:832] Number of model parameters: 30053246
+ 2022-09-19 22:54:25,989 INFO [asr_datamodule_uk.py:422] About to get ('train-other-shuffled-filtered2',) cuts
+ 2022-09-19 22:54:26,653 INFO [asr_datamodule_uk.py:441] About to get test-other cuts
+ 2022-09-19 22:54:43,726 INFO [decode.py:596] batch 0/?, cuts processed until now is 29
+ 2022-09-19 22:59:48,659 INFO [decode.py:596] batch 20/?, cuts processed until now is 659
+ 2022-09-19 23:04:53,039 INFO [decode.py:596] batch 40/?, cuts processed until now is 1323
+ 2022-09-19 23:09:53,214 INFO [decode.py:596] batch 60/?, cuts processed until now is 2057
+ 2022-09-19 23:14:57,627 INFO [decode.py:596] batch 80/?, cuts processed until now is 2763
+ 2022-09-19 23:17:57,558 INFO [decode.py:614] The transcripts are stored in pruned_transducer_stateless5/exp-uk-filtered2/fast_beam_search_nbest_LG/recogs-test-other-beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01-epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model.txt
+ 2022-09-19 23:17:57,589 INFO [utils.py:428] [test-other-beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01] %WER 17.26% [4188 / 24269, 513 ins, 578 del, 3097 sub ]
+ 2022-09-19 23:17:57,668 INFO [decode.py:627] Wrote detailed error stats to pruned_transducer_stateless5/exp-uk-filtered2/fast_beam_search_nbest_LG/errs-test-other-beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01-epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model.txt
+ 2022-09-19 23:17:57,669 INFO [decode.py:644]
+ For test-other, WER of different settings are:
+ beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01 17.26 best for test-other
+
+ 2022-09-19 23:17:57,669 INFO [decode.py:887] Done!
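The `%WER 17.26% [4188 / 24269, 513 ins, 578 del, 3097 sub ]` line above follows the standard definition: word error rate is insertions plus deletions plus substitutions, divided by the number of reference words. A quick check of the log's arithmetic:

```python
def wer(insertions, deletions, substitutions, ref_words):
    """Word error rate: total edit operations over reference word count."""
    return (insertions + deletions + substitutions) / ref_words

# Figures taken from the decode log above: 513 ins, 578 del, 3097 sub, 24269 words.
rate = wer(513, 578, 3097, 24269)  # 4188 errors total
```

This is the same 17.26 reported in the model card's `model-index` and in `wer-summary-test-other-*.txt` below.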
log/log-train-2022-09-17-00-39-59-0 ADDED
The diff for this file is too large to render.
log/log-train-2022-09-17-00-39-59-1 ADDED
The diff for this file is too large to render.
log/log-train-2022-09-19-08-50-42-0 ADDED
The diff for this file is too large to render.
log/log-train-2022-09-19-08-50-42-1 ADDED
The diff for this file is too large to render.
log/recogs-test-other-beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01-epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model.txt ADDED
The diff for this file is too large to render.
log/wer-summary-test-other-beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01-epoch-31-avg-12-beam-20.0-max-contexts-8-max-states-64-nbest-scale-0.5-num-paths-200-ngram-lm-scale-0.01-use-averaged-model.txt ADDED
@@ -0,0 +1,2 @@
+ settings	WER
+ beam_20.0_max_contexts_8_max_states_64_num_paths_200_nbest_scale_0.5_ngram_lm_scale_0.01	17.26