Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ tags:
 - whisper
 - russian
 datasets:
-- mozilla-foundation/
+- mozilla-foundation/common_voice_17_0
 metrics:
 - wer
 ---
@@ -16,9 +16,10 @@ metrics:
 
 This is a version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) finetuned for better support of Russian language.
 
-Dataset used for finetuning is Common Voice
+Dataset used for finetuning is Common Voice 17.0, Russian part, that contains over 200k rows.
 
-After preprocessing of the original dataset (
+After preprocessing of the original dataset (all splits were mixed and splited to a new train + test split by 0.95/0.05,
+that is 225761/11883 rows respectively) the original Whisper v3 has WER 9.84 while the finetuned version shows 6.39 (so far).
 
 ## Usage
 
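The split and WER figures quoted in the added README lines can be sanity-checked with a short sketch. The commit does not say which tool produced the split, so `resplit` and `relative_wer_reduction` are hypothetical helpers written for illustration, and rounding the test size up is an assumption (it happens to reproduce the quoted 225761/11883 counts). Integers stand in for the actual audio rows.

```python
import math
import random


def resplit(rows, test_fraction=0.05, seed=0):
    """Shuffle the pooled rows and cut them into (train, test) lists."""
    pooled = list(rows)
    random.Random(seed).shuffle(pooled)
    # Assumption: the test size is rounded up, as common split helpers do.
    n_test = math.ceil(len(pooled) * test_fraction)
    return pooled[n_test:], pooled[:n_test]


def relative_wer_reduction(baseline_wer, finetuned_wer):
    """Fraction by which WER dropped relative to the baseline."""
    return (baseline_wer - finetuned_wer) / baseline_wer


# Row counts quoted in the README; integers stand in for audio rows.
total_rows = 225761 + 11883
train, test = resplit(range(total_rows))
print(len(train), len(test))  # 225761 11883
print(f"{relative_wer_reduction(9.84, 6.39):.1%}")  # 35.1%
```

Under these assumptions, a 0.05 test fraction of the 237644 pooled rows gives the same train/test sizes as the README, and the drop from WER 9.84 to 6.39 is roughly a 35% relative reduction.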