Commit 272cbc6
1 Parent(s): 610accf
Update README.md
README.md CHANGED
@@ -130,7 +130,8 @@ accelerate launch --mixed_precision=bf16 run_distillation.py \
   --overwrite_output_dir \
   --predict_with_generate \
   --freeze_encoder \
-  --streaming
+  --streaming \
+  --push_to_hub
 ```
 
 On a single 80GB A100 GPU, training will take approximately 3.5 days (or 85 hours), and reach a final WER of 6.3%. Tensorboard logs can be found under the tab [Training Metrics](https://huggingface.co/sanchit-gandhi/distil-whisper-large-v3-de-kd/tensorboard?params=scalars#frame).