sanchit-gandhi/distil-whisper-large-v3-de-kd


sanchit-gandhi (HF staff) committed on Dec 22, 2023

Commit 272cbc6 • 1 Parent(s): 610accf

Update README.md

Files changed (1)

README.md +2 -1

@@ -130,7 +130,8 @@ accelerate launch --mixed_precision=bf16 run_distillation.py \
   --overwrite_output_dir \
   --predict_with_generate \
   --freeze_encoder \
-  --streaming
+  --streaming \
+  --push_to_hub
 ```
 On a single 80GB A100 GPU, training will take approximately 3.5 days (or 85 hours), and reach a final WER of 6.3%. Tensorboard logs can be found under the tab [Training Metrics](https://huggingface.co/sanchit-gandhi/distil-whisper-large-v3-de-kd/tensorboard?params=scalars#frame).