thennal/whisper-medium-ml

Automatic Speech Recognition

thennal committed on Jan 1

Commit 1397979 • 1 Parent(s): 4b77f7a

Add usage instructions and colab link

Files changed (1)

README.md +16 -0

@@ -47,6 +47,22 @@ Note that Whisper's normalization has major issues for languages like Malayalam,
 With normalization (for a fair comparison with other models on this platform), the results are instead:
 - WER: 11.49
+[This Colab](https://colab.research.google.com/github/sanchit-gandhi/notebooks/blob/main/fine_tune_whisper.ipynb) can be used as a starting point to further finetune the model.
+## Usage instructions
+Given an audio sample `audio` (this can be anything from a numpy array to a filepath), the following code generates transcriptions:
+```python
+from transformers import pipeline, WhisperProcessor
+processor = WhisperProcessor.from_pretrained("thennal/whisper-medium-ml")
+forced_decoder_ids = processor.get_decoder_prompt_ids(language="ml", task="transcribe")
+asr = pipeline(
+        "automatic-speech-recognition", model="thennal/whisper-medium-ml", device=0,
+    )
+transcription = asr(audio, chunk_length_s=30, max_new_tokens=448, return_timestamps=False,  generate_kwargs={
+        "forced_decoder_ids": forced_decoder_ids,
+        "do_sample": True,
+    })
+```
 ## Model description
 More information needed
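
The snippet added in this commit hard-codes `device=0`, which assumes a CUDA GPU is present, and it does not unpack the pipeline result. As a minimal sketch (not part of the commit), the variant below falls back to CPU when no GPU is available and returns the transcribed string. The helper names `pick_device` and `transcribe` are hypothetical; the model ID and generation arguments are copied from the diff above. Note that `forced_decoder_ids` may be deprecated or rejected in newer `transformers` releases, so treat this as an illustration for the library version the commit targeted.

```python
def pick_device(cuda_available: bool) -> int:
    """Return a pipeline device index: GPU 0 if CUDA is available, else -1 (CPU)."""
    return 0 if cuda_available else -1


def transcribe(audio, model_id: str = "thennal/whisper-medium-ml") -> str:
    """Transcribe `audio` (a numpy array or a file path) and return the text.

    Hypothetical wrapper around the calls shown in the README diff.
    """
    # Imported lazily so pick_device() can be used without these packages installed.
    import torch
    from transformers import WhisperProcessor, pipeline

    processor = WhisperProcessor.from_pretrained(model_id)
    # Force Malayalam transcription, as in the committed snippet.
    forced_decoder_ids = processor.get_decoder_prompt_ids(language="ml", task="transcribe")

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=pick_device(torch.cuda.is_available()),
    )
    result = asr(
        audio,
        chunk_length_s=30,
        max_new_tokens=448,
        return_timestamps=False,
        generate_kwargs={
            "forced_decoder_ids": forced_decoder_ids,
            "do_sample": True,
        },
    )
    # The ASR pipeline returns a dict; the transcription is under the "text" key.
    return result["text"]
```

Calling `transcribe("sample.wav")` (with `sample.wav` as a placeholder path) would download the model on first use and return a single string. Because `do_sample` is `True`, as in the commit, repeated calls on the same audio may produce slightly different transcriptions.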