jimregan/wav2vec2-large-xlsr-upper-sorbian-mixed

jimregan committed on Mar 28, 2021

Commit c89db03 • 1 Parent(s): 0d51599

Update README.md

Files changed (1)

README.md +4 -2


@@ -27,7 +27,9 @@ model-index:
 # Wav2Vec2-Large-XLSR-Upper-Sorbian
 Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
-on the [Upper Sorbian Common Voice dataset](https://huggingface.co/datasets/common_voice).
+on the [Upper Sorbian Common Voice dataset](https://huggingface.co/datasets/common_voice), with an
+extra 28 minutes of audio from an online [Sorbian course](https://sprachkurs.sorbischlernen.de/).
 When using this model, make sure that your speech input is sampled at 16kHz.
 ## Usage
@@ -75,7 +77,7 @@ wer = load_metric("wer")
 processor = Wav2Vec2Processor.from_pretrained("jimregan/wav2vec2-large-xlsr-upper-sorbian-mixed")
 model = Wav2Vec2ForCTC.from_pretrained("jimregan/wav2vec2-large-xlsr-upper-sorbian-mixed")
 model.to("cuda")
-chars_to_ignore_regex = '[\\\\,\\\\?\\\\.\\\\!\\\\-\\\\;\\\\:\\\\"\\\\“\\\\%\\\\‘\\\\”\\\\�„«»–]'
+chars_to_ignore_regex = '[\\\\\\\\,\\\\\\\\?\\\\\\\\.\\\\\\\\!\\\\\\\\-\\\\\\\\;\\\\\\\\:\\\\\\\\"\\\\\\\\“\\\\\\\\%\\\\\\\\‘\\\\\\\\”\\\\\\\\�„«»–]'
 resampler = torchaudio.transforms.Resample(48_000, 16_000)
 # Preprocessing the datasets.
 # We need to read the audio files as arrays
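
The diff cuts off before `chars_to_ignore_regex` is used. As a rough illustration only, the sketch below shows how a character class of this kind is commonly applied to transcripts before computing WER. The pattern here is a simplified, single-escaped stand-in written as a raw string. It is an assumption, not the exact string from either side of the diff, and `normalize_transcript` is a hypothetical helper name.

```python
import re

# Simplified stand-in for chars_to_ignore_regex (assumption: not the exact
# pattern from the diff). A raw string keeps each escape to one backslash.
CHARS_TO_IGNORE = re.compile(r'[,?.!\-;:"“%‘”„«»–]')


def normalize_transcript(sentence: str) -> str:
    """Remove the ignored punctuation, then lowercase and trim the text."""
    return CHARS_TO_IGNORE.sub("", sentence).lower().strip()


if __name__ == "__main__":
    print(normalize_transcript("„Witaj, swěto!“"))
```

Using a raw string avoids stacking escape layers, which may be what the growing backslash runs in the two versions of the line reflect.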