Shona Text-to-Speech
This repository contains the Shona (sna) language text-to-speech (TTS) model checkpoint.
Model Details
Model Description
- Developed by: Fastino Mateteva
- Model type: Text to Speech
- Language(s) (NLP): Shona
- Finetuned from model: SpeechT5
Usage
First, install the required libraries:
pip install --upgrade transformers accelerate torch scipy
Then run inference with the following code snippet:
# Load the tokenizer and model from the Hugging Face Hub
import torch
from transformers import AutoTokenizer, AutoModelForTextToWaveform
tokenizer = AutoTokenizer.from_pretrained("Fastino06/ff")
model = AutoModelForTextToWaveform.from_pretrained("Fastino06/ff")
text = "some example text in the Shona language"
inputs = tokenizer(text, return_tensors="pt")
# Gradients are not needed for inference
with torch.no_grad():
    output = model(**inputs).waveform
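The resulting waveform can be saved as a .wav file:
import scipy.io.wavfile
# Drop the batch dimension and convert the tensor to a NumPy array before writing
scipy.io.wavfile.write("fassy.wav", rate=model.config.sampling_rate, data=output.squeeze().cpu().numpy())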
Or displayed in a Jupyter Notebook / Google Colab:
from IPython.display import Audio
# Pass a NumPy array rather than the raw tensor
Audio(output.squeeze().cpu().numpy(), rate=model.config.sampling_rate)
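If SciPy is not available, the waveform can also be written with Python's standard-library wave module. The following is a minimal sketch, not part of the original model card: the helper name write_wav_pcm16 is hypothetical, it assumes the waveform has already been flattened to a list of floats in the range [-1.0, 1.0] (for example with output.squeeze().tolist()), and the 16000 Hz rate in the demo is a placeholder for model.config.sampling_rate.

```python
import struct
import wave


def write_wav_pcm16(path, samples, sampling_rate):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Hypothetical helper for illustration; returns the number of samples written.
    """
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range so the scaled value fits in a signed 16-bit integer
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(bytes(frames))
    return len(samples)


# Stand-in data so the sketch runs without downloading the model.
# With the model loaded as above, pass output.squeeze().tolist() and
# model.config.sampling_rate instead.
demo_samples = [0.0, 0.5, -0.5, 1.0, -1.0]
n_written = write_wav_pcm16("demo.wav", demo_samples, 16000)
```

Converting to 16-bit integers keeps the file readable by most audio players, at the cost of quantizing the model's floating-point output.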
BibTeX citation
This model was developed by Fastino Mateteva.