The pipeline of this repo supports multiple voice uploading options; you can choose one or more, depending on the data you have.
1. Short audio clips packed into a single `.zip` file, whose structure should be as shown below:
```
Your-zip-file.zip
├───Character_name_1
│   ├───xxx.wav
│   ├───...
│   ├───yyy.mp3
│   └───zzz.wav
├───Character_name_2
│   ├───xxx.wav
│   ├───...
│   ├───yyy.mp3
│   └───zzz.wav
├───...
│
└───Character_name_n
    ├───xxx.wav
    ├───...
    ├───yyy.mp3
    └───zzz.wav
```
Note that the format of the audio files does not matter, as long as they are audio files.
Quality requirement: each clip should be between 2 and 10 seconds long and contain as little background sound as possible.
Quantity requirement: at least 10 clips per character; 20 or more per character is recommended.
2. Long audio files named after the character, each containing a single character's voice only. Background sound is
acceptable since it will be removed automatically. File name format: `{CharacterName}_{random_number}.wav`
(e.g. `Diana_234135.wav`, `MinatoAqua_234252.wav`); the files must be `.wav` files.
3. Long video files named after the character, each containing a single character's voice only. Background sound is
acceptable since it will be removed automatically. File name format: `{CharacterName}_{random_number}.mp4`
(e.g. `Taffy_332452.mp4`, `Dingzhen_957315.mp4`); the files must be `.mp4` files.
Note: `CharacterName` must consist of English characters only. `random_number` is mandatory and distinguishes multiple
files for the same character; it can be any random integer between 0 and 999999.
4. A `.txt` file containing one `{CharacterName}|{video_url}` entry per line, formatted as follows:
```
Char1|https://xyz.com/video1/
Char2|https://xyz.com/video2/
Char2|https://xyz.com/video3/
Char3|https://xyz.com/video4/
```
Each video should contain a single speaker only. Video links from bilibili are currently supported; other websites have not been tested yet.
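To catch formatting mistakes in the `.txt` file from option 4 before uploading, the lines can be checked locally. The snippet below is a minimal sketch and not part of this repo's pipeline: `parse_video_list` is a hypothetical helper name, and requiring an `http` or `https` URL is an assumption based on the sample above.

```python
from collections import defaultdict
from urllib.parse import urlparse


def parse_video_list(text):
    """Parse lines of `{CharacterName}|{video_url}` into {name: [urls]}."""
    videos = defaultdict(list)
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first "|" only, so the URL part is kept whole.
        name, sep, url = line.partition("|")
        name, url = name.strip(), url.strip()
        if not sep or not name or not url:
            raise ValueError(f"line {line_no}: expected CharacterName|video_url")
        # Assumption: links are plain http(s) URLs, as in the sample.
        if urlparse(url).scheme not in ("http", "https"):
            raise ValueError(f"line {line_no}: not an http(s) URL: {url}")
        videos[name].append(url)
    return dict(videos)


sample = """\
Char1|https://xyz.com/video1/
Char2|https://xyz.com/video2/
Char2|https://xyz.com/video3/
Char3|https://xyz.com/video4/
"""
video_map = parse_video_list(sample)
```

Grouping URLs by character makes it easy to see how many videos each character has, since one character may appear on several lines.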
Have questions about the data format? Find data samples for all formats [here](https://drive.google.com/file/d/132l97zjanpoPY4daLgqXoM7HKXPRbS84/view?usp=sharing).
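Options 1 to 3 can also be sanity-checked locally before uploading. The following is a rough sketch under stated assumptions, not code from this repo: `parse_long_file_name` and `count_clips_per_character` are hypothetical helper names, `CharacterName` is assumed to mean ASCII letters only, and `random_number` is assumed to have 1 to 6 digits. Clip duration is not checked here, since that would require decoding the audio.

```python
import io
import re
import zipfile
from collections import Counter

# {CharacterName}_{random_number}.wav or .mp4.
# Assumptions: the name is ASCII letters only, the number has 1 to 6 digits.
NAME_PATTERN = re.compile(r"([A-Za-z]+)_(\d{1,6})\.(wav|mp4)")


def parse_long_file_name(file_name):
    """Return (character, number, extension), or None if the name does not match."""
    match = NAME_PATTERN.fullmatch(file_name)
    if match is None:
        return None
    character, number, extension = match.groups()
    return character, int(number), extension


def count_clips_per_character(zip_source):
    """Count the files under each top-level character folder of the zip."""
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            parts = info.filename.strip("/").split("/")
            if info.is_dir() or len(parts) < 2:
                continue  # skip folders and files outside a character folder
            counts[parts[0]] += 1
    return dict(counts)


# Build a tiny in-memory zip to demonstrate the count.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for i in range(12):
        archive.writestr(f"Character_name_1/clip_{i}.wav", b"")
    archive.writestr("Character_name_2/xxx.wav", b"")
buffer.seek(0)

counts = count_clips_per_character(buffer)
# Characters below the minimum of 10 clips stated above.
too_few = [name for name, n in counts.items() if n < 10]
```

With a real upload, `count_clips_per_character("Your-zip-file.zip")` can be called with a file path instead of the in-memory buffer, and `parse_long_file_name` can be run over the long audio and video file names to spot ones that do not follow the expected pattern.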