This language-independent wav2vec2 classification model is based on the [Nonverbal Vocalization Dataset](https://github.com/deeplyinc/Nonverbal-Vocalization-Dataset).

Sound classes are:

- teeth-chattering
- teeth-grinding
- tongue-clicking
- nose-blowing
- coughing
- yawning
- throat clearing
- sighing
- lip-popping
- lip-smacking
- panting
- crying
- laughing
- sneezing
- moaning
- screaming

See *inference.py* for an inference example.
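
For orientation, here is a minimal sketch of what inference with a wav2vec2 audio classifier can look like. It is not a copy of *inference.py*. It assumes the checkpoint loads with the Hugging Face `transformers` audio-classification classes, that the input is resampled to 16 kHz, and that the class names are stored in the model config's `id2label`. The names `classify_file`, `top_label`, and `model_dir` are hypothetical and used only for illustration.

```python
import math

# Assumption: wav2vec2 checkpoints usually expect 16 kHz mono audio.
SAMPLE_RATE = 16_000


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify_file(path, model_dir):
    """Classify one audio file. `model_dir` is a placeholder for the
    local path or hub ID of the checkpoint (not given in this README)."""
    # Heavy dependencies are imported lazily so the helpers above
    # can be used without them.
    import librosa
    import torch
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    extractor = AutoFeatureExtractor.from_pretrained(model_dir)
    model = AutoModelForAudioClassification.from_pretrained(model_dir)
    model.eval()

    # Load and resample the recording to the rate the model expects.
    waveform, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    inputs = extractor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    return top_label(logits, model.config.id2label)
```

A call such as `classify_file("sample.wav", model_dir)` would return a `(label, probability)` pair, where the label is one of the sound classes listed above if the config carries them.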