---
license: cc-by-nc-sa-4.0
language:
  - en
  - de
  - zh
  - fr
  - nl
  - el
  - it
  - es
  - my
  - he
  - sv
  - fa
  - tr
  - ur
library_name: transformers
pipeline_tag: audio-classification
tags:
  - Speech Emotion Recognition
  - SER
  - Transformer
  - HuBERT
  - PyTorch
---

# ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets

Authors: Shahin Amiriparian, Filip Packań, Maurice Gerczuk, Björn W. Schuller

Fine-tuned HuBERT Large on EmoSet++, which comprises 37 datasets, totaling 150,907 samples with a cumulative duration of 119.5 hours. The model expects a 3-second raw waveform resampled to 16 kHz. The original 6 output classes are combinations of low/high arousal and negative/neutral/positive valence. Further details are available in the corresponding paper.
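
Below is a minimal sketch of how a clip could be brought into this format, assuming torchaudio is available and using a hypothetical file path `audio.wav`:

```python
import torch
import torchaudio

# Hypothetical input file; replace with your own recording.
waveform, sr = torchaudio.load("audio.wav")

# Downmix to mono and resample to the expected 16 kHz.
waveform = waveform.mean(dim=0)
if sr != 16000:
    waveform = torchaudio.functional.resample(waveform, orig_freq=sr, new_freq=16000)

# Trim or zero-pad to exactly 3 seconds (48,000 samples).
target_len = 3 * 16000
if waveform.shape[0] > target_len:
    waveform = waveform[:target_len]
else:
    waveform = torch.nn.functional.pad(waveform, (0, target_len - waveform.shape[0]))
```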

Note: This model is for research purposes only.

EmoSet++ subsets used for fine-tuning the model:

| | | | | |
|:---|:---|:---|:---|:---|
| ABC | AD | BES | CASIA | CVE |
| Crema-D | DES | DEMoS | EA-ACT | EA-BMW |
| EA-WSJ | EMO-DB | EmoFilm | EmotiW-2014 | EMOVO |
| eNTERFACE | ESD | EU-EmoSS | EU-EV | FAU Aibo |
| GEMEP | GVESS | IEMOCAP | MES | MESD |
| MELD | PPMMK | RAVDESS | SAVEE | ShEMO |
| SmartKom | SIMIS | SUSAS | SUBSECO | TESS |
| TurkishEmo | Urdu | | | |

## Usage

```python
import torch
import torch.nn as nn
from transformers import HubertForSequenceClassification, Wav2Vec2FeatureExtractor

# CONFIG and MODEL SETUP
model_name = 'amiriparian/HuBERT-EmoSet'
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
model = HubertForSequenceClassification.from_pretrained(model_name)
# Replace the classification head with a 6-class output layer
# (low/high arousal x negative/neutral/positive valence).
model.classifier = nn.Linear(in_features=256, out_features=6)

sampling_rate = 16000
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
```
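
A minimal inference sketch building on the setup above; the random tensor stands in for a real 3-second, 16 kHz mono recording, and the mapping of the six logits to specific arousal/valence combinations is not specified in this card:

```python
# Stand-in for a real 3-second clip at 16 kHz (48,000 samples).
waveform = torch.randn(sampling_rate * 3).numpy()

inputs = feature_extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
input_values = inputs.input_values.to(device)

model.eval()
with torch.no_grad():
    logits = model(input_values).logits  # shape: (1, 6)

predicted_class = torch.argmax(logits, dim=-1).item()
print(f"Predicted class index: {predicted_class}")
```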

## Citation Info

```bibtex
@inproceedings{Amiriparian24-EEH,
  author = {Shahin Amiriparian and Filip Packan and Maurice Gerczuk and Bj\"orn W.\ Schuller},
  title = {{ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets}},
  booktitle = {{Proc. INTERSPEECH}},
  year = {2024},
  editor = {},
  volume = {},
  series = {},
  address = {Kos Island, Greece},
  month = {September},
  publisher = {ISCA},
}
```