amiriparian committed on
Commit
2738a8c
1 Parent(s): d27d836

Update README.md

  1. README.md +80 -0
README.md CHANGED
@@ -1,3 +1,83 @@
1
  ---
2
  license: cc-by-nc-sa-4.0
3
  ---
1
  ---
2
  license: cc-by-nc-sa-4.0
3
+ language:
4
+ - en
5
+ - de
6
+ - zh
7
+ - fr
8
+ - nl
9
+ - el
10
+ - it
11
+ library_name: transformers
12
+ pipeline_tag: audio-classification
13
+ tags:
14
+ - HuBERT
15
+ - Speech Emotion Recognition
16
+ - SER
17
+ - PyTorch
18
  ---
19
+
20
+ # **ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets**
21
+ Authors: Shahin Amiriparian, Filip Packań, Maurice Gerczuk, Björn W. Schuller
22
+
23
+ Fine-tuned [**HuBERT Large**](https://huggingface.co/facebook/hubert-large-ls960-ft) on EmoSet++, comprising 37 datasets, totaling 150,907 samples and spanning a cumulative duration of 119.5 hours.
24
+ The model expects a 3-second raw waveform resampled to 16 kHz. The original six output classes are combinations of low/high arousal and negative/neutral/positive
25
+ valence.
26
+ Further details are available in the corresponding [**paper**](https://arxiv.org/).
27
+
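To make the six-class layout concrete, the sketch below enumerates the arousal/valence combinations. The names `CLASS_GRID` and `describe` are hypothetical, and the index order is an assumption made for illustration only: the model card does not state which output index corresponds to which combination, so check the paper before relying on it.

```python
from itertools import product

# Assumed label axes; the index order below is hypothetical and
# not confirmed by the model card.
AROUSAL = ("low", "high")
VALENCE = ("negative", "neutral", "positive")

# Cartesian product: 2 arousal levels x 3 valence levels = 6 classes.
CLASS_GRID = list(product(AROUSAL, VALENCE))


def describe(index):
    """Return a readable label for a class index under the assumed order."""
    arousal, valence = CLASS_GRID[index]
    return f"{arousal} arousal, {valence} valence"


for i in range(len(CLASS_GRID)):
    print(i, describe(i))
```
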
28
+ **Note**: This model is for research purposes only.
29
+
30
+ ### EmoSet++ subsets used for fine-tuning the model:
31
+
32
+ | | | | | |
33
+ | :---: | :---: | :---: | :---: | :---: |
34
+ | ABC | AD | BES | CASIA | CVE |
35
+ | Crema-D | DES | DEMoS | EA-ACT | EA-BMW |
36
+ | EA-WSJ | EMO-DB | EmoFilm | EmotiW-2014 | EMOVO |
37
+ | eNTERFACE | ESD | EU-EmoSS | EU-EV | FAU Aibo |
38
+ | GEMEP | GVESS | IEMOCAP | MES | MESD |
39
+ | MELD | PPMMK | RAVDESS | SAVEE | ShEMO |
40
+ | SmartKom | SIMIS | SUSAS | SUBSECO | TESS |
41
+ | TurkishEmo | Urdu | | | |
42
+
43
+
44
+
45
+ ### Usage
46
+
47
+ ```python
48
+ import torch
49
+ import torch.nn as nn
50
+ from transformers import HubertForSequenceClassification, Wav2Vec2FeatureExtractor
51
+
52
+
53
+
54
+ # CONFIG and MODEL SETUP
55
+ model_name = '.../HuBERT-EmoSet++'
56
+ feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
57
+ model = HubertForSequenceClassification.from_pretrained(model_name)
58
+ model.classifier = nn.Linear(in_features=256, out_features=6)  # 6 output classes (arousal x valence)
59
+
60
+ sampling_rate = 16000  # Hz; the model expects 16 kHz input
61
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
62
+ model = model.to(device)
63
+
64
+
65
+ ```
66
+
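Because the model expects a 3-second waveform at 16 kHz, clips of other lengths need to be cropped or padded to 48,000 samples before inference. The helper below is a minimal sketch of that step in plain Python. `fit_to_length` is a hypothetical name, and zero-padding at the end is an assumed choice; the model card does not say how shorter clips were handled during training.

```python
SAMPLING_RATE = 16000  # Hz, as stated in the model card
CLIP_SECONDS = 3       # expected input duration in seconds
TARGET_LEN = SAMPLING_RATE * CLIP_SECONDS  # 48000 samples


def fit_to_length(waveform, target_len=TARGET_LEN):
    """Crop or zero-pad a mono waveform (sequence of floats) to target_len."""
    samples = list(waveform)
    if len(samples) >= target_len:
        # Keep only the first target_len samples.
        return samples[:target_len]
    # Assumed strategy: append zeros (silence) at the end.
    return samples + [0.0] * (target_len - len(samples))


short_clip = [0.1] * SAMPLING_RATE        # 1 second of audio
long_clip = [0.1] * (5 * SAMPLING_RATE)   # 5 seconds of audio
print(len(fit_to_length(short_clip)), len(fit_to_length(long_clip)))
```

The resulting list of floats could then be passed to the feature extractor together with `sampling_rate=sampling_rate` and `return_tensors="pt"`. With NumPy or PyTorch arrays the same cropping and padding would normally be done with slicing and a padding function rather than Python lists.
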
67
+ ### Citation Info
68
+
69
+
70
+ ```
71
+ @inproceedings{Amiriparian24-EEH,
72
+ author = {Shahin Amiriparian and Filip Packan and Maurice Gerczuk and Bj\"orn W.\ Schuller},
73
+ title = {{ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets}},
74
+ booktitle = {{Proc. INTERSPEECH}},
75
+ year = {2024},
76
+ editor = {},
77
+ volume = {},
78
+ series = {},
79
+ address = {Kos Island, Greece},
80
+ month = {September},
81
+ publisher = {ISCA},
82
+ }
83
+ ```