Wav2Vec2_Fine_tuned_on_RAVDESS_2_Speech_Emotion_Recognition
This model is a fine-tuned version of jonatasgrosman/wav2vec2-large-xlsr-53-english.
The original pre-trained model was fine-tuned on the RAVDESS dataset. Its speech portion provides 1440 recordings of actors performing 8 different emotions in English:
emotions = ['angry', 'calm', 'disgust', 'fearful', 'happy', 'neutral', 'sad', 'surprised']
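A classification head works with integer class ids, and RAVDESS encodes the emotion in the third dash-separated field of each filename. The sketch below builds `label2id` and `id2label` from the emotion list and decodes a filename. The id order (position in the list) is an assumption, since this card does not state how labels were encoded, and `emotion_from_filename` is a hypothetical helper name.

```python
from pathlib import Path

emotions = ['angry', 'calm', 'disgust', 'fearful', 'happy', 'neutral', 'sad', 'surprised']

# Assumption: class ids follow the order of the list above.
label2id = {name: idx for idx, name in enumerate(emotions)}
id2label = {idx: name for name, idx in label2id.items()}

# RAVDESS filename convention: the third dash-separated field is the emotion code.
RAVDESS_EMOTION_CODES = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def emotion_from_filename(path):
    """Return the emotion name encoded in a RAVDESS filename (hypothetical helper)."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style filename: {path}")
    return RAVDESS_EMOTION_CODES[fields[2]]

example = "Actor_12/03-01-06-01-02-01-12.wav"
emotion = emotion_from_filename(example)
print(emotion, label2id[emotion])
```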
The model achieves the following results on the evaluation set:
- Loss: 0.5638
- Accuracy: 0.8125
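For reference, these two metrics are typically computed from the classifier's logits as mean cross-entropy and top-1 accuracy. A minimal sketch in plain Python follows; the function names and the toy logits are illustrative and are not taken from the training script.

```python
import math

def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]

def evaluate(batch_logits, targets):
    """Return (mean loss, accuracy) over a list of logit vectors."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    correct = sum(
        max(range(len(l)), key=l.__getitem__) == t
        for l, t in zip(batch_logits, targets)
    )
    return sum(losses) / len(losses), correct / len(targets)

# Toy example with 8 classes: uniform logits give the chance-level loss ln(8).
uniform = [[0.0] * 8]
loss, acc = evaluate(uniform, [3])
print(round(loss, 4))  # 2.0794
```

The chance-level value ln(8) ≈ 2.0794 is close to the first validation loss reported in the training results (2.0715), which is consistent with an 8-way classifier that has not yet learned anything.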
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
- mixed_precision_training: Native AMP
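Two of these values follow from the others: the total train batch size is the per-device batch size times the gradient accumulation steps, and the linear scheduler decays the learning rate from its initial value toward zero over training. The sketch below illustrates both, assuming a single device and no warmup (neither is stated in the list). The total of 432 optimizer steps is an estimate based on the training log, which reaches epoch 2.9861 at step 430.

```python
learning_rate = 1e-4
train_batch_size = 4
gradient_accumulation_steps = 2
num_devices = 1          # assumption: single-device training
warmup_steps = 0         # assumption: no warmup is listed in the card
total_steps = 432        # assumption: 3 epochs x ~144 optimizer steps per epoch

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

def linear_lr(step):
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)

print(total_train_batch_size)
for step in (0, 216, 432):
    print(step, linear_lr(step))
```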
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
2.1085 | 0.0694 | 10 | 2.0715 | 0.1701 |
2.043 | 0.1389 | 20 | 2.0531 | 0.1944 |
2.0038 | 0.2083 | 30 | 1.9162 | 0.3056 |
1.9217 | 0.2778 | 40 | 1.8085 | 0.3264 |
1.7814 | 0.3472 | 50 | 1.6440 | 0.3611 |
1.5997 | 0.4167 | 60 | 1.5428 | 0.3681 |
1.5293 | 0.4861 | 70 | 1.4812 | 0.4062 |
1.5473 | 0.5556 | 80 | 1.3423 | 0.4826 |
1.5098 | 0.625 | 90 | 1.3632 | 0.4653 |
1.1967 | 0.6944 | 100 | 1.3762 | 0.4618 |
1.2255 | 0.7639 | 110 | 1.3456 | 0.4618 |
1.6152 | 0.8333 | 120 | 1.3206 | 0.4826 |
1.1365 | 0.9028 | 130 | 1.3343 | 0.4792 |
1.1254 | 0.9722 | 140 | 1.2481 | 0.4792 |
1.3486 | 1.0417 | 150 | 1.4024 | 0.4688 |
1.2029 | 1.1111 | 160 | 1.1053 | 0.5556 |
1.0734 | 1.1806 | 170 | 1.1238 | 0.6181 |
1.029 | 1.25 | 180 | 1.3111 | 0.5347 |
1.0955 | 1.3194 | 190 | 1.0256 | 0.6146 |
0.8893 | 1.3889 | 200 | 0.9970 | 0.6389 |
0.8874 | 1.4583 | 210 | 0.9895 | 0.6389 |
0.9227 | 1.5278 | 220 | 0.8335 | 0.6667 |
0.7566 | 1.5972 | 230 | 0.8839 | 0.6944 |
0.8062 | 1.6667 | 240 | 0.8070 | 0.7118 |
0.6773 | 1.7361 | 250 | 0.7592 | 0.7222 |
0.7874 | 1.8056 | 260 | 1.1098 | 0.6285 |
0.8262 | 1.875 | 270 | 0.6952 | 0.7569 |
0.568 | 1.9444 | 280 | 0.7635 | 0.7326 |
0.6914 | 2.0139 | 290 | 0.6607 | 0.7917 |
0.6838 | 2.0833 | 300 | 0.8466 | 0.7049 |
0.6318 | 2.1528 | 310 | 0.6612 | 0.8056 |
0.604 | 2.2222 | 320 | 0.9257 | 0.6667 |
0.5321 | 2.2917 | 330 | 0.6067 | 0.7986 |
0.3421 | 2.3611 | 340 | 0.6594 | 0.7535 |
0.3536 | 2.4306 | 350 | 0.6525 | 0.7812 |
0.3087 | 2.5 | 360 | 0.6412 | 0.7812 |
0.4236 | 2.5694 | 370 | 0.6560 | 0.7812 |
0.5134 | 2.6389 | 380 | 0.6614 | 0.7882 |
0.5709 | 2.7083 | 390 | 0.5989 | 0.8021 |
0.2912 | 2.7778 | 400 | 0.6142 | 0.7951 |
0.516 | 2.8472 | 410 | 0.5926 | 0.7986 |
0.3835 | 2.9167 | 420 | 0.5797 | 0.8125 |
0.4055 | 2.9861 | 430 | 0.5638 | 0.8125 |
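The log also allows a few sanity checks on the data split. The arithmetic below recovers the steps per epoch from the Epoch and Step columns and searches for the smallest evaluation-set size consistent with the reported accuracies. The resulting figures are inferences from the table, not values stated by the author, and only a handful of rows are copied in.

```python
# (epoch, step, accuracy) rows copied from the table above.
rows = [
    (0.0694, 10, 0.1701),
    (0.1389, 20, 0.1944),
    (0.2083, 30, 0.3056),
    (2.9861, 430, 0.8125),
]
total_train_batch_size = 8  # from the hyperparameter list

# Steps per epoch: step / epoch, taken from the last (most precise) row.
last_epoch, last_step, _ = rows[-1]
steps_per_epoch = round(last_step / last_epoch)
train_samples = steps_per_epoch * total_train_batch_size

def fits(n, accuracy, tol=5e-5):
    """True if some count of correct predictions out of n rounds to `accuracy`."""
    return abs(round(accuracy * n) / n - accuracy) < tol

# Smallest evaluation-set size for which every accuracy is a valid fraction.
eval_size = next(n for n in range(1, 2000) if all(fits(n, acc) for _, _, acc in rows))

print(steps_per_epoch, train_samples, eval_size)
```

Under these assumptions the numbers suggest 144 optimizer steps per epoch, about 1152 training clips and 288 evaluation clips, which would correspond to an 80/20 split of 1440 recordings.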
Framework versions
- Transformers 4.41.0.dev0
- Pytorch 2.2.1+cu121
- Datasets 2.19.1.dev0
- Tokenizers 0.19.1