metadata

language:
  - ru
license: apache-2.0
tags:
  - sentiment
  - emotion-classification
  - multilabel
  - multiclass
datasets:
  - Djacon/ru_goemotions
metrics:
  - accuracy
widget:
  - text: Очень рад тебя видеть!
  - text: Как дела?
  - text: Мне немного отвратно это делать
  - text: Я испытал мурашки от страха
  - text: Нет ничего радостного в этих горьких новостях
  - text: Ого, не ожидал тебя здесь увидеть!
  - text: Фу ну и мерзость
  - text: Мне неприятно общение с тобой
base_model: ai-forever/ruBert-base
model-index:
  - name: ruBert-base-russian-emotions-classifier-goEmotions
    results:
      - task:
          type: multilabel-text-classification
          name: Multilabel Text Classification
        dataset:
          name: ru_goemotions
          type: Djacon/ru_goemotions
          args: ru
        metrics:
          - type: roc_auc
            value: 92%
            name: multilabel ROC AUC

ruBert-base-russian-emotions-classifier-goEmotions

This model is a fine-tuned version of ai-forever/ruBert-base on Djacon/ru_goemotions. It achieves the following results on the evaluation set (2nd epoch):

Loss: 0.1757
AUC: 0.9240
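
Because the model is a multilabel classifier, each label is normally scored independently with a sigmoid rather than a softmax, so a text can receive several emotions or none. The following is a minimal sketch of that decoding step only, not the author's inference code. The label order is an assumption taken from the per-label results table in this card, the 0.5 threshold is an assumption, and the logits are made up for illustration.

```python
import math

# Assumed label order, copied from the per-label table in this card.
# Check the model config (id2label) before relying on it.
LABELS = ["joy", "interest", "surprise", "sadness", "anger",
          "disgust", "fear", "guilt", "neutral"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow for large negative logits
    return e / (1.0 + e)


def decode(logits: list[float], threshold: float = 0.5) -> dict[str, float]:
    """Return every label whose probability reaches the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = {label: sigmoid(z) for label, z in zip(LABELS, logits)}
    return {label: p for label, p in probs.items() if p >= threshold}


# Hypothetical logits for one sentence, not real model output.
example_logits = [2.1, -0.3, 0.4, -3.0, -2.5, -4.0, -3.2, -5.0, -1.0]
predicted = decode(example_logits)
print(sorted(predicted))
```

A lower threshold trades precision for recall on the rarer labels; the value that works best for this model is not stated in the card.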

Per-label quality of the predicted probabilities on the test set:

label	joy	interest	surprise	sadness	anger	disgust	fear	guilt	neutral	average
AUC	0.9369	0.9213	0.9325	0.8791	0.8374	0.9041	0.9470	0.9758	0.8518	0.9095
F1-micro	0.9528	0.9157	0.9697	0.9284	0.8690	0.9658	0.9851	0.9875	0.7654	0.9266
F1-macro	0.8369	0.7922	0.7561	0.7392	0.7351	0.7356	0.8176	0.8247	0.7650	0.7781
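
The "average" column appears to be the unweighted mean of the nine per-label scores. The sketch below recomputes it from the rounded values in the table; it is a sanity check under that assumption, not the author's evaluation script.

```python
from statistics import mean

# Per-label scores copied from the table above, in column order (joy ... neutral).
rows = {
    "AUC": [0.9369, 0.9213, 0.9325, 0.8791, 0.8374, 0.9041, 0.9470, 0.9758, 0.8518],
    "F1-micro": [0.9528, 0.9157, 0.9697, 0.9284, 0.8690, 0.9658, 0.9851, 0.9875, 0.7654],
    "F1-macro": [0.8369, 0.7922, 0.7561, 0.7392, 0.7351, 0.7356, 0.8176, 0.8247, 0.7650],
}

# Unweighted mean over labels, rounded to the table's four decimals.
averages = {name: round(mean(scores), 4) for name, scores in rows.items()}

for name, value in averages.items():
    print(f"{name}: {value:.4f}")
```

The AUC and F1-micro rows reproduce the reported 0.9095 and 0.9266. The F1-macro row comes out one unit lower in the last digit than the reported 0.7781, presumably because the published per-label values were already rounded.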

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
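
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, which the card does not state, and takes the 1685 steps per epoch from the training results table; the function name `linear_lr` is illustrative.

```python
BASE_LR = 5e-5          # learning_rate from the list above
STEPS_PER_EPOCH = 1685  # from the training results table
NUM_EPOCHS = 3
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = 0        # assumption: no warmup is mentioned in the card


def linear_lr(step: int) -> float:
    """Linear warmup to BASE_LR, then linear decay to zero at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the start and at the end of each epoch.
for epoch in range(NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate has already dropped by a third at the end of the first epoch and reaches zero at the final step.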

Training results

Training Loss	Epoch	Step	Validation Loss	AUC
0.1755	1.0	1685	0.1717	0.9220
0.1391	2.0	3370	0.1757	0.9240
0.0899	3.0	5055	0.2088	0.9106
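
The table shows that the best epoch depends on the selection criterion. A small sketch that picks the checkpoint by each metric, using the rows above as data (the `EpochResult` type is illustrative):

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    auc: float


# Rows copied from the training results table.
history = [
    EpochResult(1, 1685, 0.1755, 0.1717, 0.9220),
    EpochResult(2, 3370, 0.1391, 0.1757, 0.9240),
    EpochResult(3, 5055, 0.0899, 0.2088, 0.9106),
]

best_by_auc = max(history, key=lambda r: r.auc)
best_by_loss = min(history, key=lambda r: r.val_loss)

print(f"best AUC:             epoch {best_by_auc.epoch} ({best_by_auc.auc:.4f})")
print(f"lowest validation loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss:.4f})")
```

Selecting by AUC gives epoch 2, which matches the epoch reported at the top of this card, while validation loss is lowest at epoch 1. Training loss keeps falling as validation loss rises through epoch 3, which is consistent with overfitting.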

Framework versions

Transformers 4.24.0
PyTorch 2.0.1
Datasets 2.12.0
Tokenizers 0.11.0