# Model Card: DistilBERT with LoRA for Text Classification

Model Details

Model Name: DistilBERT with LoRA for Text Classification
Model Type: Transformer-based Language Model
Base Model: distilbert-base-multilingual-cased
Fine-tuning Framework: LoRA (Low-Rank Adaptation of Large Language Models)
Trained By: ABODO Brice Donald
License: Apache 2.0

This model is a fine-tuned version of distilbert-base-multilingual-cased on the Russian-language news dataset described under "Intended uses & limitations". It achieves the following results on the evaluation set:

  • Loss: 1.0019
  • Accuracy: 0.8276
  • F1: 0.8284
  • Precision: 0.8317
  • Recall: 0.8276

Model description

This model is a fine-tuned version of distilbert-base-multilingual-cased for text classification tasks. It was adapted with LoRA (Low-Rank Adaptation), which freezes the base model's weights and trains only small low-rank update matrices. Fine-tuning therefore updates far fewer parameters and needs less memory and compute than full fine-tuning.
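
To give a feel for the savings, the sketch below counts the weights touched in a single 768 x 768 linear layer (768 is the DistilBERT hidden size) under full fine-tuning versus a LoRA update of the form ΔW = B·A. The rank `r = 8` is a hypothetical value chosen for illustration; this card does not state the rank that was actually used.

```python
# Illustrative parameter count for one LoRA-adapted linear layer.
# Assumption: rank = 8 is hypothetical; the card does not state the rank.

def full_params(d_in: int, d_out: int) -> int:
    """Weights updated when the whole d_out x d_in matrix is fine-tuned."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Weights in the two low-rank factors: A is r x d_in, B is d_out x r."""
    return r * d_in + d_out * r


hidden = 768  # DistilBERT hidden size
rank = 8      # hypothetical LoRA rank

full = full_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
```

Under these assumptions the adapter trains roughly 2% of the weights of that layer; the exact figure for this model depends on the rank and on which layers were adapted.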

Intended uses & limitations

The model was trained and evaluated on a Russian-language news dataset consisting of news texts labeled as positive, negative, or neutral. The dataset is split into training and test sets for evaluation.

Intended Use

This model is intended for text classification, particularly three-class sentiment analysis (positive, negative, neutral) in which each text receives exactly one label. It can be fine-tuned further for other classification tasks by using an appropriate dataset and changing the number of labels.

Limitations and Risks

  • Bias: The model may inherit biases present in the training data.
  • Generalization: Performance may vary on datasets with different distributions from the training data.
  • Resource Usage: Although more efficient than larger models, fine-tuning and inference still require significant computational resources.

Training and evaluation data

The model was evaluated using the following metrics:

  • Accuracy: Measures the fraction of correct predictions.
  • F1 Score: Harmonic mean of precision and recall.
  • Precision: Proportion of positive identifications that are actually correct.
  • Recall: Proportion of actual positives that are correctly identified.
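
In every row of the results reported in this card, recall equals accuracy, which is what support-weighted averaging over classes produces. The sketch below therefore assumes weighted averaging; the card does not confirm which averaging mode was used, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # class share of the evaluation set
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical, not taken from the dataset)
y_true = ["negative", "neutral", "positive", "positive", "neutral", "negative"]
y_pred = ["negative", "neutral", "positive", "neutral", "neutral", "positive"]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

With weighted averaging, the per-class recalls are weighted by class share, so the sum collapses to the fraction of correct predictions, which is why the two columns coincide.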

Training procedure

Preprocessing

  • Tokenization: The text data was tokenized using the DistilBertTokenizer with a maximum length of 512 tokens.
  • Padding and Truncation: Applied to ensure uniform input size.
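
The effect of padding and truncation on a single sequence can be illustrated in plain Python. This is a simplified sketch, not the tokenizer's implementation: the helper name `pad_or_truncate`, the padding id `0`, and the token ids are assumptions for illustration, and a real tokenizer also keeps the closing special token when it truncates.

```python
def pad_or_truncate(token_ids, max_length=512, pad_id=0):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    ids = list(token_ids[:max_length])   # truncate anything past max_length
    mask = [1] * len(ids)                # 1 marks real tokens
    padding = max_length - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding  # 0 marks padding


# Arbitrary ids, shortened max_length so the result is easy to read
short_ids, short_mask = pad_or_truncate([101, 7592, 102], max_length=5)
long_ids, long_mask = pad_or_truncate(list(range(10)), max_length=5)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

The attention mask is what lets the model ignore the padded positions, so padded and unpadded inputs yield the same prediction for the real tokens.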

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0009143508688456378
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 7
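
As a rough illustration of the linear scheduler, the sketch below computes the learning rate at a given optimizer step. The total of 637 steps is taken from the last row of the training results (7 epochs of 91 steps); `warmup_steps=0` is an assumption, since no warmup value is listed above, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=0.0009143508688456378,
              total_steps=637, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Then decay linearly from base_lr to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 91, 546, 637):
    print(step, f"{linear_lr(step):.6f}")
```

Under these assumptions the rate starts at the configured value, falls by one seventh of it per epoch, and reaches zero at the final step.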

Training results

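| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|---------------|-------|------|-----------------|----------|--------|-----------|--------|
| No log        | 1.0   | 91   | 0.5987          | 0.7634   | 0.7621 | 0.7648    | 0.7634 |
| No log        | 2.0   | 182  | 0.3768          | 0.8693   | 0.8698 | 0.8767    | 0.8693 |
| No log        | 3.0   | 273  | 0.2620          | 0.9065   | 0.9063 | 0.9093    | 0.9065 |
| No log        | 4.0   | 364  | 0.2427          | 0.9202   | 0.9203 | 0.9220    | 0.9202 |
| No log        | 5.0   | 455  | 0.2244          | 0.9367   | 0.9369 | 0.9387    | 0.9367 |
| 0.3641        | 6.0   | 546  | 0.2385          | 0.9491   | 0.9491 | 0.9495    | 0.9491 |
| 0.3641        | 7.0   | 637  | 0.2560          | 0.9464   | 0.9464 | 0.9465    | 0.9464 |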

How to Use

import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from peft import PeftConfig, PeftModel

# Load the tokenizer and the adapter configuration
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model_id = 'pyteach237/multilabel_lora_distilbert_runews_classifier_tuned'
config = PeftConfig.from_pretrained(model_id)

# Load the base model with a 3-class head, then attach the LoRA adapter
model = DistilBertForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=3
)
model = PeftModel.from_pretrained(model, model_id, config=config)
model.eval()  # inference mode: disables dropout

text = "Your text here :)"

# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding='max_length', max_length=512)

# Make predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Convert predictions to labels
labels = ['negative', 'neutral', 'positive']
predicted_label = labels[predictions.item()]
print(f'Predicted label: {predicted_label}')
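
If a confidence score is wanted alongside the label, the logits can be turned into probabilities with a softmax. The sketch below is a standalone illustration in plain Python; the logit values are hypothetical, and in the snippet above they would come from something like `outputs.logits[0].tolist()`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ['negative', 'neutral', 'positive']
example_logits = [-1.2, 0.3, 2.1]  # hypothetical values, not model output
probs = softmax(example_logits)
best = max(range(len(labels)), key=probs.__getitem__)
print(labels[best], round(probs[best], 3))
```

The label chosen this way is the same one `argmax` picks on the raw logits, because softmax preserves their ordering.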

Acknowledgements

This model card template was inspired by the Hugging Face model cards. Special thanks to the contributors of the Hugging Face transformers library and the LoRA adaptation framework.

Contact Information

For further information, please contact Brice Donald at [email protected].

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • PyTorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1