---
license: mit
base_model: xlm-roberta-large
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: XLM_RoBERTa-Multilingual-Clickbait-Detection
  results: []
datasets:
- christinacdl/clickbait_detection_dataset
language:
- en
- el
- it
- es
- ro
- de
- fr
- pl
pipeline_tag: text-classification
---

# XLM_RoBERTa-Multilingual-Clickbait-Detection

This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the `christinacdl/clickbait_detection_dataset` dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2192
- Micro F1: 0.9759
- Macro F1: 0.9758
- Accuracy: 0.9759

## Test Set Macro-F1 scores (%)

- Multilingual test set: 97.28
- en test set: 97.83
- el test set: 97.32
- it test set: 97.54
- es test set: 97.67
- ro test set: 97.40
- de test set: 97.40
- fr test set: 96.90
- pl test set: 96.18

## Intended uses & limitations

- This model is intended for use in an EU project.

## Training and evaluation data

- The "clickbait_detection_dataset" was translated from English into Greek, Italian, Spanish, Romanian, French and German using Opus-MT models.
- The dataset was also translated from English into Polish using the M2M NMT model.
- The NMT models were run through the "EasyNMT" library.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4

### Framework versions

- Transformers 4.36.1
- Pytorch 2.1.0+cu121
- Datasets 2.13.1
- Tokenizers 0.15.0
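
### Effective batch size and learning-rate schedule (illustrative)

The following is a minimal sketch of how the training hyperparameters listed above relate to one another. It shows how `total_train_batch_size: 32` follows from `train_batch_size: 16` and `gradient_accumulation_steps: 2`, and what a `linear` schedule does to the learning rate over training. Several things here are assumptions rather than facts from this card: training on a single device, zero warmup steps, and the training set size `NUM_TRAIN_EXAMPLES`, which is a hypothetical placeholder. The helper names are illustrative and are not part of any library.

```python
# Values copied from the "Training hyperparameters" list.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
NUM_EPOCHS = 4

# ASSUMPTION: hypothetical training set size, used only for illustration.
# The model card does not state the real number of training examples.
NUM_TRAIN_EXAMPLES = 32_000


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer update.

    ASSUMPTION: num_devices defaults to 1, which is consistent with 16 * 2 = 32.
    """
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero.

    ASSUMPTION: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)
steps_per_epoch = NUM_TRAIN_EXAMPLES // total_batch
total_steps = steps_per_epoch * NUM_EPOCHS

print("effective batch size:", total_batch)  # 32
print("optimizer steps (hypothetical dataset size):", total_steps)
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, LEARNING_RATE))
```

With these assumptions the learning rate starts at `1e-05`, is roughly halved at the midpoint of training, and reaches zero at the final optimizer step. The real step count depends on the actual dataset size, which this card does not report.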