---
license: mit
base_model: FacebookAI/xlm-roberta-large
model-index:
- name: xlm-roberta-large-finetuned-wikiner-fr
results: []
datasets:
- Alizee/wikiner_fr_mixed_caps
pipeline_tag: token-classification
language:
- fr
library_name: transformers
---
# xlm-roberta-large-finetuned-wikiner-fr
This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the [Alizee/wikiner_fr_mixed_caps](https://huggingface.co/datasets/Alizee/wikiner_fr_mixed_caps) dataset.
## Why this model?
Credit to [Jean-Baptiste](https://huggingface.co/Jean-Baptiste) for building the current "best" model for French NER, [camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner), trained on wikiNER ([Jean-Baptiste/wikiner_fr](https://huggingface.co/datasets/Jean-Baptiste/wikiner_fr)).

In my own tasks, the xlm-roberta-large models fine-tuned on CoNLL-2003 [English](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-english) and especially [German](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-german) outperformed camembert-ner. This inspired me to build a French counterpart of those models on the wikiNER dataset, in the hope of setting a slightly improved standard for French 4-entity NER.
## Intended uses & limitations
4-entity NER for French, with the following tags (a usage sketch follows the table):

| Abbreviation | Description               |
|--------------|---------------------------|
| O            | Outside of a named entity |
| MISC         | Miscellaneous entity      |
| PER          | Person's name             |
| ORG          | Organization              |
| LOC          | Location                  |
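For quick experimentation, the model can be loaded through the `transformers` `pipeline` API. A minimal sketch, assuming the model is published under the repo id `Alizee/xlm-roberta-large-finetuned-wikiner-fr` (inferred from the card title and the dataset namespace; the example sentence is illustrative):

```python
from transformers import pipeline

# Repo id below is an assumption inferred from the card title and dataset namespace.
ner = pipeline(
    "token-classification",
    model="Alizee/xlm-roberta-large-finetuned-wikiner-fr",
    aggregation_strategy="simple",  # merge word pieces back into whole entities
)

print(ner("Emmanuel Macron a visité le siège de l'ONU à New York."))
# Expected entity types: PER (Emmanuel Macron), ORG (ONU), LOC (New York)
```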
## Performance
The model achieves the following results on the evaluation set (a scoring sketch follows the list):
- Loss: 0.0518
- Precision: 0.8881
- Recall: 0.9014
- F1: 0.8947
- Accuracy: 0.9855
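Precision, recall, and F1 here are span-level NER metrics. A minimal sketch of how such scores are conventionally computed, assuming the `seqeval` library (the card does not name the metric implementation, so this illustrates the convention rather than the exact evaluation code):

```python
# Span-level NER scoring; using seqeval is an assumption, as the card
# does not state which metric implementation was used.
from seqeval.metrics import classification_report

y_true = [["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]]
y_pred = [["B-PER", "I-PER", "O", "B-ORG", "O", "O"]]  # missed the LOC span

print(classification_report(y_true, y_pred))
```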
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1.5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3
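For reference, these settings map onto `transformers.TrainingArguments` roughly as below. This is a sketch, not the exact training script; `output_dir` and any argument not listed above are illustrative assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-finetuned-wikiner-fr",  # assumed name
    learning_rate=1.5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    # Adam with betas=(0.9, 0.999) and eps=1e-8 is the Trainer's default optimizer
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=3,
)
```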
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.1032 | 0.1 | 374 | 0.0853 | 0.7645 | 0.8170 | 0.7899 | 0.9742 |
| 0.0767 | 0.2 | 748 | 0.0721 | 0.8111 | 0.8423 | 0.8264 | 0.9785 |
| 0.074 | 0.3 | 1122 | 0.0655 | 0.8252 | 0.8502 | 0.8375 | 0.9797 |
| 0.0634 | 0.4 | 1496 | 0.0629 | 0.8423 | 0.8694 | 0.8556 | 0.9809 |
| 0.0605 | 0.5 | 1870 | 0.0610 | 0.8515 | 0.8711 | 0.8612 | 0.9808 |
| 0.0578 | 0.6 | 2244 | 0.0594 | 0.8633 | 0.8744 | 0.8688 | 0.9822 |
| 0.0592 | 0.7 | 2618 | 0.0555 | 0.8624 | 0.8833 | 0.8727 | 0.9825 |
| 0.0567 | 0.8 | 2992 | 0.0534 | 0.8626 | 0.8838 | 0.8731 | 0.9830 |
| 0.0522 | 0.9 | 3366 | 0.0563 | 0.8560 | 0.8771 | 0.8664 | 0.9818 |
| 0.0516 | 1.0 | 3739 | 0.0556 | 0.8702 | 0.8869 | 0.8785 | 0.9831 |
| 0.0438 | 1.0 | 3740 | 0.0558 | 0.8712 | 0.8873 | 0.8792 | 0.9831 |
| 0.0395 | 1.1 | 4114 | 0.0565 | 0.8696 | 0.8856 | 0.8775 | 0.9830 |
| 0.0371 | 1.2 | 4488 | 0.0536 | 0.8762 | 0.8910 | 0.8835 | 0.9838 |
| 0.0403 | 1.3 | 4862 | 0.0531 | 0.8709 | 0.8887 | 0.8797 | 0.9835 |
| 0.0366 | 1.4 | 5236 | 0.0517 | 0.8791 | 0.8912 | 0.8851 | 0.9843 |
| 0.037 | 1.5 | 5610 | 0.0510 | 0.8830 | 0.8936 | 0.8883 | 0.9847 |
| 0.0368 | 1.6 | 5984 | 0.0492 | 0.8795 | 0.8940 | 0.8867 | 0.9845 |
| 0.0359 | 1.7 | 6358 | 0.0501 | 0.8833 | 0.8986 | 0.8909 | 0.9850 |
| 0.034 | 1.8 | 6732 | 0.0496 | 0.8852 | 0.8986 | 0.8918 | 0.9852 |
| 0.0327 | 1.9 | 7106 | 0.0512 | 0.8762 | 0.8948 | 0.8854 | 0.9843 |
| 0.0325 | 2.0 | 7478 | 0.0512 | 0.8829 | 0.8945 | 0.8887 | 0.9844 |
| 0.01 | 2.0 | 7480 | 0.0512 | 0.8836 | 0.8945 | 0.8890 | 0.9843 |
| 0.0232 | 2.1 | 7854 | 0.0526 | 0.8870 | 0.9002 | 0.8936 | 0.9852 |
| 0.0235 | 2.2 | 8228 | 0.0530 | 0.8841 | 0.8983 | 0.8911 | 0.9848 |
| 0.0211 | 2.3 | 8602 | 0.0542 | 0.8875 | 0.9008 | 0.8941 | 0.9852 |
| 0.0235 | 2.4 | 8976 | 0.0525 | 0.8883 | 0.9008 | 0.8945 | 0.9855 |
| 0.0232 | 2.5 | 9350 | 0.0525 | 0.8874 | 0.9013 | 0.8943 | 0.9855 |
| 0.0238 | 2.6 | 9724 | 0.0517 | 0.8861 | 0.9011 | 0.8935 | 0.9854 |
| 0.0223 | 2.7 | 10098 | 0.0513 | 0.8893 | 0.9016 | 0.8954 | 0.9856 |
| 0.0226 | 2.8 | 10472 | 0.0517 | 0.8892 | 0.9017 | 0.8954 | 0.9856 |
| 0.0228 | 2.9 | 10846 | 0.0517 | 0.8879 | 0.9013 | 0.8945 | 0.9855 |
| 0.0235 | 3.0 | 11217 | 0.0518 | 0.8881 | 0.9014 | 0.8947 | 0.9855 |
### Framework versions
- Transformers 4.36.2
- Pytorch 2.0.1
- Datasets 2.16.1
- Tokenizers 0.15.0