---
license: mit
base_model: FacebookAI/xlm-roberta-large
model-index:
- name: xlm-roberta-large-finetuned-wikiner-fr
results: []
datasets:
- Alizee/wikiner_fr_mixed_caps
pipeline_tag: token-classification
language:
- fr
library_name: transformers
---
# xlm-roberta-large-finetuned-wikiner-fr
This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the [Alizee/wikiner_fr_mixed_caps](https://huggingface.co/datasets/Alizee/wikiner_fr_mixed_caps) dataset.
## Why this model?
Credit to [Jean-Baptiste](https://huggingface.co/Jean-Baptiste) for building the current "best" model for French NER, [camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner), trained on wikiNER ([Jean-Baptiste/wikiner_fr](https://huggingface.co/datasets/Jean-Baptiste/wikiner_fr)).

In my own tasks, the xlm-roberta-large models fine-tuned on CoNLL-2003 [English](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-english) and especially [German](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-german) outperformed camembert-ner. This inspired me to build a French counterpart of those models on the wikiNER dataset, in the hope of setting a slightly improved standard for French 4-entity NER.
## Intended uses & limitations
4-entity NER for French, with the following tags (a usage sketch follows the table):

| Abbreviation | Description               |
|--------------|---------------------------|
| O            | Outside of a named entity |
| MISC         | Miscellaneous entity      |
| PER          | Person's name             |
| ORG          | Organization              |
| LOC          | Location                  |
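For quick experimentation, the model can be loaded through the `transformers` `pipeline` API. A minimal sketch, assuming the model is published under the repo id `Alizee/xlm-roberta-large-finetuned-wikiner-fr` (inferred from the card title and the dataset namespace; the example sentence is illustrative):

```python
from transformers import pipeline

# Repo id below is an assumption inferred from the card title and dataset namespace.
ner = pipeline(
    "token-classification",
    model="Alizee/xlm-roberta-large-finetuned-wikiner-fr",
    aggregation_strategy="simple",  # merge word pieces back into whole entities
)

print(ner("Emmanuel Macron a visité le siège de l'ONU à New York."))
# Expected entity types: PER (Emmanuel Macron), ORG (ONU), LOC (New York)
```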
## Performance
The model achieves the following results on the evaluation set (a scoring sketch follows the list):
- Loss: 0.0518
- Precision: 0.8881
- Recall: 0.9014
- F1: 0.8947
- Accuracy: 0.9855
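Precision, recall, and F1 here are span-level NER metrics. A minimal sketch of how such scores are conventionally computed, assuming the `seqeval` library (the card does not name the metric implementation, so this illustrates the convention rather than the exact evaluation code):

```python
# Span-level NER scoring; using seqeval is an assumption, as the card
# does not state which metric implementation was used.
from seqeval.metrics import classification_report

y_true = [["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]]
y_pred = [["B-PER", "I-PER", "O", "B-ORG", "O", "O"]]  # missed the LOC span

print(classification_report(y_true, y_pred))
```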
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1.5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3
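For reference, these settings map onto `transformers.TrainingArguments` roughly as below. This is a sketch, not the exact training script; `output_dir` and any argument not listed above are illustrative assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-finetuned-wikiner-fr",  # assumed name
    learning_rate=1.5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    # Adam with betas=(0.9, 0.999) and eps=1e-8 is the Trainer's default optimizer
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=3,
)
```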
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.1032 | 0.1 | 374 | 0.0853 | 0.7645 | 0.8170 | 0.7899 | 0.9742 |
| 0.0767 | 0.2 | 748 | 0.0721 | 0.8111 | 0.8423 | 0.8264 | 0.9785 |
| 0.074 | 0.3 | 1122 | 0.0655 | 0.8252 | 0.8502 | 0.8375 | 0.9797 |
| 0.0634 | 0.4 | 1496 | 0.0629 | 0.8423 | 0.8694 | 0.8556 | 0.9809 |
| 0.0605 | 0.5 | 1870 | 0.0610 | 0.8515 | 0.8711 | 0.8612 | 0.9808 |
| 0.0578 | 0.6 | 2244 | 0.0594 | 0.8633 | 0.8744 | 0.8688 | 0.9822 |
| 0.0592 | 0.7 | 2618 | 0.0555 | 0.8624 | 0.8833 | 0.8727 | 0.9825 |
| 0.0567 | 0.8 | 2992 | 0.0534 | 0.8626 | 0.8838 | 0.8731 | 0.9830 |
| 0.0522 | 0.9 | 3366 | 0.0563 | 0.8560 | 0.8771 | 0.8664 | 0.9818 |
| 0.0516 | 1.0 | 3739 | 0.0556 | 0.8702 | 0.8869 | 0.8785 | 0.9831 |
| 0.0438 | 1.0 | 3740 | 0.0558 | 0.8712 | 0.8873 | 0.8792 | 0.9831 |
| 0.0395 | 1.1 | 4114 | 0.0565 | 0.8696 | 0.8856 | 0.8775 | 0.9830 |
| 0.0371 | 1.2 | 4488 | 0.0536 | 0.8762 | 0.8910 | 0.8835 | 0.9838 |
| 0.0403 | 1.3 | 4862 | 0.0531 | 0.8709 | 0.8887 | 0.8797 | 0.9835 |
| 0.0366 | 1.4 | 5236 | 0.0517 | 0.8791 | 0.8912 | 0.8851 | 0.9843 |
| 0.037 | 1.5 | 5610 | 0.0510 | 0.8830 | 0.8936 | 0.8883 | 0.9847 |
| 0.0368 | 1.6 | 5984 | 0.0492 | 0.8795 | 0.8940 | 0.8867 | 0.9845 |
| 0.0359 | 1.7 | 6358 | 0.0501 | 0.8833 | 0.8986 | 0.8909 | 0.9850 |
| 0.034 | 1.8 | 6732 | 0.0496 | 0.8852 | 0.8986 | 0.8918 | 0.9852 |
| 0.0327 | 1.9 | 7106 | 0.0512 | 0.8762 | 0.8948 | 0.8854 | 0.9843 |
| 0.0325 | 2.0 | 7478 | 0.0512 | 0.8829 | 0.8945 | 0.8887 | 0.9844 |
| 0.01 | 2.0 | 7480 | 0.0512 | 0.8836 | 0.8945 | 0.8890 | 0.9843 |
| 0.0232 | 2.1 | 7854 | 0.0526 | 0.8870 | 0.9002 | 0.8936 | 0.9852 |
| 0.0235 | 2.2 | 8228 | 0.0530 | 0.8841 | 0.8983 | 0.8911 | 0.9848 |
| 0.0211 | 2.3 | 8602 | 0.0542 | 0.8875 | 0.9008 | 0.8941 | 0.9852 |
| 0.0235 | 2.4 | 8976 | 0.0525 | 0.8883 | 0.9008 | 0.8945 | 0.9855 |
| 0.0232 | 2.5 | 9350 | 0.0525 | 0.8874 | 0.9013 | 0.8943 | 0.9855 |
| 0.0238 | 2.6 | 9724 | 0.0517 | 0.8861 | 0.9011 | 0.8935 | 0.9854 |
| 0.0223 | 2.7 | 10098 | 0.0513 | 0.8893 | 0.9016 | 0.8954 | 0.9856 |
| 0.0226 | 2.8 | 10472 | 0.0517 | 0.8892 | 0.9017 | 0.8954 | 0.9856 |
| 0.0228 | 2.9 | 10846 | 0.0517 | 0.8879 | 0.9013 | 0.8945 | 0.9855 |
| 0.0235 | 3.0 | 11217 | 0.0518 | 0.8881 | 0.9014 | 0.8947 | 0.9855 |
### Framework versions
- Transformers 4.36.2
- Pytorch 2.0.1
- Datasets 2.16.1
- Tokenizers 0.15.0