Alizee/xlm-roberta-large-finetuned-wikiner-fr

Alizee committed on Jan 22

Commit 0f6a056 • 1 Parent(s): 2691dde

Update README.md

Files changed (1)

README.md +30 -21

@@ -1,16 +1,15 @@
 ---
 license: mit
 base_model: xlm-roberta-large
-tags:
-- generated_from_trainer
-metrics:
-- precision
-- recall
-- f1
-- accuracy
 model-index:
 - name: xlm-roberta-large-finetuned-wikiner-fr
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -18,27 +17,37 @@ should probably proofread and complete it, then remove this comment. -->
 # xlm-roberta-large-finetuned-wikiner-fr
-This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the Alizee/wikiner_fr_mixed_caps dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0518
-- Precision: 0.8881
-- Recall: 0.9014
-- F1: 0.8947
-- Accuracy: 0.9855
-## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
@@ -95,4 +104,4 @@ The following hyperparameters were used during training:
 - Transformers 4.36.2
 - Pytorch 2.0.1
 - Datasets 2.16.1
-- Tokenizers 0.15.0

 ---
 license: mit
 base_model: xlm-roberta-large
 model-index:
 - name: xlm-roberta-large-finetuned-wikiner-fr
   results: []
+datasets:
+- Alizee/wikiner_fr_mixed_caps
+pipeline_tag: token-classification
+language:
+- fr
+library_name: transformers
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 # xlm-roberta-large-finetuned-wikiner-fr
+This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the [Alizee/wikiner_fr_mixed_caps](https://huggingface.co/datasets/Alizee/wikiner_fr_mixed_caps) dataset.
+## Why this model?
+Credit goes to [Jean-Baptiste](https://huggingface.co/Jean-Baptiste) for building [camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner), the current "best" model for French NER, based on wikiNER ([Jean-Baptiste/wikiner_fr](https://huggingface.co/datasets/Jean-Baptiste/wikiner_fr)).
+xlm-roberta-large models fine-tuned on conll03 [English](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-english) and especially [German](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-german) outperformed camembert-ner on my own tasks. This inspired me to build a French version of the xlm-roberta-large models based on the wikiNER dataset, in the hope of creating a slightly improved standard for French 4-entity NER.
 ## Intended uses & limitations
+4-entity NER for French, with the following tags:
+Abbreviation|Description
+-|-
+O |Outside of a named entity
+MISC |Miscellaneous entity
+PER |Person’s name
+ORG |Organization
+LOC |Location
+## Performance
+The model achieves the following results on the evaluation set:
+- Loss: 0.0518
+- Precision: 0.8881
+- Recall: 0.9014
+- F1: 0.8947
+- Accuracy: 0.9855
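
Review note: the reported F1 can be cross-checked against the reported precision and recall. The sketch below assumes F1 is the usual harmonic mean of those two figures; the card does not say how the metrics were computed (for example token-level or entity-level), so this only checks that the three numbers are mutually consistent, up to rounding.

```python
# Metrics copied from the "Performance" list in the model card.
precision = 0.8881
recall = 0.9014
reported_f1 = 0.8947


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


computed_f1 = f1_score(precision, recall)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The computed value agrees with the reported one to four decimal places, so the moved metrics block is internally consistent.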
 ### Training hyperparameters
 - Transformers 4.36.2
 - Pytorch 2.0.1
 - Datasets 2.16.1
+- Tokenizers 0.15.0
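
Review note: the updated card describes the tag set but gives no usage snippet. A minimal sketch of how the model might be called with the Transformers `pipeline` API follows. The repository id is an assumption read off the page header, the helper names (`describe_tag`, `build_ner_pipeline`, `tag_text`) are hypothetical, and the handling of `B-`/`I-` prefixes is a guess about the checkpoint's label map, since the card lists only the bare tags.

```python
# Tag set from the "Intended uses & limitations" table in the card.
TAG_DESCRIPTIONS = {
    "O": "Outside of a named entity",
    "MISC": "Miscellaneous entity",
    "PER": "Person's name",
    "ORG": "Organization",
    "LOC": "Location",
}

# Assumption: repository id taken from the page header (owner/model name).
MODEL_ID = "Alizee/xlm-roberta-large-finetuned-wikiner-fr"


def describe_tag(label: str) -> str:
    """Map a predicted label to its description.

    Strips an optional B-/I- prefix in case the label map uses BIO-style
    names (an assumption, not stated in the card).
    """
    bare = label[2:] if label[:2] in ("B-", "I-") else label
    return TAG_DESCRIPTIONS.get(bare, "Unknown tag")


def build_ner_pipeline():
    """Create a token-classification pipeline (downloads the model weights)."""
    from transformers import pipeline  # lazy import: heavy dependency

    # aggregation_strategy="simple" merges sub-word tokens into entity spans.
    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",
    )


def tag_text(ner, text: str):
    """Run NER and return (entity text, tag, tag description) triples."""
    return [
        (ent["word"], ent["entity_group"], describe_tag(ent["entity_group"]))
        for ent in ner(text)
    ]
```

A call would look like `tag_text(build_ner_pipeline(), "Emmanuel Macron est né à Amiens.")`. The first call downloads the checkpoint, so it needs network access and a few gigabytes of disk space.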