---
tags:
- token-classification
datasets:
- djagatiya/ner-ontonotes-v5-eng-v4
widget:
- text: "On September 1st George won 1 dollar while watching Game of Thrones."
---

# (NER) roberta-base : conll2012_ontonotesv5-english-v4

This `roberta-base` NER model was fine-tuned on the `english-v4` version of the `conll2012_ontonotesv5` dataset.

See the [NER-System repository](https://github.com/djagatiya/NER-System) for more information.

## Dataset

- conll2012_ontonotesv5
- Language : English
- Version : v4

| Dataset | Examples |
| --- | --- |
| Training | 75187 |
| Testing | 9479 |

## Evaluation

- Precision: 88.88
- Recall: 90.69
- F1-Score: 89.78

> See [eval.log](eval.log) for the full evaluation metrics and classification report.

```
              precision    recall  f1-score   support

    CARDINAL       0.84      0.85      0.85       935
        DATE       0.85      0.90      0.87      1602
       EVENT       0.67      0.76      0.71        63
         FAC       0.74      0.72      0.73       135
         GPE       0.97      0.96      0.96      2240
    LANGUAGE       0.83      0.68      0.75        22
         LAW       0.66      0.62      0.64        40
         LOC       0.74      0.80      0.77       179
       MONEY       0.85      0.89      0.87       314
        NORP       0.93      0.96      0.95       841
     ORDINAL       0.81      0.89      0.85       195
         ORG       0.90      0.91      0.91      1795
     PERCENT       0.90      0.92      0.91       349
      PERSON       0.95      0.95      0.95      1988
     PRODUCT       0.74      0.83      0.78        76
    QUANTITY       0.76      0.80      0.78       105
        TIME       0.62      0.67      0.65       212
 WORK_OF_ART       0.58      0.69      0.63       166

   micro avg       0.89      0.91      0.90     11257
   macro avg       0.80      0.82      0.81     11257
weighted avg       0.89      0.91      0.90     11257
```
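
As a quick sanity check, the headline F1-score should be the harmonic mean of the precision and recall reported above. The snippet below is only an illustrative recomputation from those two rounded figures, not the evaluation script used to produce `eval.log`.

```python
# Headline metrics copied from the Evaluation list (percentages).
precision = 88.88
recall = 90.69

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1-Score: {f1:.2f}")  # prints "F1-Score: 89.78"
```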

## Usage

```
from transformers import pipeline

# aggregation_strategy='simple' groups sub-word tokens into whole entity spans
ner_pipeline = pipeline(
    'token-classification',
    model='djagatiya/ner-roberta-base-ontonotesv5-englishv4',
    aggregation_strategy='simple'
)
```

TEST 1

```
ner_pipeline("India is a beautiful country")
```

```
# Output
[{'entity_group': 'GPE',
  'score': 0.99186057,
  'word': ' India',
  'start': 0,
  'end': 5}]
```

TEST 2

```
ner_pipeline("On September 1st George won 1 dollar while watching Game of Thrones.")
```

```
# Output
[{'entity_group': 'DATE',
  'score': 0.99720246,
  'word': ' September 1st',
  'start': 3,
  'end': 16},
 {'entity_group': 'PERSON',
  'score': 0.99071586,
  'word': ' George',
  'start': 17,
  'end': 23},
 {'entity_group': 'MONEY',
  'score': 0.9872978,
  'word': ' 1 dollar',
  'start': 28,
  'end': 36},
 {'entity_group': 'WORK_OF_ART',
  'score': 0.9946732,
  'word': ' Game of Thrones',
  'start': 52,
  'end': 67}]
```
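
The `word` values in the outputs above carry a leading space, most likely a side effect of RoBERTa's tokenizer. If clean surface forms are needed, one option is to slice the input text with the `start` and `end` character offsets instead. The sketch below assumes the TEST 2 output shown above (scores omitted); `extract_spans` is a hypothetical helper written for this example, not part of `transformers`.

```python
text = "On September 1st George won 1 dollar while watching Game of Thrones."

# Copied from the TEST 2 output above; scores are omitted for brevity.
entities = [
    {'entity_group': 'DATE', 'word': ' September 1st', 'start': 3, 'end': 16},
    {'entity_group': 'PERSON', 'word': ' George', 'start': 17, 'end': 23},
    {'entity_group': 'MONEY', 'word': ' 1 dollar', 'start': 28, 'end': 36},
    {'entity_group': 'WORK_OF_ART', 'word': ' Game of Thrones', 'start': 52, 'end': 67},
]


def extract_spans(text, entities):
    """Hypothetical helper: return (label, surface text) pairs using character offsets."""
    return [(e['entity_group'], text[e['start']:e['end']]) for e in entities]


spans = extract_spans(text, entities)
for label, span in spans:
    print(f"{label}: {span}")
```

With a live pipeline, the same helper could be applied to the list returned by `ner_pipeline(text)`, since each item carries the same `entity_group`, `start`, and `end` keys.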