File size: 2,558 Bytes
cc99a14
 
 
 
 
 
 
 
 
 
3450d25
cc99a14
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
c2fe14b
cc99a14
c2fe14b
 
cc99a14
3450d25
c2fe14b
 
cc99a14
c2fe14b
 
 
 
 
 
 
 
 
cc99a14
 
 
 
3450d25
 
 
cc99a14
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
---
tags:
- flair
- token-classification
- sequence-tagger-model
language: fa
dataset:
- NSURL-2019
widget:
- text: >-
    اولین نمایش این فیلم‌ها روز دوشنبه 13 اردیبهشت و از ساعت 21 در موزه سینماست.
metrics:
- f1
---

## Persian NER Using Flair

This is the 7-class Named-entity recognition model for Persian that ships with [Flair](https://github.com/flairNLP/flair/).

F1-Score: **90.33** (NSURL-2019)

Predicts NER tags:

| **tag**                        | **meaning** |
|:---------------------------------:|:-----------:|
| PER         | person name | 
| LOC         | location name | 
| ORG         | organization name | 
| DAT         | date |
| TIM         | time |
| PCT         | percent |
| MON         | Money|

Based on [Flair embeddings](https://www.aclweb.org/anthology/C18-1139/) and Pars-Bert.

---

### Demo: How to use in Flair

Requires: **[Flair](https://github.com/flairNLP/flair/)** (`pip install flair`)

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("PooryaPiroozfar/Flair-Persian-NER")

# make example sentence
sentence = Sentence("اولین نمایش این فیلم‌ها روز دوشنبه 13 اردیبهشت و از ساعت 21 در موزه سینماست.")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```

This yields the following output:
```
Span[4:8]: "روز دوشنبه 13 اردیبهشت" → DAT (1.0)
Span[10:12]: "ساعت 21" → TIM (1.0)
Span[13:15]: "موزه سینماست" → LOC (0.9999)

```

---

### Results
- F-score (micro) 0.9033
- F-score (macro) 0.8976
- Accuracy 0.8277

```
By class:
              precision    recall  f1-score   support

         ORG     0.9016    0.8667    0.8838      1523
         LOC     0.9113    0.9305    0.9208      1425
         PER     0.9216    0.9322    0.9269      1224
         DAT     0.8623    0.7958    0.8277       480
         MON     0.9665    0.9558    0.9611       181
         PCT     0.9375    0.9740    0.9554        77
         TIM     0.8235    0.7925    0.8077        53

   micro avg     0.9081    0.8984    0.9033      4963
   macro avg     0.9035    0.8925    0.8976      4963
weighted avg     0.9076    0.8984    0.9028      4963
 samples avg     0.8277    0.8277    0.8277      4963
              
```