# TCMNER

# Model description

TCMNER is a fine-tuned BERT model that is ready to use for Named Entity Recognition (NER) on Traditional Chinese medicine text and achieves state-of-the-art performance on this NER task. It has been trained to recognize six types of entities: prescription (方剂), herb (本草), source (来源), disease (病名), symptom (症状) and syndrome (证型).

Specifically, this model is built on TCMRoBERTa, a RoBERTa model fine-tuned for Traditional Chinese medicine, and was then fine-tuned on the Chinese version of the Haiwei AI Lab's Named Entity Recognition dataset.

**TCMRoBERTa is currently a closed-source model used only within my own company; I will open-source it in the future.**


# How to use

You can use this model with the Transformers `pipeline` for NER.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("Monor/TCMNER")
model = AutoModelForTokenClassification.from_pretrained("Monor/TCMNER")

nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "化滞汤,出处:《证治汇补》卷八。。组成:青皮20g,陈皮20g,厚朴20g,枳实20g,黄芩20g,黄连20g,当归20g,芍药20g,木香5g,槟榔8g,滑石3g,甘草4g。。主治:下痢因于食积气滞者。"

ner_results = nlp(example)
print(ner_results)
```
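The pipeline returns one prediction per token, so the output usually needs to be merged into whole entities. The following is a minimal sketch of one way to do that, assuming each prediction is a dict with `entity`, `start` and `end` keys (the shape the token-classification pipeline produces with a fast tokenizer) and that tags carry a `B-` or `I-` prefix. The helper name `merge_entities` and the sample predictions are hypothetical and hand-written for illustration; they are not real model output.

```python
def merge_entities(text, ner_results):
    """Merge token-level B-/I- predictions into (type, surface, start, end) spans."""
    spans = []
    current = None  # [entity_type, start, end] of the span being built

    for tok in ner_results:
        prefix, _, etype = tok["entity"].partition("-")
        if prefix not in ("B", "I"):
            # An "O" (or unknown) tag closes any open span.
            if current is not None:
                spans.append(current)
                current = None
            continue

        # An I- token extends the open span only if it is adjacent
        # in the text and has the same entity type.
        extends = (
            prefix == "I"
            and current is not None
            and etype == current[0]
            and tok["start"] == current[2]
        )
        if extends:
            current[2] = tok["end"]
        else:
            if current is not None:
                spans.append(current)
            current = [etype, tok["start"], tok["end"]]

    if current is not None:
        spans.append(current)

    # Slice the original text so the surface form is exact.
    return [(etype, text[start:end], start, end) for etype, start, end in spans]


# Hand-written sample predictions (hypothetical, not produced by the model).
sample_text = "化滞汤,出处:《证治汇补》"
sample_tokens = [
    {"entity": "B-方剂", "start": 0, "end": 1},
    {"entity": "I-方剂", "start": 1, "end": 2},
    {"entity": "I-方剂", "start": 2, "end": 3},
    {"entity": "B-来源", "start": 8, "end": 9},
    {"entity": "I-来源", "start": 9, "end": 10},
    {"entity": "I-来源", "start": 10, "end": 11},
    {"entity": "I-来源", "start": 11, "end": 12},
]

merged = merge_entities(sample_text, sample_tokens)
for etype, surface, start, end in merged:
    print(etype, surface, start, end)
```

With real predictions, the call would be `merge_entities(example, ner_results)`. The pipeline also accepts an `aggregation_strategy` argument that performs a similar grouping internally; the manual version above is only meant to make the logic visible.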


# Training data

This model was fine-tuned on my own dataset, which uses the following tags:

| Abbreviation | Description |
|---|---|
| O | Outside of a named entity |
| B-方剂 | Beginning of a prescription entity right after another prescription entity |
| I-方剂 | Prescription entity |
| B-本草 | Beginning of a herb entity right after another herb entity |
| I-本草 | Herb entity |
| B-来源 | Beginning of a prescription source right after another prescription source |
| I-来源 | Source entity |
| B-病名 | Beginning of a disease name right after another disease name |
| I-病名 | Disease name |
| B-症状 | Beginning of a symptom right after another symptom |
| I-症状 | Symptom |
| B-证型 | Beginning of a syndrome right after another syndrome |
| I-证型 | Syndrome |
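
To make the tag scheme concrete, here is a small sketch of how span annotations could be turned into a tag sequence. It rests on two assumptions that this card does not confirm: that there is one tag per character, and that every entity starts with a `B-` tag (the common BIO convention), whereas the table's wording suggests `B-` may only be used when an entity directly follows another of the same type. The function name `spans_to_tags` and the sample annotation are hypothetical.

```python
def spans_to_tags(text, spans):
    """Return one tag per character; spans are (start, end, type), end exclusive."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"      # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"      # remaining characters of the entity
    return tags


# Hand-written annotation for illustration: two herb (本草) entities.
herb_text = "青皮20g,陈皮20g"
herb_spans = [(0, 2, "本草"), (6, 8, "本草")]

herb_tags = spans_to_tags(herb_text, herb_spans)
for char, tag in zip(herb_text, herb_tags):
    print(char, tag)
```

The model's tokenizer may group digits or Latin letters into a single token, so a real preprocessing step would need to align tags to tokens rather than to characters.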

# Eval results

![Evaluation results](images/iShot_2024-06-07_18.03.00.png "Evaluation results")


# Notices

1. The model is free for commercial use.
2. I am not going to write a paper about this model. If you use any details of this model in your paper, please mention it. Thanks.