SajjadAyoubi committed
Commit 8e071d1
1 Parent(s): 80b9388
Create README.md

README.md (added)
### How to use

#### Requirements

The examples below require the `transformers` and `sentencepiece` packages, both of which can be installed with `pip`:

```sh
pip install transformers sentencepiece
```
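
As a quick optional sanity check, you can confirm that both packages import correctly and print their installed versions:

```python
# Optional sanity check (not required): both packages should import without errors.
import sentencepiece
import transformers

print("transformers:", transformers.__version__)
print("sentencepiece:", sentencepiece.__version__)
```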

#### Pipelines 🚀

If you are not familiar with Transformers, you can use a pipeline instead.

Note that pipelines cannot return _no answer_ for a question.

```python
from transformers import pipeline

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
qa_pipeline = pipeline("question-answering", model=model_name, tokenizer=model_name)

# Context: "Hi, I'm Sajjad Ayoubi, I'm 20 years old, and I'm interested in natural language processing."
text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
# Questions: "What's my name?", "How old am I?", "What am I interested in?"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

for question in questions:
    print(qa_pipeline({"context": text, "question": question}))

>>> {'score': 0.4839823544025421, 'start': 8, 'end': 18, 'answer': 'سجاد ایوبی'}
>>> {'score': 0.3747948706150055, 'start': 24, 'end': 32, 'answer': '۲۰ سالمه'}
>>> {'score': 0.5945395827293396, 'start': 38, 'end': 55, 'answer': 'پردازش زبان طبیعی'}
```
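
The question-answering pipeline also accepts lists of questions and contexts, so the loop above can be written as a single batched call; here is a small sketch reusing the `qa_pipeline`, `text`, and `questions` defined above:

```python
# Batched call: pass parallel lists of questions and contexts.
# The pipeline returns a list of result dicts, one per question.
results = qa_pipeline(question=questions, context=[text] * len(questions))
for result in results:
    print(result["answer"], result["score"])
```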

#### Manual approach 🔥

With the manual approach, it is possible to get _no answer_ for a question, with even better performance (a minimal sketch that skips the repository helpers is shown after the two framework examples below).

- PyTorch

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from src.utils import AnswerPredictor

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

# This class comes from src/utils.py in the repository; you can read more about it there.
predictor = AnswerPredictor(model, tokenizer, device="cpu", n_best=10)
preds = predictor(questions, [text] * 3, batch_size=3)

for k, v in preds.items():
    print(v)
```

This produces output such as:

```text
100%|██████████| 1/1 [00:00<00:00, 3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
```

- TensorFlow 2.X

```python
from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering
from src.utils import TFAnswerPredictor

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

# This class comes from src/utils.py in the repository; you can read more about it there.
predictor = TFAnswerPredictor(model, tokenizer, n_best=10)
preds = predictor(questions, [text] * 3, batch_size=3)

for k, v in preds.items():
    print(v)
```

This produces output such as:

```text
100%|██████████| 1/1 [00:00<00:00, 3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
```
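
If you prefer not to depend on `src/utils.py` from the repository, roughly the same result can be obtained directly from the model's start and end logits. The sketch below is a simplified PyTorch illustration, not the repository's `AnswerPredictor`; treating a predicted span that collapses onto the `[CLS]` token as _no answer_ is an assumption borrowed from the usual SQuAD2-style convention:

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
question = "اسمم چیه؟"

inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely start and end token positions.
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())

# A span collapsing onto position 0 ([CLS]) is treated here as "no answer"
# (an assumed convention; not necessarily what AnswerPredictor does internally).
if start == 0 and end == 0:
    print("no answer")
else:
    answer_ids = inputs["input_ids"][0][start : end + 1]
    print(tokenizer.decode(answer_ids, skip_special_tokens=True))
```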

You can also access the whole demonstration in the [HowToUse iPython Notebook on Google Colab](https://colab.research.google.com/github/sajjjadayobi/PersianQA/blob/main/notebooks/HowToUse.ipynb).