---
language:
- ms
datasets:
- squad_v2
metrics:
- exact_match
- f1
---

# Overview
This model is an experiment that a friend and I carried out during a research internship at the National University of Singapore (NUS). We fine-tuned the model on our own Malay-language datasets in the finance and healthcare domains.

# Details
- Fine-tuned from the [squad-ms-bert-base](https://huggingface.co/zhufy/squad-ms-bert-base) base model by zhufy
- Base dataset: [SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/)
- Fine-tuning data: our [datasets](https://ids.nus.edu.sg/microsites/nzsg-nlp/datahub.html) in the finance and healthcare domains

# Fine-tuning Details
```py
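from transformers import TrainingArguments

# Hyperparameters used for fine-tuning; anything not listed keeps its default value.
training_args = TrainingArguments(
    output_dir='test_trainer',    # directory where checkpoints are written
    evaluation_strategy='epoch',  # evaluate once per epoch (newer transformers releases call this eval_strategy)
    num_train_epochs=20,
    optim='adamw_torch',          # PyTorch implementation of AdamW
    report_to='all',              # log to every installed reporting integration
    logging_steps=1,              # log at every training step
)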
```
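
The metadata lists `exact_match` and `f1` as metrics, and the arguments above run evaluation once per epoch. The block below is a minimal sketch of how these two scores are commonly computed for extractive QA. It is not our evaluation code. The helper names (`normalize`, `exact_match`, `f1_score`) are illustrative, and the normalization is an assumption: the official SQuAD script also strips English articles, which is left out here because the model targets Malay.

```py
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # For unanswerable questions both sides are empty, which counts as a match.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the usual convention is to take the maximum score over the references and then average over all questions.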

# How to use the Model
```py
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "primasr/malaybert-for-eqa-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
nlp = pipeline("question-answering", model=model, tokenizer=tokenizer)
```
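
Once the pipeline is loaded, it can be called with a question and a context passage. The helper below is a small sketch of one way to do that. The function name `answer_question` and the `min_score` threshold of 0.1 are assumptions for illustration, not tuned values. Because the base data is SQuAD2.0, which contains unanswerable questions, the sketch passes `handle_impossible_answer=True` so the pipeline may return an empty answer.

```py
def answer_question(qa_pipeline, question, context, min_score=0.1):
    """Return the answer span found in `context`, or None if no confident answer exists.

    `qa_pipeline` is a question-answering pipeline such as `nlp` above.
    `min_score` is an illustrative cut-off, not a tuned value.
    """
    prediction = qa_pipeline(
        question=question,
        context=context,
        handle_impossible_answer=True,  # allow an empty "no answer" prediction
    )
    answer = prediction["answer"].strip()
    # Treat an empty span or a low-confidence span as "no answer".
    if not answer or prediction["score"] < min_score:
        return None
    return answer
```

For example, `answer_question(nlp, "Apakah ibu negara Malaysia?", "Kuala Lumpur ialah ibu negara Malaysia.")` returns either the extracted span or `None`, depending on the model's confidence.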