---
language:
- ms
datasets:
- squad_v2
metrics:
- exact_match
- f1
---

# Overview

This model is an experiment that a friend and I carried out during a research internship at the National University of Singapore (NUS). We fine-tuned the model on our own datasets in the finance and healthcare domains, in the Malay language.

# Details

- Fine-tuned from the base model by [zhufy](https://huggingface.co/zhufy/squad-ms-bert-base)
- Base dataset: [SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/)
- Our [datasets](https://ids.nus.edu.sg/microsites/nzsg-nlp/datahub.html) in the finance and healthcare domains
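
The metadata for this card lists `exact_match` and `f1` as metrics. As a rough illustration of what those two numbers measure for extractive question answering, here is a minimal sketch in plain Python. It is a simplified take on the SQuAD-style scoring, not the official evaluation script: the function names are our own, and the normalization only lowercases, strips ASCII punctuation, and collapses whitespace (the English article removal in the official script is left out on the assumption that it does not matter for Malay).

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # SQuAD 2.0 convention: an empty answer only matches an empty reference.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Bank Negara Malaysia.", "bank negara malaysia"))  # 1.0
print(round(f1_score("26 Januari 1959", "pada 26 Januari 1959"), 3))  # 0.857
```

The example strings are made up for illustration and do not come from our datasets.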
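
# Fine-tuning Details

```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='test_trainer',    # checkpoints and logs are written here
    evaluation_strategy='epoch',  # evaluate after every epoch (newer transformers releases name this `eval_strategy`)
    num_train_epochs=20,
    optim='adamw_torch',          # PyTorch implementation of AdamW
    report_to='all',              # report to every installed logging integration
    logging_steps=1,              # log at every optimizer step
)
```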
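
To get a feel for what these settings imply, the sketch below estimates how many optimizer steps (and therefore log entries, since `logging_steps=1`) a run would produce. It assumes the `TrainingArguments` default of 8 examples per device, a single device, and no gradient accumulation, none of which is set explicitly above. The helper name and the dataset size are hypothetical and only there for illustration; for question answering, the relevant count is the number of tokenized features, which can exceed the number of questions when long contexts are split.

```python
import math


def estimate_training_steps(num_examples, num_train_epochs=20,
                            per_device_batch_size=8, num_devices=1):
    """Rough optimizer-step count, assuming no gradient accumulation."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    return steps_per_epoch * num_train_epochs


# Hypothetical dataset size, not the size of our actual datasets.
print(estimate_training_steps(num_examples=1000))  # 2500
```

With `evaluation_strategy='epoch'`, the same run would also trigger 20 evaluation passes, one per epoch.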

# How to Use the Model

```py
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "primasr/malaybert-for-eqa-finetuned"

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# Wrap both in an extractive question-answering pipeline
nlp = pipeline("question-answering", model=model, tokenizer=tokenizer)
```
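
Once the pipeline is built, it can be called with a question and a context passage. The snippet below is a minimal sketch of one way to do that. The `ask` and `demo` helpers and the Malay example text are our own illustrations, not part of the released code or datasets. Passing `handle_impossible_answer=True` lets the pipeline return an empty answer when it finds none in the context; whether that behaves well for this particular model is an assumption worth checking on your own data.

```python
def ask(qa, question, context, min_score=0.0):
    """Return the extracted answer, or None if the pipeline finds no answer.

    `qa` is any callable with the question-answering pipeline interface,
    i.e. it accepts `question` and `context` and returns a dict that has
    at least the keys "answer" and "score".
    """
    result = qa(question=question, context=context, handle_impossible_answer=True)
    answer = result["answer"].strip()
    if not answer or result["score"] < min_score:
        return None
    return answer


def demo():
    """Run one illustrative query against the real model (downloads weights)."""
    from transformers import pipeline

    qa = pipeline("question-answering", model="primasr/malaybert-for-eqa-finetuned")
    # Illustrative Malay passage and question, written for this example.
    context = (
        "Bank Negara Malaysia ialah bank pusat Malaysia. "
        "Ia ditubuhkan pada 26 Januari 1959."
    )
    question = "Bilakah Bank Negara Malaysia ditubuhkan?"
    print(ask(qa, question, context))
```

Calling `demo()` loads the model and prints the extracted span. If you already have the `nlp` pipeline from the snippet above, `ask(nlp, question, context)` works the same way.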