---
license: apache-2.0
datasets:
- squad_v2
- drop
- mou3az/IT_QA-QG
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- IT purpose
- General purpose
---
### Model Card ###
# Information:
- Base Model: facebook/bart-base
- Fine-tuning: PEFT LoRA
- Datasets: squad_v2, drop, mou3az/IT_QA-QG
- Task: generating questions from a given context and answer
- Language: English
# Performance Metrics on the Evaluation Set:
- Training Loss: 1.1958
- Evaluation Loss: 1.109059
### Loading the model ###
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

HUGGING_FACE_USER_NAME = "mou3az"
model_name = "ITandGeneral_Question-Generation"
peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

# Load the adapter config, the base model it was trained on, and its tokenizer
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map='auto')
QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights to the base model
QG_model = PeftModel.from_pretrained(model, peft_model_id)
```
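If the adapter is no longer needed as a separate module, the LoRA weights can optionally be merged into the base model for slightly faster inference:

```python
# Optional: fold the LoRA weights into the base model and switch to eval mode
QG_model = QG_model.merge_and_unload()
QG_model.eval()
```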
### At inference time ###
```python
def get_question(context, answer):
    # Build the prompt and run it on the same device as the model
    device = next(QG_model.parameters()).device
    input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
    encoding = QG_tokenizer.encode_plus(input_text, padding=True, return_tensors="pt").to(device)
    # Beam-search decoding
    output_tokens = QG_model.generate(**encoding, early_stopping=True, num_beams=5, num_return_sequences=1, no_repeat_ngram_size=2, max_length=100)
    out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()
    return out
```
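For example, with a hypothetical context/answer pair (not taken from the training data):

```python
context = "BART combines a bidirectional encoder with an autoregressive decoder."
answer = "an autoregressive decoder"
print(get_question(context, answer))
# Expected: a question answerable by the given span,
# e.g. "What does BART combine a bidirectional encoder with?"
```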
### Training parameters and hyperparameters ###
The following settings were used during training:
# For LoRA:
- r=18
- alpha=8
# For training arguments:
- gradient_accumulation_steps=24
- per_device_train_batch_size=8
- per_device_eval_batch_size=8
- max_steps=1000
- warmup_steps=50
- weight_decay=0.05
- learning_rate=3e-3
- lr_scheduler_type="linear"
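As a rough sketch, these values map onto a PEFT `LoraConfig` and transformers `Seq2SeqTrainingArguments` roughly as follows (the `target_modules`, `output_dir`, and any argument not listed above are assumptions, not taken from the actual training script):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

base = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# LoRA adapter configuration matching the values listed above
lora_config = LoraConfig(
    r=18,
    lora_alpha=8,
    task_type="SEQ_2_SEQ_LM",
    target_modules=["q_proj", "v_proj"],  # assumed; a common choice for BART attention layers
)
peft_model = get_peft_model(base, lora_config)

# Training arguments matching the values listed above
training_args = Seq2SeqTrainingArguments(
    output_dir="qg-lora",  # assumed output path
    gradient_accumulation_steps=24,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    max_steps=1000,
    warmup_steps=50,
    weight_decay=0.05,
    learning_rate=3e-3,
    lr_scheduler_type="linear",
)
```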
### Training Results ###
| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 0.0 | 4.6426 | 4.704238 |
| 3.0 | 1.5094 | 1.202135 |
| 6.0 | 1.2677 | 1.146177 |
| 9.0 | 1.2613 | 1.112074 |
| 12.0  | 1.1958        | 1.109059        |