---
license: apache-2.0
base_model: facebook/bart-base
datasets:
- squad_v2
- drop
- mou3az/IT_QA-QG
language:
- en
library_name: peft
tags:
- IT purpose
- General purpose
- Text2text Generation
metrics:
- bertscore
- accuracy
- rouge
---
# Model Card
- **Base model:** facebook/bart-base
- **Fine-tuning:** PEFT (LoRA)
- **Datasets:** squad_v2, drop, mou3az/IT_QA-QG
- **Task:** generating questions from a context and an answer
- **Language:** English
# Loading the model
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
HUGGING_FACE_USER_NAME = "mou3az"
model_name = "IT-General_Question-Generation"
peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"
# Load the LoRA adapter config, then the base model and tokenizer it points to
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map='auto')
QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
# Attach the fine-tuned LoRA weights to the base model
QG_model = PeftModel.from_pretrained(model, peft_model_id)
```
# At inference time
```python
def get_question(context, answer):
    # Run on whatever device the model was placed on by device_map
    device = next(QG_model.parameters()).device
    input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
    encoding = QG_tokenizer.encode_plus(input_text, padding=True, return_tensors="pt").to(device)
    output_tokens = QG_model.generate(**encoding, early_stopping=True, num_beams=5,
                                      num_return_sequences=1, no_repeat_ngram_size=2, max_length=100)
    # Strip the "question:" prefix the model sometimes emits
    out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()
    return out
```
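The function above wraps the context and answer in a fixed natural-language prompt. As a minimal sketch that needs no model, the helper below (`build_prompt` is a hypothetical name, not part of this repository) shows the exact string handed to the tokenizer:

```python
# Reproduce the prompt string that get_question() feeds to the tokenizer
def build_prompt(context, answer):
    return f"Given the context '{context}' and the answer '{answer}', what question can be asked?"

prompt = build_prompt("BART is a sequence-to-sequence model.", "BART")
print(prompt)
# → Given the context 'BART is a sequence-to-sequence model.' and the answer 'BART', what question can be asked?
```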
# Training parameters and hyperparameters
The following were used during training:

For LoRA:
- r=18
- alpha=8

For training arguments:
- gradient_accumulation_steps=24
- per_device_train_batch_size=8
- per_device_eval_batch_size=8
- max_steps=1000
- warmup_steps=50
- weight_decay=0.05
- learning_rate=3e-3
- lr_scheduler_type="linear"
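As a sketch only, the LoRA values above could be expressed with peft's `LoraConfig`; the `task_type`, `target_modules`, and dropout are assumptions, since the card does not state them:

```python
from peft import LoraConfig, TaskType

# Hypothetical reconstruction of the LoRA setup from the values listed above.
# task_type and lora_dropout are assumptions not stated in this card.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # BART is a seq2seq model
    r=18,          # rank of the LoRA update matrices
    lora_alpha=8,  # scaling factor applied to the update
)
```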
# Training Results
| Epoch | Optimization Step | Training Loss | Validation Loss |
|-------|-------------------|---------------|-----------------|
| 0.0 | 84 | 4.6426 | 4.704238 |
| 3.0 | 252 | 1.5094 | 1.202135 |
| 6.0 | 504 | 1.2677 | 1.146177 |
| 9.0 | 756 | 1.2613 | 1.112074 |
| 12.0 | 1000 | 1.1958 | 1.109059 |
# Performance Metrics on the Evaluation Set
- Training loss: 1.1958
- Evaluation loss: 1.109059
- BERTScore: 0.8123
- ROUGE: 0.532144
- FuzzyWuzzy similarity: 0.74209
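The fuzzy-similarity metric compares a generated question with a reference question as strings. A rough stdlib-only sketch of the same idea is below; note that `difflib`'s ratio is related to, but not identical to, FuzzyWuzzy's Levenshtein-based ratio:

```python
from difflib import SequenceMatcher

def fuzzy_ratio(a, b):
    # 0.0-1.0 similarity, comparable in spirit to fuzzywuzzy's ratio / 100
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

score = fuzzy_ratio("What is BART based on?", "What is BART built on?")
```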