---
license: apache-2.0
base_model: facebook/bart-base
datasets:
- squad_v2
- drop
- mou3az/IT_QA-QG
language:
- en
library_name: peft
tags:
- IT purpose
- General purpose
- Text2text Generation
metrics:
- bertscore
- accuracy
- rouge
---
# Model Card

- Base model: facebook/bart-base
- Fine-tuning method: PEFT LoRA
- Datasets: squad_v2, drop, mou3az/IT_QA-QG
- Task: generating questions from a context and an answer
- Language: English

# Loading the model 


  ```python
  from peft import PeftModel, PeftConfig
  from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

  HUGGING_FACE_USER_NAME = "mou3az"
  model_name = "IT-General_Question-Generation"
  peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

  # Load the adapter configuration, the base model, and its tokenizer
  config = PeftConfig.from_pretrained(peft_model_id)
  model = AutoModelForSeq2SeqLM.from_pretrained(
      config.base_model_name_or_path, return_dict=True, device_map="auto"
  )
  QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

  # Attach the fine-tuned LoRA weights to the base model
  QG_model = PeftModel.from_pretrained(model, peft_model_id)
  ```


# At inference time


  ```python
  def get_question(context, answer):
      """Generate a question for the given context/answer pair."""
      device = next(QG_model.parameters()).device
      input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
      encoding = QG_tokenizer.encode_plus(input_text, padding=True, return_tensors="pt").to(device)

      output_tokens = QG_model.generate(
          **encoding,
          early_stopping=True,
          num_beams=5,
          num_return_sequences=1,
          no_repeat_ngram_size=2,
          max_length=100,
      )
      # Decode and strip any leading "question:" prefix from the output
      out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()

      return out
  ```
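The prompt template that `get_question` feeds to the model can be exercised on its own. A minimal sketch with placeholder context and answer (`build_prompt` is a hypothetical helper, not part of the card's code):

  ```python
  def build_prompt(context, answer):
      # Same template string that get_question passes to the tokenizer
      return f"Given the context '{context}' and the answer '{answer}', what question can be asked?"

  prompt = build_prompt("Paris is the capital of France.", "Paris")
  print(prompt)
  # Given the context 'Paris is the capital of France.' and the answer 'Paris', what question can be asked?
  ```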


# Training parameters and hyperparameters

  The following values were used during training.

  For LoRA:

- r=18
- alpha=8

  For the training arguments:

- gradient_accumulation_steps=24
- per_device_train_batch_size=8
- per_device_eval_batch_size=8
- max_steps=1000
- warmup_steps=50
- weight_decay=0.05
- learning_rate=3e-3
- lr_scheduler_type="linear"

# Training Results


| Epoch | Optimization Step | Training Loss | Validation Loss |
|-------|-------------------|---------------|-----------------|
| 0.0   | 84                | 4.6426        | 4.704238        |
| 3.0   | 252               | 1.5094        | 1.202135        |
| 6.0   | 504               | 1.2677        | 1.146177        |
| 9.0   | 756               | 1.2613        | 1.112074        |
| 12.0  | 1000              | 1.1958        | 1.109059        |


# Performance Metrics on the Evaluation Set

- Training loss: 1.1958
- Evaluation loss: 1.109059
- BERTScore: 0.8123
- ROUGE: 0.532144
- FuzzyWuzzy similarity: 0.74209