---
license: apache-2.0
datasets:
- squad_v2
- drop
- mou3az/IT_QA-QG
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- IT purpose
- General purpose
---
### Model Card ###

    # Information:
    
    Base Model: facebook/bart-base
    
    Fine-tuned: using PEFT-LoRA
    
    Datasets: squad_v2, drop, mou3az/IT_QA-QG
    
    Task: Generating questions from context and answers 
    
    Language: English 
  
  
    # Performance Metrics on Evaluation Set:
  
    Training Loss: 1.1958
    
    Evaluation Loss: 1.109059


### Loading the model ###

  ```python
    from peft import PeftModel, PeftConfig
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Adapter repository on the Hugging Face Hub
    HUGGING_FACE_USER_NAME = "mou3az"
    model_name = "ITandGeneral_Question-Generation"
    peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

    # Load the base model and tokenizer referenced by the adapter config,
    # then attach the LoRA adapter weights on top of the base model
    config = PeftConfig.from_pretrained(peft_model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map='auto')
    QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
    QG_model = PeftModel.from_pretrained(model, peft_model_id)
  ```
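
As an optional step (a sketch, assuming a recent PEFT release), you can switch to evaluation mode and fold the LoRA weights into the base model; `merged_model` below is an illustrative name, and the rest of this card keeps using the unmerged `QG_model`, which works as-is.

  ```python
    # Optional: inference-only setup (sketch; merging is not required)
    QG_model.eval()
    # merge_and_unload() folds the LoRA weights into the base model and
    # returns a plain transformers model
    merged_model = QG_model.merge_and_unload()
  ```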

### At inference time ###

  ```python
    def get_question(context, answer):
        # Use the same device the model was loaded onto
        device = next(QG_model.parameters()).device

        # Prompt format expected by the model
        input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
        encoding = QG_tokenizer(input_text, padding=True, return_tensors="pt").to(device)

        # Beam search with repetition control
        output_tokens = QG_model.generate(**encoding, early_stopping=True, num_beams=5, num_return_sequences=1, no_repeat_ngram_size=2, max_length=100)
        out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()

        return out
  ```
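
A minimal usage sketch; the context and answer below are illustrative, not taken from the training data, and the commented output is only an example of the kind of question the model returns.

  ```python
    # Illustrative call (hypothetical inputs)
    context = "A firewall monitors and filters incoming and outgoing network traffic based on predefined security rules."
    answer = "a firewall"
    print(get_question(context, answer))
    # Example of the kind of output to expect (actual output may differ):
    # "What monitors and filters network traffic based on security rules?"
  ```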

### Training parameters and hyperparameters ###

    The following were used during training:
  
    # For LoRA:
  
    r=18
  
    alpha=8


    # For training arguments:
  
    gradient_accumulation_steps=24
    
    per_device_train_batch_size=8
    
    per_device_eval_batch_size=8
    
    max_steps=1000
    
    warmup_steps=50
    
    weight_decay=0.05
    
    learning_rate=3e-3
    
    lr_scheduler_type="linear"
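
A sketch of how these values could map onto `peft.LoraConfig` and `transformers.TrainingArguments`. The `target_modules`, `output_dir`, and the mapping of `alpha` to `lora_alpha` are assumptions, not taken from the original training script.

  ```python
    from peft import LoraConfig, TaskType
    from transformers import TrainingArguments

    # LoRA configuration (target_modules is assumed for BART attention projections)
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_2_SEQ_LM,
        r=18,
        lora_alpha=8,                         # assuming "alpha" refers to lora_alpha
        target_modules=["q_proj", "v_proj"],  # assumption; not stated in the card
    )

    # Training arguments with the values listed above (output_dir is arbitrary)
    training_args = TrainingArguments(
        output_dir="qg-lora",
        gradient_accumulation_steps=24,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        max_steps=1000,
        warmup_steps=50,
        weight_decay=0.05,
        learning_rate=3e-3,
        lr_scheduler_type="linear",
    )
  ```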

### Training Results ###

    | Epoch | Training Loss | Validation Loss |
    |-------|---------------|-----------------|
    | 0.0   | 4.6426        | 4.704238        |
    | 3.0   | 1.5094        | 1.202135        |
    | 6.0   | 1.2677        | 1.146177        |
    | 9.0   | 1.2613        | 1.112074        |
    | 12.0  | 1.1958        | 1.109059        |