---
license: apache-2.0
datasets:
  - squad_v2
  - drop
  - mou3az/IT_QA-QG
language:
  - en
library_name: transformers
pipeline_tag: text2text-generation
tags:
  - IT purpose
  - General purpose
metrics:
  - bertscore
---

# Model Card

# Information

Base Model: facebook/bart-base

Fine-tuned: using PEFT-LoRA

Datasets: squad_v2, drop, mou3az/IT_QA-QG

Task: Generating questions from context and answers

Language: English

# Performance Metrics on Evaluation Set

Training Loss: 1.1958

Evaluation Loss: 1.109059

# Loading the model

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

HUGGING_FACE_USER_NAME = "mou3az"
model_name = "IT-General_Question-Generation"
peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

# Load the LoRA adapter config to find the base model it was fine-tuned from
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base seq2seq model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(
    config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map="auto"
)
QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the fine-tuned LoRA weights to the base model
QG_model = PeftModel.from_pretrained(model, peft_model_id)
```
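
If the model is used only for generation, it can optionally be switched to evaluation mode with `QG_model.eval()` to disable dropout.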

# At inference time

```python
def get_question(context, answer):
    # Run generation on whichever device the model was placed on
    device = next(QG_model.parameters()).device
    input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
    encoding = QG_tokenizer.encode_plus(input_text, padding=True, return_tensors="pt").to(device)

    # Beam search, capped at 100 tokens, with repeated bigrams disallowed
    output_tokens = QG_model.generate(
        **encoding,
        early_stopping=True,
        num_beams=5,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        max_length=100,
    )
    out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()

    return out
```
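
A quick usage sketch; the context and answer below are made-up examples, not taken from the card:

```python
context = "BART is a sequence-to-sequence model pretrained as a denoising autoencoder."
answer = "a denoising autoencoder"

print(get_question(context, answer))
```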

# Training parameters and hyperparameters

The following were used during training:

# For LoRA:

r=18

alpha=8
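
For reference, a minimal sketch of how such an adapter might be declared with `peft`; the dropout and target modules shown here are assumptions for BART-style attention layers, not values from the card:

```python
from peft import LoraConfig, TaskType

# Sketch only: r and lora_alpha come from the card; the rest are assumed
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=18,
    lora_alpha=8,
    lora_dropout=0.05,                    # assumption, not stated in the card
    target_modules=["q_proj", "v_proj"],  # assumption for BART attention layers
)
```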


# For training arguments:

gradient_accumulation_steps=24

per_device_train_batch_size=8

per_device_eval_batch_size=8

max_steps=1000

warmup_steps=50

weight_decay=0.05

learning_rate=3e-3

lr_scheduler_type="linear"
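
As a hedged sketch, these values could map onto `transformers.TrainingArguments` roughly as follows; the output directory is a placeholder, and the card does not state which trainer was actually used:

```python
from transformers import TrainingArguments

# Sketch only: hyperparameter values are from the card; output_dir is a placeholder
training_args = TrainingArguments(
    output_dir="qg-bart-lora",  # placeholder name
    gradient_accumulation_steps=24,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    max_steps=1000,
    warmup_steps=50,
    weight_decay=0.05,
    learning_rate=3e-3,
    lr_scheduler_type="linear",
)
```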

# Training Results

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 0.0   | 4.6426        | 4.704238        |
| 3.0   | 1.5094        | 1.202135        |
| 6.0   | 1.2677        | 1.146177        |
| 9.0   | 1.2613        | 1.112074        |
| 12.0  | 1.1958        | 1.109059        |