Trainer vs. SFTTrainer? Sharing learning curves
Hi, thanks for the repo. Did you use the Trainer or the SFTTrainer class for this? Also, would you mind sharing the TensorBoard learning curves? I am trying to recreate your model using QLoRA 4-bit, but the model seems to overfit early. One more question: have you encountered any problems during inference?
I have a problem where the model keeps generating additional responses even after producing a valid answer in the first sentence. Here is an example:
```text
Ceremonia otwarcia Letnich Igrzysk Olimpijskich 2024 w Paryżu była kontrowersyjna ze względu na odtworzenie obrazu Leonarda da Vinci Ostatnia Wieczerza przez drag queens. \n\nCeremonia otwarcia Letnich Igrzysk Olimpijskich 2024 w Paryżu była kontrowersyjna ze względu na odtworzenie obrazu Leonarda da Vinci Ostatnia Wieczerza przez drag queens. \n\n[...the same sentence repeats until generation is cut off mid-word: "Ceremonia ot"]
```

(In English: "The opening ceremony of the 2024 Summer Olympic Games in Paris was controversial because of a recreation of Leonardo da Vinci's painting The Last Supper by drag queens." The sentence loops verbatim.)
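For reference, this is the kind of decoding-side mitigation I have been reading about (a sketch only; the model path and all values are placeholders, not settings from this repo):

```python
# Sketch of decoding-side mitigations for looping output.
# model_id and all generation values are placeholders, not settings from this repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qra-7b"  # placeholder path to the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Dlaczego ceremonia otwarcia igrzysk w Paryżu była kontrowersyjna?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,  # stop at EOS instead of running on
    repetition_penalty=1.15,              # down-weight recently generated tokens
    no_repeat_ngram_size=4,               # block exact 4-gram repeats
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

I realize these knobs only mask the symptom if the EOS token was never learned during fine-tuning, though.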
Hello.
I used SFTTrainer with these arguments:
```python
from trl import SFTTrainer

max_seq_length = 4096  # maximum packed sequence length in tokens

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    max_seq_length=max_seq_length,
    packing=True,                      # pack several short samples into each sequence
    dataset_kwargs={
        "add_special_tokens": False,   # the template already provides them
        "append_concat_token": False,  # no extra separator between packed samples
    },
)
```
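args here is a regular transformers.TrainingArguments, roughly along these lines (the values below are illustrative, not the exact ones I used):

```python
# Illustrative sketch only -- not the exact values used for this model.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qra-7b-sft",           # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,                         # assumes an Ampere-class or newer GPU
    logging_steps=10,
    report_to="tensorboard",           # produces the curves attached below
)
```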
I am also attaching some curves from TensorBoard.
I am not sure about the repetition problem. Are you using 4-bit for inference?
Thanks a lot for the learning curves and parameters! I don't use 4-bit for inference, but I am training this SFTTrainer with Unsloth to speed up training; maybe that's the cause. Did you do anything with the special tokens (e.g. `<s>`, `</s>`) during fine-tuning, or did you leave them as they are in the original Qra-7b tokenizer?
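For context, this is how I check what the stock tokenizer ships with (a quick inspection sketch, assuming "Qra-7b" resolves to the base model):

```python
# Inspect the stock tokenizer's special tokens without changing anything.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qra-7b")
print(tok.special_tokens_map)                       # bos/eos/unk/pad mapping
print(tok.bos_token, tok.eos_token, tok.pad_token)  # individual tokens (pad may be None)
```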
My setup follows this article: https://www.philschmid.de/fine-tune-llms-in-2024-with-trl. This is my script for dataset preparation:
```python
from datasets import load_dataset

system_message = """Jesteś przyjaznym chatbotem"""  # "You are a friendly chatbot"

def create_conversation(sample) -> dict:
    # Strip stray surrounding quotes from the raw Alpaca/Dolly fields
    strip_characters = "\"'"
    return {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user",
             "content": f"{sample['instruction'].strip(strip_characters)} "
                        f"{sample['input'].strip(strip_characters)}"},
            {"role": "assistant",
             "content": f"{sample['output'].strip(strip_characters)}"},
        ]
    }

dataset = load_dataset("s3nh/alpaca-dolly-instruction-only-polish", split="train")
dataset = dataset.shuffle(seed=42)
dataset = dataset.map(create_conversation,
                      remove_columns=dataset.features, batched=False)
dataset = dataset.train_test_split(test_size=0.1)
dataset["train"].to_json("train_dataset.json", orient="records")
dataset["test"].to_json("test_dataset.json", orient="records")
```
And this is the tokenizer setup in the training script:
```python
from transformers import AutoTokenizer

model_id = "Qra-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "right"                      # right padding for causal LM training
tokenizer.add_special_tokens({"pad_token": "[PAD]"})  # the base tokenizer has no pad token
```
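One thing I am double-checking on my side: adding a brand-new [PAD] token grows the vocabulary by one, so the model's embedding matrix has to be resized before training, otherwise the new token id is out of range:

```python
# Resize embeddings to cover the newly added [PAD] token
# (assumes `model` is the causal LM loaded from the same model_id).
model.resize_token_embeddings(len(tokenizer))
```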