Has a massive repetition problem
Even in the demo included on Hugging Face, it has a massive repetition problem. If it's left to write out a document or anything of any complexity, it usually tails off into repetition, even with a higher repetition penalty.
"My name is Teven and I am a 17 year old student from the Netherlands. I am currently in my last year of high school and I am planning to study computer science at the university. I have been interested in computers and technology for as long as I can remember. I have always been fascinated by the way computers work and how they can be used to solve problems.
I have been programming since I was 12 years old and I have always enjoyed it. I have used a variety of programming languages, including Python, Java, and C++. I have also used a variety of software development tools, including Eclipse, Visual Studio, and NetBeans.
I have always been interested in learning new things and I am always looking for new challenges. I am always looking for ways to improve my skills and I am always looking for new ways to use my skills. I am always looking for new ways to use my skills to help others.
I am always looking for new ways to use my skills to help others. I am always looking for new ways to use my skills to"
I noticed this too. Increasing the temperature helps
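For anyone tweaking these knobs, here's a rough, self-contained sketch of what temperature and repetition penalty do to the next-token distribution. The penalty rule mirrors the CTRL-style one (divide positive logits, multiply negative ones), but this is an illustration, not the transformers library code, and the logits and token IDs are made up:

```python
import math

def adjust_logits(logits, generated_ids, temperature=1.0, repetition_penalty=1.0):
    # Penalize tokens that were already generated, then scale by temperature.
    # CTRL-style rule: divide positive logits, multiply negative ones,
    # so the logit always moves toward "less likely".
    adjusted = list(logits)
    for tok in set(generated_ids):
        if adjusted[tok] > 0:
            adjusted[tok] /= repetition_penalty
        else:
            adjusted[tok] *= repetition_penalty
    return [x / temperature for x in adjusted]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary of 4 tokens; token 0 has already been generated three times.
logits = [3.0, 2.5, 2.0, 1.0]
probs_plain = softmax(adjust_logits(logits, [0, 0, 0]))
probs_penalized = softmax(
    adjust_logits(logits, [0, 0, 0], temperature=1.2, repetition_penalty=1.3)
)
# The repeated token's probability drops once penalty and temperature kick in,
# which is why raising them can break repetition loops.
```

Higher temperature flattens the whole distribution, so even without the penalty the sampler is less likely to keep picking the same top token over and over.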
Does it keep on generating text until it reaches the max_generated_tokens every time?
I have a problem where generation keeps going until it reaches max_generated_tokens on a Tesla V100S GPU, but a Colab A100 works fine. Any idea what can be done about this?
You are running the exact same script + environment + component versions and you are getting different results with different GPUs?
Ok, I double checked this. The problem occurred with my QLoRA fine-tuned model. The base model works fine on both GPUs.
Yeah, I am also running into the same issue even after increasing the temperature and penalty. Is this the same problem with Mistral-7b-v0.2 or with the Mistral-7b-instruct model?
Hi, remember to set PAD token != EOS during finetuning.
Hi, could you explain more? Thanks!
During fine-tuning the model can 'forget' the EOS token when we set it as the pad token. That's because of label masking: we don't train the model to predict padding, so if pad == EOS, every EOS gets masked out of the loss as well.
I don't know what you use for fine-tuning, but setting this helped me with repetition:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    base_model_id,
    model_max_length=4096,
    padding_side="left",
    add_eos_token=True,  # append EOS at the end of every text
)

# tokenizer.pad_token = tokenizer.eos_token  # don't use this
tokenizer.pad_token = tokenizer.unk_token  # use this instead
model.config.pad_token_id = tokenizer.pad_token_id
My dummy fine-tuning sample:
https://github.com/atadria/llm_calculator/blob/main/mistral_finetune.ipynb
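To make the masking point concrete, here's a toy sketch (with made-up token IDs, not any real tokenizer's) of why pad_token == eos_token hides EOS from the loss, so the model never learns to stop:

```python
IGNORE_INDEX = -100  # labels with this value are skipped by the cross-entropy loss

def build_labels(input_ids, pad_token_id):
    # Standard causal-LM label masking: compute no loss on padding positions.
    return [IGNORE_INDEX if t == pad_token_id else t for t in input_ids]

EOS_ID, UNK_ID = 2, 0  # made-up IDs for illustration

# pad_token == eos_token: padding *is* EOS, so the real EOS gets masked too.
padded_with_eos = [5, 6, 7, EOS_ID, EOS_ID, EOS_ID]
labels_bad = build_labels(padded_with_eos, pad_token_id=EOS_ID)
# -> [5, 6, 7, -100, -100, -100]  (the model is never trained to emit EOS)

# pad_token == unk_token: the real EOS survives in the labels.
padded_with_unk = [5, 6, 7, EOS_ID, UNK_ID, UNK_ID]
labels_good = build_labels(padded_with_unk, pad_token_id=UNK_ID)
# -> [5, 6, 7, 2, -100, -100]  (the model learns to emit EOS after token 7)
```

A model that never sees EOS in its training targets has no reason to produce it at inference, so generation runs until max tokens, often looping, which matches the symptoms described above.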
What value did you set for pad_token_id? Could you explain more on what the pad_token_id does?
I also read this article: https://huggingface.co/NousResearch/Yarn-Mistral-7b-128k/discussions/3, but I am confused about what values to set for eos_token_id and pad_token_id. Has anyone played around with these two parameters and gotten the issue resolved? Please let me know.
Hi @Rmote6603, did you manage to solve this issue? Facing the same problem myself. Thanks!