add_special_tokens=False results in poor generation

#80
by DMaksimov - opened

Hi!

I recently ran an experiment from a model card using a chat template and encountered some issues. In the first attempt, as shown in the attached image, the results were not satisfactory.

image.png

However, when I modified the settings to include the special tokens by setting add_special_tokens=True, the output improved significantly:

image.png

Could you please explain the rationale behind using add_special_tokens=False in this example?

Google org
edited Aug 14

Hi @DMaksimov, if we were preparing inputs for standard tasks like text classification, text generation, or translation using pre-trained models without any customized processing, we would typically set add_special_tokens=True to ensure the input is in the format the model expects.

add_special_tokens=False is likely used here because the author wants to control how the special tokens are handled, either to meet specific model requirements or because they are added elsewhere (for example, by the chat template itself). Please refer to this link for more information. Thank you.
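One common reason, sketched below as a toy illustration (this is not the real transformers API, and the template format and BOS handling are assumptions; check tokenizer.chat_template for your model): Gemma-style chat templates typically already prepend <bos> to the formatted prompt, so tokenizing that string again with add_special_tokens=True would duplicate the <bos> token.

```python
# Toy sketch: why add_special_tokens=False can be needed after a chat template.
# The template string and tokenizer behavior here are simplified stand-ins.

BOS = "<bos>"

def apply_chat_template(messages):
    # Assumption: the chat template already prepends <bos>, as Gemma's does.
    text = BOS
    for m in messages:
        text += f"<start_of_turn>{m['role']}\n{m['content']}<end_of_turn>\n"
    return text

def tokenize(text, add_special_tokens):
    # Stand-in for tokenizer(...): optionally prepends <bos> itself.
    if add_special_tokens:
        text = BOS + text
    return text

prompt = apply_chat_template([{"role": "user", "content": "Hi!"}])

with_specials = tokenize(prompt, add_special_tokens=True)
without_specials = tokenize(prompt, add_special_tokens=False)

print(with_specials.count(BOS))     # 2 -> <bos> duplicated
print(without_specials.count(BOS))  # 1 -> exactly one <bos>
```

So whether add_special_tokens=False is correct depends on whether the prompt string already carries the special tokens; if you tokenize a raw, untemplated string, you still want add_special_tokens=True.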


Based on this testing, can we conclude that the model is sensitive to the <bos> token?

While evaluating the Gemma-2 model in the evaluation harness library, I also saw a warning in the output saying that omitting the <bos> token can significantly impact the performance of Gemma models.
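Given that sensitivity, a simple sanity check before generation is to count <bos> occurrences in the tokenized prompt and confirm there is exactly one. The helper and token ids below are hypothetical (with transformers you would inspect tokenizer(prompt)["input_ids"] and use tokenizer.bos_token_id):

```python
# Hypothetical sanity check: ensure exactly one <bos> in the input ids.
# The bos_id value and the example id sequences are made up for illustration.

def count_bos(token_ids, bos_id=2):
    """Count how many times the BOS id appears in a token id sequence."""
    return sum(1 for t in token_ids if t == bos_id)

ids_ok = [2, 106, 1645, 108]         # one <bos> at the start
ids_double = [2, 2, 106, 1645, 108]  # duplicated <bos>
ids_missing = [106, 1645, 108]       # no <bos> at all

print(count_bos(ids_ok))       # 1 -> as expected
print(count_bos(ids_double))   # 2 -> template + add_special_tokens both added it
print(count_bos(ids_missing))  # 0 -> likely to hurt Gemma's output quality
```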
