Make prompt fully compliant with spec
cc @ysharma , @ArthurZ , @osanseviero
Why is this prompt slightly different from this one: https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat/blob/5b351de4c5dc896f73ccf93d5fc1450787d48298/model.py#L26 ? Which is the correct one? Notice the additional `<s>` at the beginning.
Hi @federicomagnolfi-artificialy ! Great question!

`<s>` is the "beginning of sequence" token, a.k.a. `bos`. It is usually added by the tokenizer, along with the end token (`eos`, or `</s>`). However, since the location of the `bos` and `eos` tokens in the system prompt has to be handled with care, we decided to make it fully explicit in that example and in the blog post. Note that in the code you posted, the tokenizer is invoked with `add_special_tokens=False`, meaning that we are handling the special tokens ourselves and the tokenizer should not do anything about them.
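The explicit handling described here can be sketched as follows. This is a minimal sketch of the Llama 2 chat prompt format, assuming the single-system-prompt layout from the blog post; the function and variable names are illustrative, not part of any library:

```python
# Minimal sketch of explicit special-token handling for the Llama 2 chat
# format. bos ("<s>") and eos ("</s>") are written into the prompt string
# by hand, so a local tokenizer would be called with
# add_special_tokens=False to avoid inserting a second bos.
BOS, EOS = "<s>", "</s>"

def build_prompt(system_prompt: str, history: list, user_message: str) -> str:
    """history is a list of (user, assistant) pairs; names are illustrative."""
    text = f"{BOS}[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    for user, assistant in history:
        # Each completed turn is closed with eos, and a new bos opens the next.
        text += f"{user} [/INST] {assistant} {EOS}{BOS}[INST] "
    text += f"{user_message} [/INST]"
    return text
```

The key point is that `bos`/`eos` appear once per turn in specific positions, which is why letting the tokenizer add them automatically would produce the wrong sequence.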
In this example, the tokenizer lives in the server (we are using a text-generation-inference endpoint), so we can't use `add_special_tokens=False`. We removed the initial `<s>` because it will be added in the server.
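For that server-side case, the same prompt would simply drop the leading `<s>` and let the endpoint's tokenizer prepend it. A hypothetical sketch (the function name is illustrative):

```python
# Sketch: same Llama 2 chat prompt, but without the leading "<s>", for the
# case where the tokenizer lives on the server (e.g. a
# text-generation-inference endpoint) and adds the bos token itself.
def build_server_prompt(system_prompt: str, user_message: str) -> str:
    # No "<s>" here: the server-side tokenizer prepends bos on its own.
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
```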
This can be quite confusing; the important thing is to pay close attention, as you did here :)
Hi @pcuenq , I didn't notice the `add_special_tokens` difference, thank you for the clarification!