Make prompt fully compliant with spec
cc @ysharma , @ArthurZ , @osanseviero
Why is this prompt slightly different from this one: https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat/blob/5b351de4c5dc896f73ccf93d5fc1450787d48298/model.py#L26 ? Which is the correct one? Notice the additional `<s>` at the beginning.
Hi @federicomagnolfi-artificialy ! Great question!

`<s>` is the "beginning of sequence" token, a.k.a. `bos`. It is usually added by the tokenizer, along with the end token (`eos`, or `</s>`). However, since the location of the `bos` and `eos` tokens in the system prompt has to be handled with care, we decided to make it fully explicit in that example and in the blog post. Note that in the code you posted, the tokenizer is invoked with `add_special_tokens=False`, meaning that we are handling the special tokens ourselves and the tokenizer should not do anything about them.
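The explicit handling described here can be sketched as follows. This is a minimal sketch of the Llama 2 chat prompt format, assuming the single-system-prompt layout from the blog post; the function and variable names are illustrative, not part of any library:

```python
# Minimal sketch of explicit special-token handling for the Llama 2 chat
# format. bos ("<s>") and eos ("</s>") are written into the prompt string
# by hand, so a local tokenizer would be called with
# add_special_tokens=False to avoid inserting a second bos.
BOS, EOS = "<s>", "</s>"

def build_prompt(system_prompt: str, history: list, user_message: str) -> str:
    """history is a list of (user, assistant) pairs; names are illustrative."""
    text = f"{BOS}[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    for user, assistant in history:
        # Each completed turn is closed with eos, and a new bos opens the next.
        text += f"{user} [/INST] {assistant} {EOS}{BOS}[INST] "
    text += f"{user_message} [/INST]"
    return text
```

The key point is that `bos`/`eos` appear once per turn in specific positions, which is why letting the tokenizer add them automatically would produce the wrong sequence.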
In this example, the tokenizer lives in the server (we are using a text-generation-inference endpoint), so we can't use `add_special_tokens=False`. We removed the initial `<s>` because it will be added in the server.
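For that server-side case, the same prompt would simply drop the leading `<s>` and let the endpoint's tokenizer prepend it. A hypothetical sketch (the function name is illustrative):

```python
# Sketch: same Llama 2 chat prompt, but without the leading "<s>", for the
# case where the tokenizer lives on the server (e.g. a
# text-generation-inference endpoint) and adds the bos token itself.
def build_server_prompt(system_prompt: str, user_message: str) -> str:
    # No "<s>" here: the server-side tokenizer prepends bos on its own.
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
```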
This can be quite confusing; the important thing is to pay close attention, as you did here :)
Hi @pcuenq , I didn't notice the `add_special_tokens` difference, thank you for the clarification!