Please help me understand the conversation template format
Hi. There is a lot of confusing information floating around about prompt formats and Llama 2 (from a beginner's perspective, at least).
From what I understand, Llama 2 was trained with a specific conversation "flow" format, utilising <<SYS>>/[INST] tags, etc.
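To make sure we're talking about the same thing, here is a minimal sketch of that format as I understand it (the exact BOS/EOS token placement may differ between implementations, so treat this as an illustration, not a reference):

```python
def llama2_chat_prompt(system: str, turns: list[tuple[str, str]], user: str) -> str:
    """Build a Llama-2-chat style prompt.

    `turns` is a list of (user_message, model_reply) pairs from earlier
    in the conversation; `user` is the new message to respond to.
    A sketch only; exact <s>/</s> handling may vary between backends.
    """
    # The system prompt is folded into the first [INST] block.
    prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    for i, (u, a) in enumerate(turns):
        if i == 0:
            prompt += f"{u} [/INST] {a} </s>"
        else:
            prompt += f"<s>[INST] {u} [/INST] {a} </s>"
    # The final user message is left open for the model to complete.
    if turns:
        prompt += f"<s>[INST] {user} [/INST]"
    else:
        prompt += f"{user} [/INST]"
    return prompt
```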
The description of this model suggests a different "template".
Does this mean that this particular model overrides the training of the base Llama 2 model, and those <<SYS>> etc. tags are not required?
Or does it mean that both "templates" will work: the Llama 2 template will "trigger" responses coming from the base model, while the "### Human:" style prompts suggested here will trigger responses utilising what this particular model adds on top of the base Llama 2 model?
Is it possible or necessary to use both of those prompt formats at the same time, or interchangeably?
I'm trying to build a UI that allows conversation with the AI as well as with an imaginary character the AI impersonates, within the same conversation, so the user can address either the AI or the character.
So far, I keep getting stuck in a situation where the conversation goes off the rails and the model starts producing output incoherent with the conversation: talking to itself, spawning additional characters, or, even worse, treating the conversation template tags as part of the conversation and inventing random keywords as if completing a code snippet.
I would greatly appreciate it if anybody could clarify this "conversation template" situation.
Only the Llama 2 chat model was trained on that specific format; community tuners can pick whatever format they think is best.
So for each model, check the model card for the most suitable format.
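For example, many community fine-tunes document a simple turn-marker style. A rough sketch (the exact labels vary per model card, so "### Human:"/"### Assistant:" here are just placeholders):

```python
def simple_chat_prompt(history: list[tuple[str, str]], user: str) -> str:
    """Build a turn-marker style prompt as many community model cards
    describe it; check the card for the exact labels your model expects."""
    prompt = ""
    for u, a in history:
        prompt += f"### Human: {u}\n### Assistant: {a}\n"
    # End with an open assistant turn for the model to complete.
    prompt += f"### Human: {user}\n### Assistant:"
    return prompt

# Passing the user marker as a stop string to your generation backend
# is the usual way to keep the model from continuing the chat by itself.
stop_strings = ["### Human:"]
```

Using the user marker as a stop string also sounds relevant to the derailing you describe, where the model keeps talking to itself past its own turn.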
Yes, I understand that, but I would like to know whether a tuned model adds its own format on top of the base format, or replaces the base format completely with its own.
The base model that is typically used for fine-tuning does not have this format; only the Meta-tuned chat version does. So it's not so much a matter of replacing as it is of just doing the same thing differently.
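One more tip: if you load models through the Hugging Face transformers library, recent versions can ship the expected format alongside the tokenizer, so you don't have to hardcode it. A minimal sketch, assuming the repo actually includes a chat template (the model name is just an example):

```python
from transformers import AutoTokenizer

# Assumes a recent transformers version and a repo that ships a chat
# template in its tokenizer config; both are assumptions to verify.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Renders the conversation into the exact string the model was tuned on,
# so you never hand-write [INST]/<<SYS>> or "### Human:" markers yourself.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```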
Oh, so this model is based on the "base" Llama 2, and only Llama-2-chat uses that weird prompt format, so this particular model achieves its "chatness" by imprinting its own specific format on a model that was "formatless" to begin with?
Am I understanding this right?
Correct, yes.
Thank you very much for this clarification; it is very helpful.