---
license: other
license_name: other
license_link: https://ai.meta.com/llama/license
---
Repairing the chat template for the model.
There is a slight problem with your chat template. If you train a model with the current chat template, the model starts to output <|eot_id|> as its first token, and the script then naturally halts generation. The model learns to see this:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

<|image|>If I had to write a haiku for this one, it would be: <|eot_id|><|start_header_id|>assistant<|end_header_id|>

Here is a haiku for the image:

Rabbit in a coat
Dapper and dignified
Country cottage charm<|eot_id|>

And so the model learns to produce this as its first output:
<|eot_id|><|start_header_id|>assistant<|end_header_id|>
which naturally messes up the training. Could you please put a newline character after the <|eot_id|> (or before the <|start_header_id|>) in the chat template, so that the format looks like this:
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

<|image|>If I had to write a haiku for this one, it would be: <|eot_id|>

<|start_header_id|>assistant<|end_header_id|>

This results in a clearer distinction between the end of the user message and the start of the model's.
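Something along these lines is what I have in mind. This is only a rough sketch: the exact string to patch depends on how your Jinja template is written (the two tokens may come from separate expressions), for the vision model the template may live on the processor rather than the tokenizer, and the model id below is just a placeholder.

```python
from transformers import AutoTokenizer

# Placeholder repo id -- substitute the actual model this discussion is about.
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)

template = tok.chat_template  # the Jinja chat template as a string

# Assumption: the template emits the literal "<|eot_id|>" immediately before
# "<|start_header_id|>". If it does not, add the "\n" to whichever literal
# closes a message turn instead (use "\n\n" for a full blank line as in the
# example above).
tok.chat_template = template.replace(
    "<|eot_id|><|start_header_id|>",
    "<|eot_id|>\n<|start_header_id|>",
)

tok.save_pretrained("./llama-vision-fixed-template")
```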
For comparison, here is an example of the 3.1 models' chat template:

<|begin_of_text|>
<|start_header_id|>system<|end_header_id|>

Today Date: 26 Sep 2024

You are a helpful language and vision assistant. You are able to understand the visual content that the user provides, and assist the user with a variety of tasks using natural language.<|eot_id|>
<|start_header_id|>user<|end_header_id|>

If I had to write a haiku for this one, it would be:<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

['A rabbit on a sunny day']

Notice that the <|eot_id|> now pushes the next header onto its own line, which forms a clear distinction for the model; if you train a model with your current prompt it just outputs [ ]. I have not examined your template itself yet, but I have examined its output above. To prevent the model from learning that <|eot_id|> comes first, there needs to be a clearer distinction, made with a \n.
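To check the effect, one could render a dummy conversation with the patched tokenizer from the sketch above and confirm the two tokens are no longer glued together (again only a sketch; I am assuming the vision template accepts message content as a list of typed parts):

```python
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "If I had to write a haiku for this one, it would be: "},
        ],
    },
]

# add_generation_prompt=True appends the assistant header, so we can see
# exactly what separates the user's <|eot_id|> from <|start_header_id|>.
rendered = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(rendered)

# After the patch the glued pattern should be gone.
assert "<|eot_id|><|start_header_id|>" not in rendered
```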