mylesgoose commited on
Commit
0b4ca29
1 Parent(s): 831cec6

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +41 -5
README.md CHANGED
@@ -1,5 +1,41 @@
1
- ---
2
- license: other
3
- license_name: other
4
- license_link: https://ai.meta.com/llama/license
5
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: other
3
+ license_name: other
4
+ license_link: https://ai.meta.com/llama/license
5
+ ---
6
+ Repairing the chat template for the model.
7
+ There is a slight problem with your chat template. If you train a model with that current chat template the model starts to output as the first token <|eot_id|> and naturally the script will then halt generation. the model learns to see this:
8
+
9
+ <|begin_of_text|><|start_header_id|>user<|end_header_id|>
10
+
11
+ <|image|>If I had to write a haiku for this one, it would be: <|eot_id|><|start_header_id|>assistant<|end_header_id|>
12
+
13
+ Here is a haiku for the image:
14
+
15
+ Rabbit in a coat
16
+ Dapper and dignified
17
+ Country cottage charm<|eot_id|>
18
+
19
+ and so the model learns to do this in its first output:
20
+ <|eot_id|><|start_header_id|>assistant<|end_header_id|>
21
+ which naturally messes up the training. can you please put a new line character after the eot_id or prior to the start header id in the chat template: so that the format is like so :
22
+ <|begin_of_text|><|start_header_id|>user<|end_header_id|>
23
+
24
+ <|image|>If I had to write a haiku for this one, it would be: <|eot_id|>
25
+
26
+ <|start_header_id|>assistant<|end_header_id|>
27
+
28
+ this results in a clearer distinction between the end of the user message and the start of the models.
29
+ <|begin_of_text|>
30
+ <|start_header_id|>system<|end_header_id|>
31
+
32
+ Today Date: 26 Sep 2024
33
+
34
+ You are a helpful language and vision assistant. You are able to understand the visual content that the user provides, and assist the user with a variety of tasks using natural language.<|eot_id|>
35
+ <|start_header_id|>user<|end_header_id|>
36
+
37
+ If I had to write a haiku for this one, it would be:<|eot_id|>#notice that this is sending the message to the next line now. which forms a clear distinction for the model. if you train a model with your current prompt it just outputs[ ]
38
+ <|start_header_id|>assistant<|end_header_id|>
39
+
40
+ ['A rabbit on a sunny day']
41
+ this is an example of the 3.1 models chat template. i have not examined your one yet however i have examined the output of it above. to prevent the model learning that eot comes first there need to be a clearer distinction made with a \n