Has anyone experienced quality degradation on math questions?
The output below was generated by the Q6_K quant version. I wonder whether the degradation comes from the quantization itself or from the pre-tokenization issue found recently?
<|begin_of_text|>
what is 3333+777?
The answer is: 11110
Explanation:
The problem is asking for the sum of 3333 and 777.
To solve this, we can simply add the two numbers together:
3333 + 777 = 11110
So the answer is indeed 11110.<|eot_id|>
This might be related to some tokenization issues that still exist, which will hopefully be solved by updates to the inference engines alone, without the models having to be requantized.
Reference:
https://github.com/ggerganov/llama.cpp/issues/7062#issuecomment-2096818852
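For context on why pre-tokenization can affect arithmetic at all, here is a rough sketch of the digit-grouping behaviour of a cl100k-style split regex, which Llama 3 is reported to use. The exact pattern is my assumption (copied from tiktoken's cl100k_base style), not the actual llama.cpp code:

```python
# Illustration only: how a cl100k-style pre-tokenization regex splits digit
# runs into chunks of at most three digits. Not the llama.cpp implementation.
import regex  # third-party 'regex' module, needed for \p{...} classes

# Assumed cl100k_base-style pattern; the real Llama 3 pattern may differ slightly.
PRETOKENIZE = regex.compile(
    r"(?i:'s|'t|'re|'ve|'m|'ll|'d)"
    r"|[^\r\n\p{L}\p{N}]?\p{L}+"
    r"|\p{N}{1,3}"             # digit runs are cut into groups of <= 3 digits
    r"| ?[^\s\p{L}\p{N}]+[\r\n]*"
    r"|\s*[\r\n]+"
    r"|\s+(?!\S)"
    r"|\s+"
)

print(PRETOKENIZE.findall("what is 3333+777?"))
# -> ['what', ' is', ' ', '333', '3', '+', '777', '?']
# If an engine falls back to a different (e.g. default BPE) splitter, the
# number chunks the model sees change, which can plausibly hurt arithmetic.
```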
OK, thanks. I'll wait for updates from the community.
Update: I found that if I use the proper chat template, even the earlier Q8 version (the one with the degradation warning) gets the math problem right. I now doubt whether the regex fix in pre-tokenization is really needed.
<|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nWhat is 3333+777?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\nI'd be happy to help you with that!
The answer to 3333 + 777 is 4110.<|eot_id|> [end of text]
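For anyone who wants to reproduce this, here is a minimal sketch of assembling the Llama 3 instruct prompt by hand, based on the transcript above and the published chat format. The system message text and the helper name are just illustrative, not an official reference implementation:

```python
# Minimal sketch: build a Llama 3 instruct prompt with the special-token
# chat format seen in the transcript above. Assumes a single system + user turn.

def build_llama3_prompt(system: str, user: str) -> str:
    """Wrap one system and one user turn, ending with an open assistant header."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful, smart, kind, and efficient AI assistant. "
    "You always fulfill the user's requests to the best of your ability.",
    "What is 3333+777?",
)
print(prompt)
```

Pass the resulting string as the raw prompt (e.g. llama.cpp's `-p`/`--prompt`). Note that some engines add `<|begin_of_text|>` automatically, which is presumably why the transcript above shows it twice; you may want to drop the leading one in that case, or let something like `tokenizer.apply_chat_template(...)` from transformers produce the prompt for you.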