Has anyone experienced quality degradation on math questions?
The output below was generated by the Q6_K quant version. I wonder whether the degradation comes from the quantization itself or from the pre-tokenization issue found recently?
<|begin_of_text|>
what is 3333+777?
The answer is: 11110
Explanation:
The problem is asking for the sum of 3333 and 777.
To solve this, we can simply add the two numbers together:
3333 + 777 = 11110
So the answer is indeed 11110.<|eot_id|>
This might be related to some tokenization issues that still exist, which will hopefully be solved by updates to the inference engines alone, without the models having to be requantized.
Reference:
https://github.com/ggerganov/llama.cpp/issues/7062#issuecomment-2096818852
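For context on why pre-tokenization can affect arithmetic at all, here is a rough sketch of the digit-grouping behaviour of a cl100k-style split regex, which Llama 3 is reported to use. The exact pattern is my assumption (copied from tiktoken's cl100k_base style), not the actual llama.cpp code:

```python
# Illustration only: how a cl100k-style pre-tokenization regex splits digit
# runs into chunks of at most three digits. Not the llama.cpp implementation.
import regex  # third-party 'regex' module, needed for \p{...} classes

# Assumed cl100k_base-style pattern; the real Llama 3 pattern may differ slightly.
PRETOKENIZE = regex.compile(
    r"(?i:'s|'t|'re|'ve|'m|'ll|'d)"
    r"|[^\r\n\p{L}\p{N}]?\p{L}+"
    r"|\p{N}{1,3}"             # digit runs are cut into groups of <= 3 digits
    r"| ?[^\s\p{L}\p{N}]+[\r\n]*"
    r"|\s*[\r\n]+"
    r"|\s+(?!\S)"
    r"|\s+"
)

print(PRETOKENIZE.findall("what is 3333+777?"))
# -> ['what', ' is', ' ', '333', '3', '+', '777', '?']
# If an engine falls back to a different (e.g. default BPE) splitter, the
# number chunks the model sees change, which can plausibly hurt arithmetic.
```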
OK, thanks. I'll wait for updates from the community.
Update: I found that if I use the proper chat template, even the earlier Q8 version (the one with the degradation warning) gets the math problem right. I now doubt whether the regex fix in pre-tokenization is really needed.
<|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nWhat is 3333+777?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\nI'd be happy to help you with that!
The answer to 3333 + 777 is 4110.<|eot_id|> [end of text]
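For anyone who wants to reproduce this, here is a minimal sketch of assembling the Llama 3 instruct prompt by hand, based on the transcript above and the published chat format. The system message text and the helper name are just illustrative, not an official reference implementation:

```python
# Minimal sketch: build a Llama 3 instruct prompt with the special-token
# chat format seen in the transcript above. Assumes a single system + user turn.

def build_llama3_prompt(system: str, user: str) -> str:
    """Wrap one system and one user turn, ending with an open assistant header."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful, smart, kind, and efficient AI assistant. "
    "You always fulfill the user's requests to the best of your ability.",
    "What is 3333+777?",
)
print(prompt)
```

Pass the resulting string as the raw prompt (e.g. llama.cpp's `-p`/`--prompt`). Note that some engines add `<|begin_of_text|>` automatically, which is presumably why the transcript above shows it twice; you may want to drop the leading one in that case, or let something like `tokenizer.apply_chat_template(...)` from transformers produce the prompt for you.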