Missing pre-tokenizer type
When loading the model with llama.cpp I get the warning:

```
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:
llm_load_vocab: ************************************
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ************************************
```
It was probably converted to GGUF with an old llama.cpp version.
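You can check what the file actually carries by inspecting its metadata: the warning means the GGUF is missing the `tokenizer.ggml.pre` field, so llama.cpp falls back to the 'default' pre-tokenizer. A minimal sketch, assuming the `gguf` Python package that ships in llama.cpp's gguf-py directory (`pip install gguf`); the file name is a placeholder:

```python
# Check whether a GGUF file carries the pre-tokenizer metadata field.
from gguf import GGUFReader

reader = GGUFReader("model-f16.gguf")  # placeholder: point at your own GGUF

field = reader.fields.get("tokenizer.ggml.pre")
if field is None:
    # Missing field -> llama.cpp falls back to 'default' and prints the warning above.
    print("tokenizer.ggml.pre is missing")
else:
    # For string-valued fields, the last part holds the raw value bytes.
    print("pre-tokenizer:", str(bytes(field.parts[-1]), encoding="utf-8"))
```

If it reports the field as missing, the file needs to be reconverted.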
Odd, I converted it with a version of llama.cpp which I cloned just yesterday. What version are you using?
I tried with both b0d943de179ad5dbd83d51f327fb566066f4ccda and 947d3ad27d94f1addef76b5d64c314618f063933 (which is the latest master), and I see this message with both. Don't you see it as well?
I don't see it for other models, for instance: https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF
This is probably the problem: https://github.com/ggerganov/llama.cpp/issues/7021 (if you used convert.py instead of convert-hf-to-gguf.py; convert.py doesn't support Llama 3 yet).
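If that's what happened, re-running the conversion with convert-hf-to-gguf.py should write the pre-tokenizer field. A hypothetical invocation (the directory and output name are placeholders; check the script's `--help` on your revision for the exact flags): `python convert-hf-to-gguf.py ./Llama-3-70B-Instruct-abliterated --outfile model-f16.gguf --outtype f16`.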
Agh, good to know. Will redo later with convert-hf-to-gguf.py
Corrected FP16 GGUFs are available here:
https://huggingface.co/failspy/Llama-3-70B-Instruct-abliterated-GGUF-fixed