Will it be converted to ggml q4?
#1 opened by ai2p
To run with llama.cpp.
ai2p changed discussion title from "Will me ggml q4 version?" to "Will it be converted to ggml q4?"
I've done a GPTQ 4bit version for GPU inference here: https://huggingface.co/TheBloke/medalpaca-13B-GPTQ-4bit
Tomorrow I'll look at GGMLs as well.
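For anyone who wants to try that GPTQ checkpoint from Python, here is a minimal loading sketch, assuming the AutoGPTQ library; the device, safetensors flag, and prompt are illustrative, so check the model card for the exact quantization settings:

```python
# Minimal sketch of GPU inference with a GPTQ 4-bit checkpoint, assuming
# the AutoGPTQ library. Parameters are illustrative guesses; check the
# model card for the settings actually used.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/medalpaca-13B-GPTQ-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,  # assumption: the repo ships .safetensors weights
)

prompt = "What are the symptoms of iron-deficiency anemia?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```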
I tried to convert it with the llama.cpp conversion utilities, but I get an error.
Oh yeah I forgot about this. I'll see what I can do.
Sorry for the trouble! I finally got it working by just removing the two optimizer/scheduler .pt files.
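For anyone who hits the same error: the converter appears to choke on the extra training-state checkpoints in the repo, not on the model weights themselves. A minimal sketch of that cleanup, with the directory and script names as assumptions:

```python
# Sketch of the workaround described above: move the training-state
# checkpoints (the optimizer/scheduler .pt files) out of the model
# directory so the llama.cpp converter only sees the model weights.
# The directory name is an assumption; adjust to your local checkout.
import shutil
from pathlib import Path

model_dir = Path("medalpaca-13b")
backup_dir = model_dir / "training_state_backup"
backup_dir.mkdir(exist_ok=True)

# The model weights ship as .bin shards, so moving *.pt only removes
# the optimizer/scheduler state that trips up the converter.
for pt_file in model_dir.glob("*.pt"):
    shutil.move(str(pt_file), str(backup_dir / pt_file.name))

# Then run the llama.cpp conversion script against the cleaned directory,
# e.g. `python convert.py medalpaca-13b/` (the script name varies by
# llama.cpp version; older trees use convert-pth-to-ggml.py).
```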
I have done a set of GGMLs here, using the latest llama.cpp: https://huggingface.co/TheBloke/medalpaca-13B-GGML
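If you'd rather call those GGML files from Python than run the native llama.cpp binary, a minimal sketch with the llama-cpp-python bindings might look like this; the file name and context size are assumptions, so use whichever q4 variant you downloaded:

```python
# Sketch of running a q4 GGML file through the llama-cpp-python bindings.
# The file name is an assumption; pick whichever quantization you pulled
# from the repo above.
from llama_cpp import Llama

llm = Llama(model_path="./medalpaca-13B.ggml.q4_0.bin", n_ctx=2048)

result = llm("What are the symptoms of iron-deficiency anemia?", max_tokens=128)
print(result["choices"][0]["text"])
```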