one gguf model
I ran the llama.cpp conversion to get a GGUF model for testing on CPU:
https://huggingface.co/andrew-cartwheel/snorkel-mistral-pairRM-DPO-q8_0.gguf
Hi Andrew,
Could you make a Q4_K_M version?
If not, could you at least give me a recipe for doing it? (Step by step, as I have no idea how to do this.)
Q8 is extremely slow.
Absolutely!
Here is a step-by-step guide: https://github.com/ggerganov/llama.cpp/discussions/2948
Every step is outlined there, but if you run into trouble, please let me know.
Alternatively, it looks like another user uploaded more quants:
https://huggingface.co/brittlewis12/Snorkel-Mistral-PairRM-DPO-GGUF/tree/main
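For anyone who wants to try it locally, here is a rough sketch of the two steps the guide walks through: convert the Hugging Face checkpoint to an f16 GGUF, then quantize that file to the target type. It only builds and prints the commands so you can check them before running anything. The function name, the directory paths, and the output file names are placeholders I made up. The script and binary names (`convert_hf_to_gguf.py`, `llama-quantize`) are assumptions that match recent llama.cpp checkouts; older checkouts used `convert.py` and `quantize`, and the binary may sit under a build directory depending on how you compiled it.

```python
import shlex
from pathlib import Path


def build_quantize_commands(
    llama_cpp_dir,
    model_dir,
    out_dir,
    quant_type="Q4_K_M",
    convert_script="convert_hf_to_gguf.py",  # assumption: older checkouts use convert.py
    quantize_binary="llama-quantize",        # assumption: older checkouts use quantize
):
    """Return the (convert, quantize) command lists without running them."""
    llama_cpp = Path(llama_cpp_dir)
    out = Path(out_dir)

    # Hypothetical output names; any file names work.
    f16_path = out / "model-f16.gguf"
    quant_path = out / f"model-{quant_type}.gguf"

    # Step 1: Hugging Face checkpoint -> unquantized f16 GGUF.
    convert_cmd = [
        "python", str(llama_cpp / convert_script), str(model_dir),
        "--outfile", str(f16_path),
        "--outtype", "f16",
    ]

    # Step 2: f16 GGUF -> quantized GGUF (input, output, quant type).
    quantize_cmd = [
        str(llama_cpp / quantize_binary),
        str(f16_path),
        str(quant_path),
        quant_type,
    ]
    return convert_cmd, quantize_cmd


if __name__ == "__main__":
    # Placeholder paths; point these at your own checkout and model download.
    commands = build_quantize_commands(
        "llama.cpp", "Snorkel-Mistral-PairRM-DPO", "out"
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Once the printed commands look right for your checkout, paste them into a shell or pass each list to `subprocess.run(cmd, check=True)`. Change `quant_type` to produce a different quant from the same f16 file.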
Fantastic! Thanks a lot!
Added both the andrew-cartwheel and brittlewis12 models to the model card. Thanks all! Appreciate it!