GGML vs GGUF vs GPTQ
#2, opened by HemanthSai7
I'm new to quantization stuff. It'd be very helpful if you could explain the difference between these three types. Even a blog would be helpful. Thanks
GPTQ is a quantization method whose model files are intended for GPU inference only.
GGML is designed for CPU and Apple M-series chips, but can also offload some layers to the GPU.
GGUF: https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md
GGUF is a new format introduced by the llama.cpp team on August 21st, 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
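If you're not sure whether a file you downloaded is GGUF, you can check its header. Here is a rough sketch, assuming the layout described in the spec linked above: the file starts with the 4 ASCII bytes `GGUF`, followed by a little-endian unsigned 32-bit version number. The function names are made up for illustration, and the little-endian assumption may not hold for every file.

```python
import struct

# Assumption: per the GGUF spec, a file begins with the ASCII magic "GGUF"
# followed by a uint32 version, read here as little-endian.
GGUF_MAGIC = b"GGUF"


def read_gguf_version(header: bytes):
    """Return the GGUF version if `header` starts with the GGUF magic, else None.

    Hypothetical helper for illustration; it only inspects the first 8 bytes.
    """
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version


def gguf_version_of_file(path: str):
    """Read the first 8 bytes of a file and report its GGUF version, or None."""
    with open(path, "rb") as f:
        return read_gguf_version(f.read(8))


# Example with an in-memory header rather than a real model file.
fake_header = GGUF_MAGIC + struct.pack("<I", 3)
print(read_gguf_version(fake_header))  # prints 3
```

A result of `None` only means the file does not start with the GGUF magic. It does not tell you whether the file is an older GGML file or something else entirely.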
Thank you!
HemanthSai7 changed discussion status to closed