GGML vs GGUF vs GPTQ

by HemanthSai7 - opened Aug 28, 2023

Aug 28, 2023

I'm new to quantization. It'd be very helpful if you could explain the difference between these three formats. Even a pointer to a blog post would help. Thanks!

mp3pintyo

Aug 28, 2023

GPTQ is a quantization format aimed at GPU inference.

GGML is designed for CPU and Apple M-series chips, but it can also offload some layers to the GPU.

GGUF: https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md
GGUF is a new format introduced by the llama.cpp team on August 21, 2023. It replaces GGML, which llama.cpp no longer supports.
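
If you want to check whether a file you downloaded is actually GGUF, here is a minimal sketch, assuming the header layout in the linked spec: the first four bytes are the ASCII magic `GGUF`, followed by a little-endian unsigned 32-bit version number. The helper names `read_gguf_header` and `gguf_version_of_file` are made up for illustration and are not part of llama.cpp.

```python
import struct

# Assumption from the GGUF spec: files start with the ASCII bytes "GGUF".
GGUF_MAGIC = b"GGUF"


def read_gguf_header(data: bytes):
    """Return the GGUF version if `data` starts with a GGUF header, else None."""
    if len(data) < 8 or data[:4] != GGUF_MAGIC:
        return None
    # Assumption: the version is a little-endian uint32 right after the magic.
    (version,) = struct.unpack("<I", data[4:8])
    return version


def gguf_version_of_file(path):
    """Read only the first 8 bytes of a file and inspect them (hypothetical helper)."""
    with open(path, "rb") as f:
        return read_gguf_header(f.read(8))
```

Calling `gguf_version_of_file("model.gguf")` on a real GGUF file should return its format version. A result of `None` only means the file does not start with the GGUF magic; it does not tell you whether the file is an older GGML file or a GPTQ checkpoint.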

HemanthSai7

Aug 29, 2023

Thank you!

HemanthSai7 changed discussion status to closed Aug 29, 2023
