We have released an empirical study, "How Good Are Low-bit Quantized #LLaMA3 Models?", which evaluates LLaMA3 under existing LLM quantization techniques.
In this study, the low-bit LLaMA3 models (especially LLaMA3-70B) perform impressively. However, the results also expose significant performance degradation in existing quantization techniques when applied to LLaMA3, especially at ultra-low bit-widths.
We hope this study can serve as a reference for the LLM quantization community and promote the emergence of stronger LLM quantization methods in the context of LLaMA3's release. More work is on the way...
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study (2404.14047)
https://huggingface.co/collections/LLMQ/llama3-quantization-66251258525135aeda16513c