HaotongQin (Haotong Qin)

Posts 1


We have released an empirical study, "How Good Are Low-bit Quantized #LLaMA3 🦙 Models?", which evaluates LLaMA3 under existing LLM quantization techniques!

In this study, the low-bit LLaMA3 models (especially LLaMA3-70B) perform impressively. 🚀 However, the results also expose significant performance degradation in existing quantization techniques when applied to LLaMA3, especially at ultra-low bit-widths.

We hope this study can serve as a reference for the LLM quantization community and encourage stronger LLM quantization methods now that LLaMA3 has been released. More work is on the way...

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study (2404.14047)

https://huggingface.co/collections/LLMQ/llama3-quantization-66251258525135aeda16513c

Papers 2

arxiv:2404.14047

arxiv:2402.04291

models

None public yet

datasets

None public yet
