Quantization Performance

#4
by AutomaticHourglass - opened

How do the different quantization levels perform? Is it OK to use q8, or should we only use fp16?

Here is a simple explanation of the differences between quantization levels.
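For a rough feel of the gap between fp16 and q8, here is a minimal sketch that compares the round-trip error of the two on synthetic weights. It is only an illustration under assumptions: "q8" is taken to mean a symmetric 8-bit scheme with one scale per block of 32 values, the weights are random Gaussian numbers rather than this model's weights, and the helper names (`to_fp16`, `quantize_int8_block`, and so on) are made up for the example. It measures error on the stored numbers, not the quality of the model's output.

```python
import random
import struct


def to_fp16(x: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def quantize_int8_block(block):
    """Symmetric 8-bit quantization of one block: one scale, ints in [-127, 127]."""
    absmax = max(abs(v) for v in block)
    scale = absmax / 127.0 if absmax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in block]
    return scale, q


def dequantize_int8_block(scale, q):
    """Map the stored integers back to approximate float values."""
    return [scale * v for v in q]


def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


random.seed(0)
BLOCK = 32  # assumed block size, not taken from the thread
# Synthetic stand-in for a weight tensor (assumption: small Gaussian values).
weights = [random.gauss(0.0, 0.02) for _ in range(BLOCK * 64)]

# fp16 path: every weight is rounded to half precision.
fp16 = [to_fp16(w) for w in weights]

# q8 path: quantize and dequantize block by block.
int8 = []
for i in range(0, len(weights), BLOCK):
    scale, q = quantize_int8_block(weights[i:i + BLOCK])
    int8.extend(dequantize_int8_block(scale, q))

fp16_err = rms_error(weights, fp16)
int8_err = rms_error(weights, int8)
print(f"fp16 RMS error: {fp16_err:.2e}")
print(f"int8 RMS error: {int8_err:.2e}")
```

In this toy setup the 8-bit round trip loses more precision than fp16, but the error stays small relative to the weights themselves. If each block of 32 also stores one 16-bit scale, the cost works out to (32 × 8 + 16) / 32 = 8.5 bits per weight, against 16 for fp16. Whether that trade is acceptable for a given model is best checked by comparing outputs on your own prompts.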
