Hi, would you consider releasing a quantized version of GLM 10B? That would allow it to run on a 16GB card, which would be great.
Hi, you can check out my PR, which adds int8 quantization. I haven't tested int4.
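For readers unfamiliar with the idea: int8 quantization stores weights as 8-bit integers plus a per-tensor (or per-channel) scale, roughly halving memory versus fp16. A minimal sketch of symmetric absmax quantization (the function names are illustrative, not from the PR):

```python
def quantize_int8(values):
    """Symmetric absmax quantization: map floats into [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    q = [round(v / scale) for v in values]  # int8 codes
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Real implementations apply this per row or per channel of each weight matrix to keep the error small, which is why a 10B-parameter model can fit in roughly 10GB at int8.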