Hi, would you consider releasing a quantized version of GLM 10B? That would allow it to run on a 16GB card, which would be great.
Hi, you can check out my PR, which adds int8 quantization. I haven't tested int4.
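For readers unfamiliar with the idea: int8 quantization stores weights as 8-bit integers plus a per-tensor (or per-channel) scale, roughly halving memory versus fp16. A minimal sketch of symmetric absmax quantization (the function names are illustrative, not from the PR):

```python
def quantize_int8(values):
    """Symmetric absmax quantization: map floats into [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    q = [round(v / scale) for v in values]  # int8 codes
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Real implementations apply this per row or per channel of each weight matrix to keep the error small, which is why a 10B-parameter model can fit in roughly 10GB at int8.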