More quants c:

#1
by lGodZiol - opened

Any chance for a 3.0bpw or 2.75bpw quant? I'm missing about 500 MB of VRAM to fit this little beast into my 16 GB card at 16k context with q8 cache quantization.
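For anyone doing the same math: the weight footprint of an exl2 quant is roughly parameters × bpw / 8 bytes, so dropping from 3.0bpw to 2.75bpw shaves a predictable amount off. A quick sketch (the 32B parameter count is just a hypothetical example; real usage also adds KV cache, activations, and loader overhead on top of the weights):

```python
# Rough exl2 weight footprint: params * bpw / 8 bytes, shown in GiB.
# Parameter count below is a hypothetical example, not this model's size.
def weight_gib(params: float, bpw: float) -> float:
    return params * bpw / 8 / 2**30

print(round(weight_gib(32e9, 3.00), 2))  # -> 11.18
print(round(weight_gib(32e9, 2.75), 2))  # -> 10.24, ~0.9 GiB smaller
```

So a quarter-bpw step frees close to a gigabyte on a model this size, which is about the gap described above.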

Yeah, I suppose I could make that happen. It's going to be a minute though. I just started quantizing a 123b to exl2, and it's saying it's going to take about 16hrs... 💀

Thanks a lot! I appreciate it.

@lGodZiol 2.75bpw weights just finished uploading. Enjoy!

MarsupialAI changed discussion status to closed
