More quants c:
#1 by lGodZiol - opened
Any chance of a 3.0 bpw or 2.75 bpw quant? I'm missing about 500 MB of VRAM to fit this little beast into my 16 GB card at 16k context with Q8 cache quantization.
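The VRAM math behind the request can be sketched roughly: quantized weight size scales linearly with bits per weight, so a small bpw drop frees a few hundred MB. This is a minimal back-of-the-envelope sketch (the 20B parameter count is a hypothetical example, and KV cache and activations are not included):

```python
def weight_footprint_gb(n_params: float, bpw: float) -> float:
    """Approximate VRAM taken by quantized weights, in GB.

    n_params: parameter count (e.g. 20e9 for a hypothetical 20B model)
    bpw: bits per weight of the quant (e.g. 3.0)
    """
    return n_params * bpw / 8 / 1e9  # bits -> bytes -> GB

# Dropping a hypothetical 20B model from 3.25 to 3.0 bpw
# frees roughly 0.6 GB of weight memory:
saved = weight_footprint_gb(20e9, 3.25) - weight_footprint_gb(20e9, 3.0)
print(f"{saved:.3f} GB saved")
```

Real savings also depend on which layers the quantizer targets, so treat this as an estimate, not an exact number.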
Yeah, I suppose I could make that happen. It's going to be a minute, though. I just started quantizing a 123B to EXL2, and it's saying it's going to take about 16 hours.
Thanks a lot! I appreciate it.
MarsupialAI changed discussion status to closed