4-bit quant?

#3
by Neman - opened

Hi! Thank you for releasing this multimodal model. The first tests are impressive. Even the 1.3B is good for its size.
It is just that the 7B version in full precision is still taxing on the personal hardware we have at home.
Would it be possible to quantize it to int4 like Qwen did with their Qwen-VL-Chat-Int4?
I think it would be best if you could do it and put it here in your repo so the community can use it.
If not, maybe you could give us some guidelines on how to do it.

@Neman
We don't have plans to work on quantizing the model; you might want to wait for other community members to tackle it.

Thank you for the answer.

Neman changed discussion status to closed
