Quantise to support llama.cpp
#1 opened by TusharRay
Can this be quantised to support https://github.com/ggerganov/llama.cpp? llama.cpp is really performant, and this model could then be widely used across multiple platforms!
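For reference, a minimal sketch of the usual llama.cpp quantisation workflow: convert the Hugging Face checkpoint to GGUF, then quantise it with llama.cpp's quantize tool. The script and binary names (`convert_hf_to_gguf.py`, `llama-quantize`) and all paths below are assumptions and vary between llama.cpp versions (older releases used `convert.py` and `quantize`):

```python
# Sketch only: convert a Hugging Face checkpoint to GGUF and quantise it
# for llama.cpp. Script/binary names and paths are assumptions and may
# differ between llama.cpp versions.
import subprocess

MODEL_DIR = "path/to/this-model"    # local clone of the HF checkpoint (placeholder)
F16_GGUF = "model-f16.gguf"         # full-precision GGUF produced by the converter
Q4_GGUF = "model-q4_k_m.gguf"       # quantised output usable by llama.cpp

# 1. Convert the HF weights to a GGUF file (run from a llama.cpp checkout).
subprocess.run(
    ["python", "convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# 2. Quantise the GGUF file, e.g. to Q4_K_M, with llama.cpp's quantize tool.
subprocess.run(["./llama-quantize", F16_GGUF, Q4_GGUF, "Q4_K_M"], check=True)
```

The resulting `.gguf` file could then be loaded directly by llama.cpp (or bindings built on it) on the various platforms it supports.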