What kind of GPU is needed to run this model locally on-prem?
#17 by eliastick - opened
I'd like to run this model on-premises. What hardware and GPU do I need? Thank you.
Try GGUF quants in llama.cpp or kobold.cpp. I recommend llama.cpp, since I've experienced issues with kobold.cpp due to image resizing.
I can run GGUFs on a Tesla P40, and many people claim they've managed to fit 34B Q4 quants on a 7900 XTX, so 24 GB of VRAM is probably the minimum system requirement.
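As rough arithmetic, 34B parameters at ~4.5 bits per weight comes to roughly 19-20 GB for the weights alone, before the KV cache and runtime overhead, which is why 24 GB is about the floor. If it helps, here's a minimal sketch of loading a Q4 GGUF through llama-cpp-python (the Python bindings for llama.cpp); the model filename is a placeholder for whichever quant you actually download, and full GPU offload assumes a CUDA/ROCm build of the bindings:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# "model-Q4_K_M.gguf" is a placeholder filename, not the real model name.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-Q4_K_M.gguf",  # hypothetical path to your Q4 quant
    n_gpu_layers=-1,  # offload all layers to the GPU; lower this if you run out of VRAM
    n_ctx=4096,       # context window; larger values eat more VRAM via the KV cache
)

out = llm("Say hello in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

If the model doesn't fit, reducing `n_gpu_layers` splits the layers between GPU and system RAM at the cost of speed.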