What kind of GPU is needed to run this model locally on-prem?
#17 by eliastick - opened
I'd like to run this model on-premises. What hardware and GPU do I need? Thank you.
Try GGUF quants in llama.cpp or kobold.cpp. I recommend llama.cpp, since I've experienced issues with kobold.cpp due to image resizing.
I can run GGUFs on a Tesla P40, and many people claim they've managed to fit 34B Q4 quants on a 7900 XTX, so 24 GB of VRAM is probably the minimum system requirement.
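As rough arithmetic, 34B parameters at ~4.5 bits per weight comes to roughly 19-20 GB for the weights alone, before the KV cache and runtime overhead, which is why 24 GB is about the floor. If it helps, here's a minimal sketch of loading a Q4 GGUF through llama-cpp-python (the Python bindings for llama.cpp); the model filename is a placeholder for whichever quant you actually download, and full GPU offload assumes a CUDA/ROCm build of the bindings:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# "model-Q4_K_M.gguf" is a placeholder filename, not the real model name.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-Q4_K_M.gguf",  # hypothetical path to your Q4 quant
    n_gpu_layers=-1,  # offload all layers to the GPU; lower this if you run out of VRAM
    n_ctx=4096,       # context window; larger values eat more VRAM via the KV cache
)

out = llm("Say hello in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

If the model doesn't fit, reducing `n_gpu_layers` splits the layers between GPU and system RAM at the cost of speed.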