What are the GPU recommendations for this?
#1 by hazrpg
Hey, thanks so much for releasing these.
I currently have a 3060 12GB and it fails to load: it fills up the available ~11 GB (presumably ~1 GB is used by the OS/display) and then PyTorch errors out with out-of-memory.
I might try the GGUF versions for now, but I'm curious what your recommendations are for this model: how much VRAM, etc.
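In case it helps anyone else on a 12 GB card while I test the GGUFs, here's a rough sketch of 4-bit loading with transformers + bitsandbytes (the repo id is a placeholder, not this model's actual name):

```python
# Rough sketch: 4-bit loading to fit a ~7B model in 12 GB of VRAM.
# "your-org/your-model" is a placeholder repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "your-org/your-model"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # requires accelerate; offloads layers to CPU if the GPU fills
)
```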
It should only take around 6 GB of VRAM? In fact, you could probably load a 13B model as well. What are you using to load the model?
If you're using vLLM, it pre-allocates memory up front for the KV cache and prompt processing, so a 6 GB model may actually reserve more than 12 GB of VRAM.
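If that's the culprit, you can usually cap the reservation. A minimal sketch of the relevant vLLM knobs (the repo id is a placeholder):

```python
# Sketch: limit how much VRAM vLLM reserves up front.
# "your-org/your-model" is a placeholder; substitute the actual repo id.
from vllm import LLM

llm = LLM(
    model="your-org/your-model",
    dtype="float16",
    gpu_memory_utilization=0.85,  # fraction of total VRAM vLLM may claim (default ~0.9)
    max_model_len=4096,           # shorter max context => smaller pre-allocated KV cache
)

outputs = llm.generate(["Hello, world"])
print(outputs[0].outputs[0].text)
```

Lowering `max_model_len` is often the bigger lever on a 12 GB card, since the size of the KV-cache reservation scales with the maximum context length.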