
GGUF IQ3_M quantization of cognitivecomputations/dolphin-2.7-mixtral-8x7b, provided in both imatrix and non-imatrix variants.
It fits in 24 GiB of VRAM with a 32768-token context when the KV cache is quantized to 8 bits.
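
As a rough illustration of why 8-bit KV cache quantization matters at this context length, the sketch below estimates the KV cache size. The architecture numbers (32 layers, 8 KV heads, head dimension 128) are assumptions taken from the published Mixtral 8x7B configuration, not from this card, and the 8-bit figure assumes a q8_0-style layout (34 bytes per block of 32 values). It ignores the model weights and compute buffers, so it is not a full VRAM budget.

```python
GIB = 1024 ** 3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_element):
    """Estimate KV cache size: K and V each hold n_kv_heads * head_dim
    values per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_element

# Assumed Mixtral 8x7B attention shape (not stated in this card).
MIXTRAL = dict(n_layers=32, n_kv_heads=8, head_dim=128)
N_CTX = 32768

# f16 cache: 2 bytes per value.
f16_bytes = kv_cache_bytes(**MIXTRAL, n_ctx=N_CTX, bytes_per_element=2.0)

# Assumed q8_0-style cache: 32 int8 values plus one 2-byte scale per block.
q8_bytes = kv_cache_bytes(**MIXTRAL, n_ctx=N_CTX, bytes_per_element=34 / 32)

print(f"f16 KV cache: {f16_bytes / GIB:.3f} GiB")
print(f"q8  KV cache: {q8_bytes / GIB:.3f} GiB")
```

Under these assumptions the 8-bit cache takes roughly half the space of an f16 cache, which frees about 2 GiB for the quantized weights on a 24 GiB card.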

Format: GGUF
Model size: 46.7B params
Architecture: llama
