NeoChen1024 committed
Commit 09716bf · Parent(s): 18b6153

Update README.md
README.md CHANGED

@@ -4,5 +4,5 @@ base_model:
 - cognitivecomputations/dolphin-2.7-mixtral-8x7b
 ---
 
-GGUF IQ3_M quant of cognitivecomputations/dolphin-2.7-mixtral-8x7b (non-imatrix)
+GGUF IQ3_M quant of cognitivecomputations/dolphin-2.7-mixtral-8x7b (both non-imatrix and imatrix)
 It fits into 24GiB VRAM with 32768 context (@ 8bit KV cache quantization).
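The VRAM figure in the changed README line depends on how the quant is loaded, not just on the GGUF file: the 32768-token context and the 8-bit KV cache have to be requested at load time. A minimal sketch of a matching load, assuming the llama-cpp-python bindings (the README does not name a runtime) and a placeholder model file name:

```python
# Minimal sketch, assuming llama-cpp-python; the model_path below is a
# placeholder, not the actual file name in this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="dolphin-2.7-mixtral-8x7b.IQ3_M.gguf",  # hypothetical local path
    n_gpu_layers=-1,   # offload every layer; the IQ3_M weights fit in 24GiB VRAM
    n_ctx=32768,       # the 32768-token context mentioned in the README
    flash_attn=True,   # llama.cpp generally requires flash attention for a quantized V cache
    type_k=8,          # GGML_TYPE_Q8_0 == 8 -> 8-bit quantized K cache
    type_v=8,          # GGML_TYPE_Q8_0 == 8 -> 8-bit quantized V cache
)

out = llm("Explain what an importance matrix (imatrix) is.", max_tokens=128)
print(out["choices"][0]["text"])
```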