Lewdiculous committed • f72d954
1 Parent(s): 8cf2dd9
Update README.md

README.md CHANGED
@@ -8,7 +8,7 @@ GGUF-IQ-Imatrix quants for @jeiku's [ResplendentAI/SOVL_Llama3_8B](https://huggi
 > [!IMPORTANT]
 > **Updated!**
 > These quants have been redone with the fixes from [llama.cpp/pull/6920](https://github.com/ggerganov/llama.cpp/pull/6920) in mind. <br>
-> Use **KoboldCpp version 1.64** or higher.
+> Use **KoboldCpp version 1.64 (coming soon)** or higher.

 > [!NOTE]
 > **Well...!** <br>
@@ -16,7 +16,8 @@ GGUF-IQ-Imatrix quants for @jeiku's [ResplendentAI/SOVL_Llama3_8B](https://huggi
 > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes.

 > [!WARNING]
->
->
+> **Use the provided presets.** <br>
+> Compatible SillyTavern presets [here (simple)](https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/cope-llama-3-0.1) or [here (Virt's roleplay)](https://huggingface.co/Virt-io/SillyTavern-Presets).
+> Use the latest version of KoboldCpp.

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/N_1D87adbMuMlSIQ5rI3_.png)
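The README recommendation in the diff (the **Q4_K_M-imat** quant with up to 12288 context on an 8GB VRAM GPU, running under KoboldCpp 1.64+) might be applied with an invocation roughly like the sketch below. The flag names and the model file name are assumptions based on common KoboldCpp usage, not taken from this commit; check `python koboldcpp.py --help` for the actual options.

```shell
# Hypothetical sketch — file name and flags are assumptions, not from the README.
# Requires KoboldCpp 1.64 or newer per the updated note in the diff above.
python koboldcpp.py \
  --model SOVL_Llama3_8B-Q4_K_M-imat.gguf \
  --contextsize 12288
```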