---
license: apache-2.0
---

# #llama-3 #experimental #work-in-progress

GGUF-IQ-Imatrix quants for @jeiku's [ResplendentAI/SOVL_Llama3_8B](https://huggingface.co/ResplendentAI/SOVL_Llama3_8B). <br> Give them some love!

> [!IMPORTANT]
> **Updated!**
> These quants have been redone with the fixes from [llama.cpp/pull/6920](https://github.com/ggerganov/llama.cpp/pull/6920) in mind. <br>
> Use **KoboldCpp version 1.64** or higher.

> [!NOTE]
> **Well...!** <br>
> Turns out it was not just a hallucination, and this model is actually pretty cool, so **give it a chance!** <br>
> For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for context sizes up to 12288.

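For a rough sanity check of the 8GB recommendation, here is a back-of-the-envelope sketch of the memory needed at 12288 context. It assumes the standard Llama 3 8B architecture (32 layers, 8 KV heads, head dimension 128) and an unquantized 16-bit KV cache. The `4.6` GiB model size is an assumed, typical figure for a Q4_K_M quant of an 8B model, not a measured value for these files, so check the actual file size in the repository. Compute buffers and backend overhead are ignored, so treat the result as a lower bound.

```python
# Rough VRAM estimate for a Llama 3 8B GGUF quant. All figures are estimates.

GIB = 1024 ** 3

# Architecture values from the public Llama 3 8B configuration.
N_LAYERS = 32
N_KV_HEADS = 8    # grouped-query attention
HEAD_DIM = 128    # 4096 hidden size / 32 attention heads


def kv_cache_bytes(context_tokens: int, bytes_per_value: int = 2) -> int:
    """Size of a 16-bit KV cache for the given context length."""
    # Factor 2 covers keys and values.
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value
    return per_token * context_tokens


def estimate_vram_gib(model_file_gib: float, context_tokens: int) -> float:
    """Model weights plus KV cache; ignores compute buffers and other overhead."""
    return model_file_gib + kv_cache_bytes(context_tokens) / GIB


if __name__ == "__main__":
    # ASSUMPTION: ~4.6 GiB is a typical Q4_K_M file size for an 8B model.
    assumed_model_gib = 4.6
    context = 12288
    print(f"KV cache: {kv_cache_bytes(context) / GIB:.2f} GiB")
    print(f"Weights + KV cache: {estimate_vram_gib(assumed_model_gib, context):.2f} GiB")
```

Under these assumptions the KV cache alone takes about 1.5 GiB at 12288 context, which leaves some headroom on an 8GB card for buffers and the desktop.
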
> [!WARNING]
> **Use the provided presets.** <br>
> Compatible SillyTavern presets [here (simple)](https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/cope-llama-3-0.1) or [here (Virt's roleplay)](https://huggingface.co/Virt-io/SillyTavern-Presets).
> Use the latest version of KoboldCpp.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/N_1D87adbMuMlSIQ5rI3_.png)