
license: apache-2.0

#llama-3 #experimental #work-in-progress

GGUF-IQ-Imatrix quants for @jeiku's ResplendentAI/SOVL_Llama3_8B.
Give them some love!

Updated! These quants have been redone with the fixes from llama.cpp/pull/6920 in mind.

Well...!
Turns out it was not just a hallucination: this model is actually pretty cool, so give it a chance!
For 8GB VRAM GPUs, I recommend the Q4_K_M-imat quant with context sizes up to 12288.
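
For a rough idea of how much of that VRAM goes to context, here is a minimal sketch that estimates the KV cache size at a given context length. It assumes the standard Llama 3 8B layout (32 layers, 8 KV heads, head dimension 128) and an unquantized fp16 cache; the function names are just illustrative. Model weights and compute buffers come on top of this, so treat the numbers as a lower bound rather than a measurement.

```python
# Rough KV-cache size estimate for a Llama-3-8B-style model.
# Assumptions: standard Llama 3 8B layout and an fp16 (2-byte) KV cache.
N_LAYERS = 32         # transformer layers
N_KV_HEADS = 8        # grouped-query attention KV heads
HEAD_DIM = 128        # dimension per attention head
BYTES_PER_VALUE = 2   # fp16


def kv_cache_bytes(context_tokens: int) -> int:
    """Bytes needed to hold keys and values for `context_tokens` tokens."""
    # Factor 2 covers one key tensor plus one value tensor per layer.
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return context_tokens * per_token


def kv_cache_gib(context_tokens: int) -> float:
    """Same estimate, expressed in GiB."""
    return kv_cache_bytes(context_tokens) / 2**30


if __name__ == "__main__":
    for ctx in (4096, 8192, 12288):
        print(f"{ctx:>6} tokens: {kv_cache_gib(ctx):.2f} GiB")
```

Under these assumptions the cache grows linearly with context, so dropping the context size is the quickest way to free VRAM if the quant does not fit.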

Compatible SillyTavern presets here (simple) or here (Virt's).
Use the latest version of KoboldCpp together with the provided presets.