Kooten/Euryale-1.4-L2-70B-IQ2-GGUF

Kooten committed on Jan 19

Commit 5434741 • 1 Parent(s): b874404

Update README.md

Files changed (1)

README.md +5 -2

@@ -11,9 +11,12 @@ IQ2-GGUF quants of [Sao10K/Euryale-1.4-L2-70B](https://huggingface.co/Sao10K/Eur
 Unlike regular GGUF quants this uses important matrix similar to Quip# to keep the quant from degrading too much even at 2bpw allowing you to run larger models on less powerful machines.
-***NOTE:*** As of uploading these this llamacpp can run these quants but i am unsure what guis like oobabooga / koboldcpp can run them.
-[More info](https://github.com/ggerganov/llama.cpp/pull/4897)
+***NOTE:*** Currently you will need experimental branches of Koboldcpp or Ooba for this to work.
+- Nexesenex have compiled Windows binaries [HERE](https://github.com/Nexesenex/kobold.cpp/releases/tag/v1.55.1_b1842)
+- [llamacpp_0.2.29 branch](https://github.com/oobabooga/text-generation-webui/tree/llamacpp_0.2.29) of Ooba also works
+[More info about IQ2](https://github.com/ggerganov/llama.cpp/pull/4897)
 # Models