InferenceIllusionist committed
Commit 55b4921
Parent(s): 5ae74ce
Update README.md

README.md CHANGED
@@ -9,11 +9,13 @@ license: cc-by-nc-4.0
* Model creator: [Sao10K](https://huggingface.co/Sao10K/)
* Original model: [Fimbulvetr-11B-v2](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2)

-<b>Important: </b> Inferencing for newer formats
+<b>Important: </b> Inferencing for the newer formats, e.g. IQ1_S, IQ3_S, and IQ4_NL, was tested on the latest llama.cpp and koboldcpp v1.59.1.

All credits to Sao10K for the original model. This is just a quick test of the new quantization types such as IQ3_S in an attempt to further reduce VRAM requirements.

Quantized from fp16 with love. Importance matrix file [Fimbulvetr-11B-v2-imatrix.dat](https://huggingface.co/InferenceIllusionist/Fimbulvetr-11B-v2-iMat-GGUF/blob/main/Fimbulvetr-11B-v2-imatrix.dat) was calculated using Q8_0.
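
For anyone trying these files, a minimal sketch of loading one of the IQ quants with the llama-cpp-python bindings is shown below. The filename `Fimbulvetr-11B-v2-IQ3_S.gguf`, the context size, and the prompt are assumptions for illustration; any llama.cpp or koboldcpp build recent enough to understand the IQ formats (per the note above) should handle the files the same way.

```python
from llama_cpp import Llama

# Hypothetical filename for illustration; use whichever quant you downloaded.
llm = Llama(
    model_path="Fimbulvetr-11B-v2-IQ3_S.gguf",
    n_ctx=4096,        # assumed context window
    n_gpu_layers=-1,   # offload all layers to GPU if VRAM allows
)

# Plain completion call; adjust the prompt template to whatever the
# original model card recommends.
out = llm("Write a two-sentence greeting from a helpful assistant.\n", max_tokens=128)
print(out["choices"][0]["text"])
```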