Update README.md
README.md
CHANGED
@@ -18,19 +18,18 @@ tags:
 
 # Quant Infos
 
+## Includes latest bpe tokenizer fixes 🎉
+
+- Updated for latest bpe pre-tokenizer fixes https://github.com/ggerganov/llama.cpp/pull/6920
 - quants done with an importance matrix for improved quantization loss
-- quantized & generated imatrix from the f32 as f16 is inaccurate when converting from bf16
 - K & IQ quants in basically all variants from Q6_K down to IQ1_S
+- fixed end token for instruct mode (<|eot_id|>[128009])
+- Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [f4ab2a41476600a98067a9474ea8f9e6db41bcfa](https://github.com/ggerganov/llama.cpp/commit/f4ab2a41476600a98067a9474ea8f9e6db41bcfa) (master from 2024-04-29)
+- Imatrix generated with [this](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) dataset.
+```
+./imatrix -c 512 -m $model_name-f16.gguf -f $llama_cpp_path/groups_merged.txt -o $out_path/imat-f16-gmerged.dat
+```
 
-Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [b4e4b8a9351d918a56831c73cf9f25c1837b80d1](https://github.com/ggerganov/llama.cpp/commit/b4e4b8a9351d918a56831c73cf9f25c1837b80d1) (master from 2024-04-24)
-
-Imatrix dataset was used from [here](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
-
-Using this command to generate the importance matrix from the f32.gguf
-
-```
-./imatrix -c 512 -m $model_name-f16.gguf -f $llama_cpp_path/groups_merged.txt -o $out_path/imat-f16-gmerged.dat
-```
 
 # Original Model Card
 
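The README shows how the importance matrix is generated but not how it is consumed. As a rough sketch, the file written by `./imatrix` could be passed to llama.cpp's `quantize` tool through its `--imatrix` option, once per quant type. The input file (`$model_name-f16.gguf`), the output file naming, the default values, and the selection of quant types below are assumptions for illustration, not the exact invocation used for this repository. The script only prints the commands (a dry run) so it can be inspected before anything is executed.

```shell
#!/bin/sh
# Dry-run sketch: print one quantize command per quant type.
# Nothing is executed against a model; the commands are only echoed.
set -eu

# Hypothetical defaults; override via the environment.
model_name="${model_name:-model}"
out_path="${out_path:-./out}"

# Same output file name as in the README's ./imatrix command.
imatrix_file="$out_path/imat-f16-gmerged.dat"

# Assumed subset of the quant types between Q6_K and IQ1_S.
quant_types="Q6_K Q5_K_M Q4_K_M IQ3_XS IQ2_XS IQ1_S"

# Build the command line for a single quant type ($1).
# Argument order: --imatrix FILE INPUT.gguf OUTPUT.gguf TYPE
build_quantize_cmd() {
  printf './quantize --imatrix %s %s-f16.gguf %s/%s-%s.gguf %s\n' \
    "$imatrix_file" "$model_name" "$out_path" "$model_name" "$1" "$1"
}

for qtype in $quant_types; do
  build_quantize_cmd "$qtype"
done
```

To run the commands for real, the printed lines can be piped to `sh` from the llama.cpp build directory after checking that the paths match the local setup.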