readme: fix kv overrides
README.md
CHANGED
@@ -14,7 +14,7 @@ language:
 - zh
 ---
 
-#
+# DeepSeek-V2-Chat-GGUF
 
 Quantizised from [https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat](https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat)
 
@@ -64,7 +64,7 @@ imatrix \
 -f groups_merged.txt \
 --verbosity [0, 1, 2] \
 -ngl {GPU offloading; must build with CUDA} \
-
+--ofreq {recommended: 1}
 ```
 Making a quant:
 ```
@@ -111,7 +111,7 @@ deepseek2.attention.q_lora_rank=int:1536
 deepseek2.attention.kv_lora_rank=int:512
 deepseek2.expert_shared_count=int:2
 deepseek2.expert_feed_forward_length=int:1536
-deepseek2.
+deepseek2.expert_weights_scale=float:16
 deepseek2.leading_dense_block_count=int:1
 deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707
 ```
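
The override entries in the last hunk all follow a `key=type:value` shape. The diff does not show how they are consumed, so the following is only a minimal sketch: it assumes the entries are passed to llama.cpp through its `--override-kv KEY=TYPE:VALUE` option, and the helper names `parse_override` and `to_cli_args` are hypothetical. It checks each entry before turning it into command-line arguments, so a truncated key is rejected early.

```python
# Sketch only: the entries below are copied from the diff; the helper names
# and the use of --override-kv are assumptions, not part of the README.

OVERRIDES = """\
deepseek2.attention.kv_lora_rank=int:512
deepseek2.expert_shared_count=int:2
deepseek2.expert_feed_forward_length=int:1536
deepseek2.expert_weights_scale=float:16
deepseek2.leading_dense_block_count=int:1
deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707
"""

# Type tags seen in the diff (int, float), plus str for completeness.
CASTS = {"int": int, "float": float, "str": str}


def parse_override(line):
    """Split 'key=type:value' into (key, type_name, typed_value)."""
    key, eq, rest = line.strip().partition("=")
    type_name, colon, raw = rest.partition(":")
    if not key or not eq or not colon or type_name not in CASTS:
        raise ValueError(f"malformed override: {line!r}")
    # The cast raises ValueError too if the value does not match its tag.
    return key, type_name, CASTS[type_name](raw)


def to_cli_args(text):
    """Validate every non-empty line and emit one flag/value pair per entry."""
    args = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue
        parse_override(entry)  # validation only; the entry is passed verbatim
        args += ["--override-kv", entry]
    return args


if __name__ == "__main__":
    print(" ".join(to_cli_args(OVERRIDES)))
```

Each entry is passed through unchanged after validation, so the printed arguments match the README entries exactly.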