readme: daily update
README.md CHANGED
@@ -84,9 +84,10 @@ Note: Use iMatrix quants only if you can fully offload to GPU, otherwise speed w
 | Quant    | Status      | Size      | Description                                | KV Metadata | Weighted | Notes |
 |----------|-------------|-----------|--------------------------------------------|-------------|----------|-------|
 | BF16     | Available   | 439 GB    | Lossless :)                                | Old         | No       | Q8_0 is sufficient for most cases |
-| Q8_0     |
+| Q8_0     | Available   | 233.27 GB | High quality *recommended*                 | Updated     | Yes      |       |
+| Q5_K_M   | Uploading   | 155 GB    | Medium-low quality                         | Updated     | Yes      |       |
 | Q4_K_M   | Available   | 132 GB    | Medium quality *recommended*               | Old         | No       |       |
-| Q3_K_M   |
+| Q3_K_M   | Available   | 104 GB    | Medium-low quality                         | Updated     | Yes      |       |
 | IQ3_XS   | Available   | 89.6 GB   | Better than Q3_K_M                         | Old         | Yes      |       |
 | Q2_K     | Available   | 80.0 GB   | Low quality **not recommended**            | Old         | No       |       |
 | IQ2_XXS  | Available   | 61.5 GB   | Lower quality **not recommended**          | Old         | Yes      |       |
@@ -97,8 +98,9 @@ Note: Use iMatrix quants only if you can fully offload to GPU, otherwise speed w
 
 | Planned Quant     | Notes   |
 |-------------------|---------|
-|                   |         |
-|                   |         |
+| Q5_K_S            |         |
+| Q4_K_S            |         |
+| Q3_K_S            |         |
 | Q6_K              |         |
 | IQ4_XS            |         |
 | IQ2_XS            |         |
@@ -116,9 +118,9 @@ deepseek2.leading_dense_block_count=int:1
 deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707
 ```
 
-
+Quants with "Updated" metadata contain these parameters, so as long as you're running a supported build of llama.cpp, no `--override-kv` parameters are required.
 
-A precompiled AVX2 version is avaliable at `llama.cpp-039896407afd40e54321d47c5063c46a52da3e01.zip` in the root of this repo.
+A precompiled Windows AVX2 version is available at `llama.cpp-039896407afd40e54321d47c5063c46a52da3e01.zip` in the root of this repo.
 
 # License:
 - DeepSeek license for model weights, which can be found in the `LICENSE` file in the root of this repo
@@ -128,7 +130,7 @@ A precompiled AVX2 version is avaliable at `llama.cpp-039896407afd40e54321d47c50
 *~1.5t/s* with Ryzen 3 3700x (96gb 3200mhz) `[Q2_K]`
 
 # iMatrix:
-Find `imatrix.dat` in the root of this repo, made with a `Q2_K` quant (see here for info: [https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693](https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693))
+Find `imatrix.dat` in the root of this repo, made with a `Q2_K` quant containing 62 chunks (see here for info: [https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693](https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693))
 
 Using `groups_merged.txt`, find it here: [https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
 
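The hunk headers above repeat the README's note that iMatrix (weighted) quants should only be used when the model can be fully offloaded to GPU. As a minimal sketch of what full offload looks like with the llama.cpp `main` binary (the model filename is hypothetical; `-ngl` sets the number of layers offloaded to GPU, and an oversized value offloads all of them):

```
# Hypothetical filename; -ngl 99 requests more layers than the model has,
# which llama.cpp clamps to "offload everything", per the note above.
./main -m DeepSeek-V2-Chat.IQ3_XS.gguf -ngl 99 -c 4096 -p "Hello"
```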
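For quants whose KV Metadata column still reads "Old", those parameters have to be supplied at load time instead. A minimal sketch using the two keys visible in this diff (the filename is hypothetical, and the authoritative key list is the README's metadata block):

```
# Needed only for "Old"-metadata quants; "Updated" quants embed these values.
./main -m DeepSeek-V2-Chat.Q4_K_M.gguf \
  --override-kv deepseek2.leading_dense_block_count=int:1 \
  --override-kv deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707 \
  -p "Hello"
```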
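An `imatrix.dat` like the one described above is produced by llama.cpp's `imatrix` tool and then consumed by `quantize`; a rough sketch with hypothetical filenames, using the `groups_merged.txt` calibration file linked above:

```
# Compute an importance matrix over the calibration text
# (62 chunks in this repo's case), here against the Q2_K quant.
./imatrix -m DeepSeek-V2-Chat.Q2_K.gguf -f groups_merged.txt -o imatrix.dat

# Feed it to quantize to produce a weighted quant ("Yes" in the Weighted column).
./quantize --imatrix imatrix.dat DeepSeek-V2-Chat.BF16.gguf DeepSeek-V2-Chat.IQ3_XS.gguf IQ3_XS
```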