---
base_model: [deepseek-ai/DeepSeek-V2-Chat-0628]
---
|
|
|
#### Custom quantizations of DeepSeek-V2-Chat-0628, currently the #7 model globally on LMSYS Arena Hard, supercharged for CPU inference!
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6379683a81c1783a4a2ddba8/rbdug3j6BaeTSmKLDIp39.png) |
|
|
|
### The IQ4XM version mixes GGML-type IQ4_XS 4-bit quantization with Q8_0 for blazing-fast performance with minimal quality loss, leveraging the int8 optimizations available on most newer server CPUs.
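
To make the idea concrete: this is not the exact recipe used for these files (that needed custom code), just a rough, hypothetical sketch of mixing tensor types with the stock llama-quantize tool from a recent llama.cpp build; the filenames below are stand-ins, not files from this repo.

```bash
# Rough illustration only, NOT the exact recipe behind this repo:
# quantize the bulk of the model to IQ4_XS while keeping token embeddings
# and the output tensor in q8_0 (filenames here are hypothetical stand-ins)
./llama-quantize \
  --token-embedding-type q8_0 \
  --output-tensor-type q8_0 \
  deepseek-0628-bf16.gguf deepseek-0628-iq4xs-q8-mix.gguf IQ4_XS
```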
|
### While it required some custom code wizardry, it's fully compatible with standard llama.cpp from GitHub; alternatively, just search for "nisten" in LM Studio.
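
If you'd rather build llama.cpp yourself than grab it through LM Studio, here is a minimal build sketch; it assumes a recent checkout with git, CMake, and a C/C++ toolchain available, and the resulting binaries land in `build/bin`.

```bash
# Build llama.cpp from source (recent checkouts use CMake)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
# llama-cli should now be at ./build/bin/llama-cli
```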
|
|
|
>[!TIP]
>The following 4-bit version is my personal go-to, delivering jaw-dropping performance on ARM cores.
>
>No need for file concatenation: just point llama-cli at the first file and watch the magic happen!
>
>Ready to dive in? Here's your command-line spell for interactive mode (prompt.txt is optional, but recommended for maximum sorcery):
>```bash
>./llama-cli --temp 0.4 -m deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf -c 32000 -co -cnv -i -f prompt.txt
>```
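
For reference, `-f prompt.txt` just preloads that file's contents as the initial prompt; the example below is purely hypothetical, and any plain-text system prompt works.

```bash
# Create a hypothetical prompt.txt; replace the text with whatever persona you like
cat > prompt.txt << 'EOF'
You are a helpful, concise assistant. Think step by step on hard questions.
EOF
```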
|
|
|
The IQ4XM quant is split into these four GGUF files:

```
deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf
deepseek_0628_cpu_optimized_iq4xm-00002-of-00004.gguf
deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf
deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf
```
|
|
|
>[!TIP]
>### Want to download faster than a caffeinated, thirsty llama? Here's how:
>
>On Linux: `sudo apt install -y aria2`
>
>On Mac: `brew install aria2`
|
|
|
|
```bash
# For the turbocharged 4-bit IQ4XM version
aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf

aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00002-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00002-of-00004.gguf

aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf

aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf
```
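
Prefer a loop over four copy-pastes? This small sketch assumes the exact four-part IQ4XM naming shown above; the same pattern works for the Q8_0 and BF16 splits below.

```bash
# Convenience loop; assumes the 4-part IQ4XM file naming used in this repo
for i in 1 2 3 4; do
  n=$(printf "%05d" "$i")
  f="deepseek_0628_cpu_optimized_iq4xm-${n}-of-00004.gguf"
  aria2c -x 8 -o "$f" "https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/$f"
done
```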
|
```bash
# For the nearly lossless Q8_0 version
aria2c -x 8 -o deepseek-0628-q8_0-00001-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00001-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00002-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00002-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00003-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00003-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00004-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00004-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00005-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00005-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00006-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00006-of-00006.gguf
```
|
```bash
# For the full-brain BF16 version
aria2c -x 8 -o deepseek-0628-bf16-00001-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00001-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00002-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00002-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00003-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00003-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00004-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00004-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00005-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00005-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00006-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00006-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00007-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00007-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00008-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00008-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00009-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00009-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00010-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00010-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00011-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00011-of-00011.gguf
```
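
As an alternative to aria2, the Hugging Face CLI can pull a whole quant in one command; this sketch assumes you have `huggingface_hub` with the CLI extra installed.

```bash
# Alternative: grab every IQ4XM shard in one go with the Hugging Face CLI
pip install -U "huggingface_hub[cli]"
huggingface-cli download nisten/deepseek-0628-gguf \
  --include "deepseek_0628_cpu_optimized_iq4xm-*.gguf" \
  --local-dir .
```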
|
|
|
Use of the DeepSeek-V2-Chat-0628 model is subject to the [DeepSeek Model License](https://github.com/deepseek-ai/DeepSeek-V2/blob/main/LICENSE-MODEL). The DeepSeek-V2 series supports commercial use; the license is permissive and only restricts use for military purposes, harming minors, or patent trolling.
|
|
|
### Model Information
|
|
|
DeepSeek-V2-Chat-0628 is the latest and greatest in the DeepSeek family. This AI powerhouse has climbed the LMSYS Chatbot Arena Leaderboard faster than a rocket on steroids: |
|
|
|
- Overall Arena Ranking: #11 global

- Coding Arena Ranking: #3 global

- Hard Prompts Arena Ranking: #7 global, better than Claude Opus even on English-only hard prompts
|
|
|
Want to dive deeper into this model's ocean of awesomeness? Swim over to the [original model card](https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628) and prepare to have your mind blown!
|
|
|
Now go forth and accelerate!
|
|
|
-Nisten |
|
|