nisten committed
Commit 124e2e1
1 Parent(s): 2498189

Update README.md

Files changed (1)
  1. README.md +8 -5
README.md CHANGED
@@ -5,15 +5,13 @@ base_model: [deepseek-ai/DeepSeek-V2-Chat-0628]
 #### 🚀 Custom quantizations of DeepSeek-V2-Chat-0628 supercharged for CPU inference of currently the #7 model globally on lmsys arena hard! 🖥️
 
 
->[!TIP]
+
 >### 🚄 Just download this IQ4XM 131Gb version, it's the one I use myself:
->
 >🐧 On Linux: `sudo apt install -y aria2`
 >
 >🍎 On Mac: `brew install aria2`
->
 
-```bash
+```verilog
 aria2c -x 9 -o deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf \
 https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf
 
@@ -26,7 +24,12 @@ aria2c -x 9 -o deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf \
 aria2c -x 9 -o deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf \
 https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf
 ```
-
+>[!TIP]
+>//then to have a commandline conversation interface all you need is:
+```bash
+git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make -j
+./llama-cli -m ~/r/deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf -t 62 --temp 0.4 -co -cnv -i -c 3000 -p "Adopt the persona of a full-stack developer at NASA JPL."
+```
 ### 🧠 This IQ4XM version uses GGML TYPE IQ_4_XS 4bit in combination with q8_0 bit for blazing fast performance with minimal loss, leveraging int8 optimizations on most newer server CPUs.
 ### 🛠️ While it required some custom code wizardry, it's fully compatible with standard llama.cpp from GitHub or just search for nisten in lmstudio.
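
For reference, the four per-shard `aria2c` commands in the updated README collapse into a short loop. A minimal sketch, assuming only the shard naming pattern and repo path shown in the diff above:

```bash
#!/usr/bin/env bash
# Download all four IQ4XM shards (~131Gb total), 9 connections per file.
# Assumes the -0000N-of-00004 naming and repo path from the README above.
BASE=https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main
for i in 1 2 3 4; do
  f=$(printf 'deepseek_0628_cpu_optimized_iq4xm-%05d-of-00004.gguf' "$i")
  aria2c -x 9 -o "$f" "$BASE/$f"
done
```

llama.cpp recognizes this split-GGUF naming, which is why the `llama-cli` command in the diff points `-m` at the `-00001-of-00004` shard only: the remaining shards are picked up automatically from the same directory.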
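The 🧠 line describes a mixed quant: IQ4_XS for the bulk of the weights, with Q8_0 kept for select tensors. Since the author notes custom code was involved, stock tooling may not reproduce the exact mix; still, here is a hedged sketch of how a similar blend can be requested from llama.cpp's `llama-quantize` (the f16 input filename is hypothetical):

```bash
# Hedged sketch: approximate the IQ4_XS + Q8_0 blend with stock llama-quantize.
# The author's exact recipe used custom code; the f16 input name is hypothetical.
# --output-tensor-type / --token-embedding-type keep those tensors at q8_0 while
# everything else is quantized to IQ4_XS; the trailing 32 is the thread count.
./llama-quantize --output-tensor-type q8_0 --token-embedding-type q8_0 \
  deepseek-v2-chat-0628-f16.gguf deepseek_0628_cpu_optimized_iq4xm.gguf IQ4_XS 32
```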