Update README.md
README.md CHANGED
@@ -23,6 +23,10 @@ base_model: deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
 **Original model**: [DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct)<br>
 **GGUF quantization:** provided by [bartowski](https://huggingface.co/bartowski) based on `llama.cpp` release [b3166](https://github.com/ggerganov/llama.cpp/releases/tag/b3166)<br>
 
+## Model Settings:
+
+Flash attention MUST be **disabled** for this model to work.
+
 ## Model Summary:
 
 This is a brand new Mixture of Experts (MoE) model from DeepSeek, specializing in coding instructions.<br>
@@ -42,6 +46,7 @@ This will format the prompt as follows:
 User: {user_message}
 
 Assistant: {assistant_message}
+```
 
 ## Technical Details
 
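The second hunk closes the README's prompt-format block, which lays turns out as plain `User:` / `Assistant:` text. A minimal sketch of filling that template in (the example message is invented; an actual runtime may build the prompt from the model's chat template instead):

```python
# Fill in the User/Assistant prompt layout shown in the README.
user_message = "Write a function that reverses a string."
prompt = f"User: {user_message}\n\nAssistant: "
print(prompt)
```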
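The first hunk's new `## Model Settings:` note says flash attention has to stay off. A minimal sketch of respecting that when loading the GGUF, assuming llama-cpp-python (whose `Llama` constructor exposes a `flash_attn` flag); the file name and context size below are hypothetical:

```python
from llama_cpp import Llama

# Load the quantized model with flash attention explicitly disabled,
# per the Model Settings note added in this commit.
llm = Llama(
    model_path="DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf",  # hypothetical local file name
    n_ctx=4096,                                                # hypothetical context size
    flash_attn=False,                                          # flash attention must be disabled
)

# Query it with the prompt layout documented in the README.
out = llm("User: Write a hello world program in C.\n\nAssistant: ", max_tokens=128)
print(out["choices"][0]["text"])
```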