|
--- |
|
license: unknown |
|
library_name: transformers |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Deepseek-V2-Chat-GGUF |
|
|
|
Quantized from [https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat](https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat) |
|
|
|
Using llama.cpp fork: [https://github.com/fairydreaming/llama.cpp/tree/deepseek-v2](https://github.com/fairydreaming/llama.cpp/tree/deepseek-v2) |
|
|
|
# Warning: This will not work unless you compile llama.cpp from the repo provided! |
|
|
|
# How to use: |
|
|
|
- Find the directory for the quant you want |
|
- Download all files in that directory |
|
- Run merge.py |
|
- The merged GGUF should then appear |
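
The contents of merge.py are not reproduced here, so the following is only a rough sketch of what a merge step could look like, assuming the downloaded files are plain byte-wise splits that need to be concatenated in name order. The function name `merge_parts`, the glob pattern, and the file names are hypothetical and not taken from this repo; if the parts use a different split format, plain concatenation will not produce a valid GGUF, so prefer the provided merge.py.

```python
from pathlib import Path
import shutil


def merge_parts(directory, pattern, output_name):
    """Concatenate all files matching `pattern` in `directory` into `output_name`.

    Assumption: the parts are raw byte-wise splits whose names sort
    lexicographically in the correct order (e.g. zero-padded suffixes).
    """
    directory = Path(directory)
    output_path = directory / output_name
    # Collect matching parts in name order, skipping the output file itself.
    parts = sorted(
        p for p in directory.glob(pattern)
        if p.is_file() and p.name != output_name
    )
    if not parts:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory}")
    with open(output_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as chunk:
                # Copy in 16 MiB blocks so large parts are never fully loaded into memory.
                shutil.copyfileobj(chunk, merged, length=16 * 1024 * 1024)
    return output_path
```

A call might look like `merge_parts("bf16", "*.part*", "merged.gguf")`, where the directory, pattern, and output name are placeholders to be replaced with the actual names of the downloaded files.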
|
|
|
# Quants: |
|
- bf16 (generating, 85% complete) |
|
- f16 (planned after q4_k_m; bf16 is recommended instead) |
|
- f32 (planned after q8_0; the upload may take some time) |
|
- q8_0 (after bf16) |
|
- q4_k_m (after q8_0) |
|
|
|
If quantize.exe supports it, I will make RTN quants. |