Update README.md
README.md CHANGED
````diff
@@ -9,7 +9,7 @@ model-index:
 ---
 
 - Original model is [spow12/Ko-Qwen2-7B-Instruct](https://huggingface.co/spow12/Ko-Qwen2-7B-Instruct)
-- quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp)
+- quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp) - [3510](https://github.com/ggerganov/llama.cpp/releases/tag/b3510)
 
 ```prompt
 <|im_start|>system
@@ -19,4 +19,4 @@ model-index:
 <|im_start|>assistant
 {Assistant}
 ```
-"Flash Attention" function must be activated. [why?](https://www.reddit.com/r/LocalLLaMA/comments/1da19nu/if_your_qwen2_gguf_is_spitting_nonsense_enable/)
+~~"Flash Attention" function must be activated. [why?](https://www.reddit.com/r/LocalLLaMA/comments/1da19nu/if_your_qwen2_gguf_is_spitting_nonsense_enable/)~~
````
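The ` ```prompt ` block in the README documents a ChatML-style template; the diff elides the middle lines, so the sketch below assumes the standard Qwen2 ChatML layout (`{System}`/`{User}` turns closed with `<|im_end|>`). The helper name is hypothetical, not part of the repo:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    # Assemble a ChatML prompt matching the README's template.
    # Assumption: elided middle lines follow Qwen2's usual format,
    # i.e. each turn is closed with <|im_end|> and the prompt ends
    # with an open assistant turn for the model to complete.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "안녕하세요?")
print(prompt)
```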