---
base_model: VongolaChouko/Starcannon-Unleashed-12B-v1.0
library_name: transformers
tags:
- mergekit
- merge
- llama-cpp
- gguf-my-repo
license: cc-by-nc-4.0
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6720ed503a24966ac66495e8/HXc0AxPLkoIC1fy0Pb3Pb.png)

Starcannon-Unleashed-12B-v1.0-GGUF
==================================

Static quantization of [**VongolaChouko/Starcannon-Unleashed-12B-v1.0**](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0).

This model was converted to GGUF format from [VongolaChouko/Starcannon-Unleashed-12B-v1.0](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space. Refer to the [original model card](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0) for more details on the model.

**I recommend running these quants with [koboldcpp](https://github.com/LostRuins/koboldcpp). You can find its latest release here: [koboldcpp-1.76](https://github.com/LostRuins/koboldcpp/releases)**

Recommended settings are here: [**Settings**](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF#instruct)
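For example, once you have downloaded a quant from the table below, a minimal koboldcpp launch might look like the sketch that follows. This is illustrative only: the binary name varies by platform, and the context size and GPU layer count are example values, not recommendations from the model author.

```bash
# Minimal koboldcpp launch (illustrative values; adjust to your hardware).
# --contextsize sets the context window; --gpulayers offloads that many
# layers to the GPU, so raise or lower it to fit your available VRAM.
./koboldcpp --model Starcannon-Unleashed-12B-v1.0-Q6_K.gguf \
  --contextsize 8192 \
  --gpulayers 33
```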
## Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [Starcannon-Unleashed-12B-v1.0-FP16.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-FP16.gguf) | F16 | 24.50GB | false | Full F16 weights. |
| [Starcannon-Unleashed-12B-v1.0-Q8_0.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q8_0.gguf) | Q8_0 | 13.02GB | false | Extremely high quality, generally unneeded but max available quant. |
| [Starcannon-Unleashed-12B-v1.0-Q6_K.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q6_K.gguf) | Q6_K | 10.06GB | false | Very high quality, near perfect, *recommended*. |
| [Starcannon-Unleashed-12B-v1.0-Q5_K_M.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q5_K_M.gguf) | Q5_K_M | 8.73GB | false | High quality, *recommended*. |
| [Starcannon-Unleashed-12B-v1.0-Q5_K_S.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q5_K_S.gguf) | Q5_K_S | 8.52GB | false | High quality, *recommended*. |
| [Starcannon-Unleashed-12B-v1.0-Q4_K_M.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q4_K_M.gguf) | Q4_K_M | 7.48GB | false | Good quality, default size for most use cases, *recommended*. |
| [Starcannon-Unleashed-12B-v1.0-Q4_K_S.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q4_K_S.gguf) | Q4_K_S | 7.12GB | false | Slightly lower quality with more space savings, *recommended*. |
| [Starcannon-Unleashed-12B-v1.0-Q4_0.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q4_0.gguf) | Q4_0 | 7.09GB | false | Legacy format, generally not worth using over similarly sized formats. |
| [Starcannon-Unleashed-12B-v1.0-Q3_K_L.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q3_K_L.gguf) | Q3_K_L | 6.56GB | false | Lower quality but usable, good for low RAM availability. |
| [Starcannon-Unleashed-12B-v1.0-Q3_K_M.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q3_K_M.gguf) | Q3_K_M | 6.08GB | false | Low quality. |
| [Starcannon-Unleashed-12B-v1.0-Q3_K_S.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q3_K_S.gguf) | Q3_K_S | 5.53GB | false | Low quality, not recommended. |
| [Starcannon-Unleashed-12B-v1.0-Q2_K.gguf](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF/blob/main/Starcannon-Unleashed-12B-v1.0-Q2_K.gguf) | Q2_K | 4.79GB | false | Very low quality but surprisingly usable. |
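If you want a single file rather than the whole branch, one option is the `huggingface-cli` tool from the `huggingface_hub` package. A sketch, using the Q4_K_M quant as the example file:

```bash
# Install the Hugging Face Hub CLI if you don't already have it
pip install -U "huggingface_hub[cli]"

# Download only the Q4_K_M quant into the current directory
huggingface-cli download VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF \
  Starcannon-Unleashed-12B-v1.0-Q4_K_M.gguf \
  --local-dir ./
```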
## Instruct

Both ChatML and Mistral formats should work fine. Personally, I tested this using ChatML and found that I like the model's responses better with that format. Try both and see which one you prefer. :D

## Settings

I recommend using these settings: [Starcannon-Unleashed-12B-v1.0-ST-Formatting-2024-10-29.json](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0/blob/main/Starcannon-Unleashed-12B-v1.0-ST-Formatting-2024-10-29.json)

**IMPORTANT: Open SillyTavern and use "Master Import", which can be found under the "A" (Advanced Formatting) tab. Replace the "INSERT WORLD HERE" placeholders with the world/universe your character belongs to. If not applicable, just remove that part.**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6720ed503a24966ac66495e8/hAr6qvG3iWKKXOUP9Sy07.png)

A temperature of 1.15 - 1.25 is good, but lower should also work well, as long as you also tweak Min P and XTC so the model doesn't choke. Play around with it to see what suits your taste.

This is a modified version of MarinaraSpaghetti's Mistral-Small-Correct.json, transformed into ChatML. You can find the original version here: [MarinaraSpaghetti/SillyTavern-Settings](https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main/Customized)

## To use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```

Invoke the llama.cpp server or the CLI.

### CLI:

```bash
llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF --hf-file Starcannon-Unleashed-12B-v1.0-Q6_K.gguf -p "The meaning to life and the universe is"
```

### Server:

```bash
llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF --hf-file Starcannon-Unleashed-12B-v1.0-Q6_K.gguf -c 2048
```

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

```
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).

```
cd llama.cpp && LLAMA_CURL=1 make
```

Step 3: Run inference through the main binary.

```
./llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF --hf-file Starcannon-Unleashed-12B-v1.0-Q6_K.gguf -p "The meaning to life and the universe is"
```

or

```
./llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF --hf-file Starcannon-Unleashed-12B-v1.0-Q6_K.gguf -c 2048
```
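Once `llama-server` is running, you can query it over HTTP. Below is a minimal sketch against its OpenAI-compatible chat endpoint, assuming the default port 8080; note that `min_p` is a llama.cpp-specific extension of the request schema, and the temperature mirrors the range suggested in the Settings section above.

```bash
# Send a chat completion request to the running llama-server instance.
# temperature follows the 1.15 - 1.25 range suggested above; min_p is a
# llama.cpp extension, not part of the standard OpenAI request schema.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Write a short opening scene."}],
    "temperature": 1.2,
    "min_p": 0.1
  }'
```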