---
license: gemma
language:
- en
base_model: IlyaGusev/gemma-2-2b-it-abliterated
---
# gemma-2-2b-it-abliterated-exl2
Model: [gemma-2-2b-it-abliterated](https://huggingface.co/IlyaGusev/gemma-2-2b-it-abliterated)

Made by: [IlyaGusev](https://huggingface.co/IlyaGusev)
## Quants
[4bpw h6 (main)](https://huggingface.co/cgus/gemma-2-2b-it-abliterated-exl2/tree/main)
[4.5bpw h6](https://huggingface.co/cgus/gemma-2-2b-it-abliterated-exl2/tree/4.5bpw-h6)
[5bpw h6](https://huggingface.co/cgus/gemma-2-2b-it-abliterated-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/gemma-2-2b-it-abliterated-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/gemma-2-2b-it-abliterated-exl2/tree/8bpw-h8)
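Each quant lives on its own branch, so a specific version can be fetched by revision name. Below is a minimal sketch using the `huggingface_hub` Python library; the chosen branch and local folder are just examples to adjust.

```python
# Example: download the 6bpw-h6 quant branch to a local folder.
# Any branch name from the list above works as the revision.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="cgus/gemma-2-2b-it-abliterated-exl2",
    revision="6bpw-h6",                               # quant branch to fetch
    local_dir="gemma-2-2b-it-abliterated-exl2-6bpw",  # example output folder
)
```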
## Quantization notes
Made with Exllamav2 0.1.8 using the default calibration dataset.
I hoped this model could fit 2GB GPUs such as my old Quadro T400, which is roughly what an RTX 2030 would be if it existed.
But even the 4bpw version is bigger than 2GB, which I didn't expect since some 2-3B models have smaller model files.
It doesn't make much sense to make exl2 quants for such small models, but I needed a small but capable model for making prompts for Flux, since my system can't load a big model and Flux at the same time.
I couldn't load this model with Exllamav2 0.1.7, so using 0.1.8 or newer might be necessary.
## How to use
This model version can be loaded with apps that have an Exllamav2 loader: Text-Generation-WebUI, TabbyAPI, possibly KoboldAI, etc.
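It can also be loaded directly with the exllamav2 Python library (0.1.8 or newer, as noted above). The sketch below makes a few assumptions: the model path is a placeholder for a locally downloaded quant, `paged=False` keeps the generator in the simpler mode that doesn't need flash-attn, and the prompt uses the standard Gemma 2 instruct turn format.

```python
# Minimal exllamav2 (0.1.8+) example; model_dir is a placeholder local path.
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

model_dir = "gemma-2-2b-it-abliterated-exl2-6bpw"  # folder with the downloaded quant

config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)            # load weights, splitting across GPUs if needed
tokenizer = ExLlamaV2Tokenizer(config)

# paged=False runs non-paged mode, so flash-attn is not required.
generator = ExLlamaV2DynamicGenerator(
    model=model,
    cache=cache,
    tokenizer=tokenizer,
    paged=False,
)

# Gemma 2 instruct turn format.
prompt = (
    "<start_of_turn>user\n"
    "Write a short, detailed image prompt for Flux: a rainy street at night.<end_of_turn>\n"
    "<start_of_turn>model\n"
)

output = generator.generate(
    prompt=prompt,
    max_new_tokens=200,
    add_bos=True,
    encode_special_tokens=True,  # treat <start_of_turn>/<end_of_turn> as special tokens
)
print(output)
```

The same downloaded folder can instead be pointed to from the apps listed above if you'd rather not use the library directly.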
# Original model card
# Abliterated Gemma 2 2B
[Abliterated](https://huggingface.co/blog/mlabonne/abliteration) version of [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it).