README formatting
README.md
CHANGED
```diff
@@ -4,7 +4,7 @@ license: unknown
 
 [ehartford/WizardLM-7B-Uncensored](https://huggingface.co/ehartford/WizardLM-7B-Uncensored) quantized to **8bit GPTQ** with act order + true sequential, no group size.
 
-*For most uses this probably isn't what you want.*
+*For most uses this probably isn't what you want.* \
 *For 4bit with no act order or compatibility with `old-cuda` (text-generation-webui default) see [TheBloke/WizardLM-7B-uncensored-GPTQ](https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GPTQ)*
 
 Quantized using AutoGPTQ with the following config:
@@ -17,11 +17,12 @@ config: dict = dict(
 See `quantize.py` for the full script.
 
 Tested for compatibility with:
-WSL with GPTQ-for-Llama `triton` branch.
-Windows with AutoGPTQ on `cuda` (triton deselected)
+- WSL with GPTQ-for-Llama `triton` branch.
+- Windows with AutoGPTQ on `cuda` (triton deselected)
 
-AutoGPTQ loader should read configuration from `quantize_config.json
+AutoGPTQ loader should read configuration from `quantize_config.json`.\
 For GPTQ-for-Llama use the following configuration when loading:\
 wbits: 8\
 groupsize: None\
-model_type: llama
+model_type: llama
+
```
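The config block itself (the `config: dict = dict(` that the second hunk header points into) sits in the elided lines 11–16, so it is not visible in this diff. As a rough sketch only: an AutoGPTQ quantization call matching the settings the README states (8-bit, act order, true sequential, no group size) could look like the following; the calibration example and output directory are placeholders, not taken from the repo's actual `quantize.py`.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "ehartford/WizardLM-7B-Uncensored"

# Settings the README states: 8-bit, act order (desc_act),
# true sequential, no group size (group_size=-1 disables grouping).
quantize_config = BaseQuantizeConfig(
    bits=8,
    group_size=-1,
    desc_act=True,
    true_sequential=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# Placeholder calibration data; the repo's quantize.py presumably
# uses its own examples, which this diff does not show.
examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library.",
        return_tensors="pt",
    )
]

model.quantize(examples)
model.save_quantized("wizardlm-7b-uncensored-8bit-gptq", use_safetensors=True)
```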
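On the loading side, the README's note that the AutoGPTQ loader should read configuration from `quantize_config.json` means no quantization parameters need to be passed by hand. A minimal sketch, assuming a local download path and a single CUDA device (neither is specified in the README):

```python
from auto_gptq import AutoGPTQForCausalLM

# bits=8, group_size=-1, and desc_act=True are picked up from the
# model directory's quantize_config.json, so nothing is respecified.
model = AutoGPTQForCausalLM.from_quantized(
    "wizardlm-7b-uncensored-8bit-gptq",  # assumed local path or hub ID
    device="cuda:0",
    use_triton=False,  # the "AutoGPTQ on cuda (triton deselected)" case above
)
```

For GPTQ-for-Llama, the equivalent manual settings are the ones the README lists: wbits: 8, groupsize: None, model_type: llama.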