README formatting
README.md
CHANGED
```diff
@@ -4,7 +4,7 @@ license: unknown
 
 [ehartford/WizardLM-7B-Uncensored](https://huggingface.co/ehartford/WizardLM-7B-Uncensored) quantized to **8bit GPTQ** with act order + true sequential, no group size.
 
-*For most uses this probably isn't what you want.*
+*For most uses this probably isn't what you want.* \
 *For 4bit with no act order or compatibility with `old-cuda` (text-generation-webui default) see [TheBloke/WizardLM-7B-uncensored-GPTQ](https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GPTQ)*
 
 Quantized using AutoGPTQ with the following config:
@@ -17,11 +17,12 @@ config: dict = dict(
 See `quantize.py` for the full script.
 
 Tested for compatibility with:
-WSL with GPTQ-for-Llama `triton` branch.
-Windows with AutoGPTQ on `cuda` (triton deselected)
+- WSL with GPTQ-for-Llama `triton` branch.
+- Windows with AutoGPTQ on `cuda` (triton deselected)
 
-AutoGPTQ loader should read configuration from `quantize_config.json
+AutoGPTQ loader should read configuration from `quantize_config.json`.\
 For GPTQ-for-Llama use the following configuration when loading:\
 wbits: 8\
 groupsize: None\
-model_type: llama
+model_type: llama
+
```
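The config block itself (the `config: dict = dict(` that the second hunk header points into) sits in the elided lines 11–16, so it is not visible in this diff. As a rough sketch only: an AutoGPTQ quantization call matching the settings the README states (8-bit, act order, true sequential, no group size) could look like the following; the calibration example and output directory are placeholders, not taken from the repo's actual `quantize.py`.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "ehartford/WizardLM-7B-Uncensored"

# Settings the README states: 8-bit, act order (desc_act),
# true sequential, no group size (group_size=-1 disables grouping).
quantize_config = BaseQuantizeConfig(
    bits=8,
    group_size=-1,
    desc_act=True,
    true_sequential=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# Placeholder calibration data; the repo's quantize.py presumably
# uses its own examples, which this diff does not show.
examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library.",
        return_tensors="pt",
    )
]

model.quantize(examples)
model.save_quantized("wizardlm-7b-uncensored-8bit-gptq", use_safetensors=True)
```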
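On the loading side, the README's note that the AutoGPTQ loader should read configuration from `quantize_config.json` means no quantization parameters need to be passed by hand. A minimal sketch, assuming a local download path and a single CUDA device (neither is specified in the README):

```python
from auto_gptq import AutoGPTQForCausalLM

# bits=8, group_size=-1, and desc_act=True are picked up from the
# model directory's quantize_config.json, so nothing is respecified.
model = AutoGPTQForCausalLM.from_quantized(
    "wizardlm-7b-uncensored-8bit-gptq",  # assumed local path or hub ID
    device="cuda:0",
    use_triton=False,  # the "AutoGPTQ on cuda (triton deselected)" case above
)
```

For GPTQ-for-Llama, the equivalent manual settings are the ones the README lists: wbits: 8, groupsize: None, model_type: llama.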