TheBloke committed on
Commit 4cbf7c3
1 Parent(s): 515c0db

Update README.md

Files changed (1):
  1. README.md +21 -6
README.md CHANGED
@@ -28,6 +28,20 @@ I have the following Vicuna 1.1 repositories available:
 * [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
 * [GPTQ quantized 4bit 7B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g-GGML)
 
+## How to easily download and use this model in text-generation-webui
+
+Load text-generation-webui as you normally do.
+
+1. Click the **Model tab**.
+2. Under **Download custom model or LoRA**, enter this repo name: `TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g`.
+3. Click **Download**.
+4. Wait until it says it's finished downloading.
+5. As this is a GPTQ model, fill in the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`.
+6. Now click the **Refresh** icon next to **Model** in the top left.
+7. In the **Model** drop-down, choose this model: `vicuna-13B-1.1-GPTQ-4bit-128g`.
+8. Click **Reload the Model** in the top right.
+9. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
+
 ## GIBBERISH OUTPUT
 
 If you get gibberish output, it is because you are using the `safetensors` file without updating GPTQ-for-LLaMa.
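The steps added above drive the download through the webui's built-in downloader. The same repo can also be fetched programmatically; below is a minimal sketch using `huggingface_hub` (the `snapshot_download` function is real, but the `local_dir` path is an assumption about a typical text-generation-webui layout):

```python
# Minimal sketch: fetch all files in this repo from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; the local_dir path is hypothetical
# and should match wherever your text-generation-webui looks for models.
from huggingface_hub import snapshot_download

path = snapshot_download(
    repo_id="TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g",
    local_dir="text-generation-webui/models/vicuna-13B-1.1-GPTQ-4bit-128g",
)
print(f"Model downloaded to: {path}")
```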
@@ -43,17 +57,18 @@ Either way, please read the instructions below carefully.
 Two model files are provided. Ideally use the `safetensors` file. Full details below:
 
 Details of the files provided:
-* `vicuna-13B-1.1-GPTQ-4bit-128g.safetensors`
-  * `safetensors` format, with improved file security, created with the latest [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) code.
-  * Command to create:
-    * `python3 llama.py vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.safetensors`
-* `vicuna-13B-1.1-GPTQ-4bit-128g.no-act-order.pt`
+* `vicuna-13B-1.1-GPTQ-4bit-128g.compat.no-act-order.pt`
   * `pt` format file, created without the `--act-order` flag.
   * This file may have slightly lower quality, but is included as it can be used without needing to compile the latest GPTQ-for-LLaMa code.
-  * It should hopefully therefore work with one-click-installers on Windows, which include the older GPTQ-for-LLaMa code.
+  * It will therefore work with one-click-installers on Windows, which include the older GPTQ-for-LLaMa code.
   * Command to create:
     * `python3 llama.py vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.no-act-order.pt`
 
+* `vicuna-13B-1.1-GPTQ-4bit-128g.latest.safetensors`
+  * `safetensors` format, with improved file security, created with the latest [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) code.
+  * Command to create:
+    * `python3 llama.py vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.safetensors`
+
 ## How to run in `text-generation-webui`
 
 File `vicuna-13B-1.1-GPTQ-4bit-128g.no-act-order.pt` can be loaded the same as any other GPTQ file, without requiring any updates to [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
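The `--wbits 4 --groupsize 128` flags in the quantization commands above mean each weight is stored in 4 bits, with one floating-point scale and zero point shared per group of 128 weights. The toy sketch below illustrates only that storage scheme; it is not the GPTQ algorithm itself, which additionally minimizes layer output error (and, with `--act-order`, quantizes columns in activation order):

```python
# Toy illustration of 4-bit group quantization (wbits=4, groupsize=128).
# NOT the GPTQ algorithm: GPTQ also corrects quantization error against
# calibration data (the `c4` argument above). This only shows what
# "4 bits per weight, one scale per 128 weights" means.
import torch

def quantize_group(w: torch.Tensor, wbits: int = 4):
    """Asymmetric round-to-nearest quantization of one group of weights."""
    qmax = 2**wbits - 1                      # 15 integer levels for 4 bits
    scale = (w.max() - w.min()) / qmax       # one fp scale per group
    zero = w.min()                           # one fp zero point per group
    q = torch.clamp(torch.round((w - zero) / scale), 0, qmax)
    return q.to(torch.uint8), scale, zero

weights = torch.randn(4096)                  # one row of a toy weight matrix
groupsize = 128
dequant = torch.empty_like(weights)
for i in range(0, weights.numel(), groupsize):
    g = weights[i:i + groupsize]
    q, scale, zero = quantize_group(g)
    dequant[i:i + groupsize] = q.float() * scale + zero   # reconstruction

print("max abs reconstruction error:", (weights - dequant).abs().max().item())
```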
 
 
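A note on the "improved file security" of the `safetensors` file described above: unlike a pickled `.pt` file, a `.safetensors` file can be opened without executing any embedded code. A small sketch using the real `safetensors` API (you would expect GPTQ-packed tensors such as `qweight` and `scales` among the names, but that expectation is an assumption, not verified against this exact file):

```python
# Sketch: safely peek inside the safetensors checkpoint. Opening it never
# runs pickled code, which is the security benefit over the .pt file.
from safetensors import safe_open

path = "vicuna-13B-1.1-GPTQ-4bit-128g.latest.safetensors"
with safe_open(path, framework="pt") as f:
    for name in list(f.keys())[:10]:   # first few entries only
        t = f.get_tensor(name)
        print(name, tuple(t.shape), t.dtype)
```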