TheBloke committed
Commit 93db929
Parent: 4cbf7c3

Update README.md

Files changed (1):
  1. README.md +12 -12
README.md CHANGED
@@ -69,23 +69,23 @@ Details of the files provided:
  * Command to create:
  * `python3 llama.py vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.safetensors`

- ## How to run in `text-generation-webui`

- File `vicuna-13B-1.1-GPTQ-4bit-128g.no-act-order.pt` can be loaded the same as any other GPTQ file, without requiring any updates to [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).

- The `safetensors` model file was created with the latest GPTQ code, and uses `--act-order` to give the maximum possible quantisation quality, but this means it requires that the latest GPTQ-for-LLaMa is used inside the UI.

- If you want to use the `safetensors` file and need to update GPTQ-for-LLaMa, here are the commands I used to clone the Triton branch of GPTQ-for-LLaMa, clone text-generation-webui, and install GPTQ into the UI:
- ```
- # We need to clone GPTQ-for-LLaMa as of April 13th, due to breaking changes in more recent commits
- git clone -n https://github.com/qwopqwop200/GPTQ-for-LLaMa gptq-safe
- cd gptq-safe && git checkout 58c8ab4c7aaccc50f507fd08cce941976affe5e0

- # Now clone text-generation-webui, if you don't already have it
  git clone https://github.com/oobabooga/text-generation-webui
- # And link GPTQ-for-LLaMa into text-generation-webui
- mkdir -p text-generation-webui/repositories
- ln -s gptq-safe text-generation-webui/repositories/GPTQ-for-LLaMa
  ```

  Then install this model into `text-generation-webui/models` and launch the UI as follows:
 
  * Command to create:
  * `python3 llama.py vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.safetensors`

+ ## Manual instructions for `text-generation-webui`

+ File `vicuna-13B-1.1-GPTQ-4bit-128g.compat.no-act-order.pt` can be loaded the same as any other GPTQ file, without requiring any updates to [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).

+ [Instructions on using GPTQ 4bit files in text-generation-webui are here](https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-\(4-bit-mode\)).

+ The other `safetensors` model file was created using `--act-order` to give the maximum possible quantisation quality, but this means it requires that the latest GPTQ-for-LLaMa is used inside the UI.

+ If you want to use the act-order `safetensors` files and need to update to the Triton branch of GPTQ-for-LLaMa, here are the commands I used to clone text-generation-webui and install the latest GPTQ-for-LLaMa code into the UI:
+ ```
+ # Clone text-generation-webui, if you don't already have it
  git clone https://github.com/oobabooga/text-generation-webui
+ # Make a repositories directory
+ mkdir text-generation-webui/repositories
+ cd text-generation-webui/repositories
+ # Clone the latest GPTQ-for-LLaMa code inside text-generation-webui
+ git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
  ```

  Then install this model into `text-generation-webui/models` and launch the UI as follows:
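The diff above distinguishes the compat `no-act-order.pt` file from the act-order `safetensors` file. If you are unsure which kind of `safetensors` checkpoint you have, its header can be inspected with only the Python standard library: the safetensors format is an 8-byte little-endian header length followed by that many bytes of JSON metadata. The sketch below is not part of the README, and the `g_idx` tensor-name check is an assumption about how GPTQ-for-LLaMa lays out act-order checkpoints:

```python
import json
import struct

def safetensors_tensor_names(path):
    """List tensor names from a .safetensors file header.

    Format: an 8-byte little-endian unsigned header length, then that many
    bytes of JSON mapping tensor names to dtype/shape/offset metadata.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional non-tensor entry in the header
    return [name for name in header if name != "__metadata__"]

def looks_act_order(names):
    # Assumption: checkpoints quantised with --act-order carry per-layer
    # "g_idx" tensors recording the activation-order permutation.
    return any(name.endswith(".g_idx") for name in names)
```

Under that assumption, calling `looks_act_order(safetensors_tensor_names(...))` on `vicuna-13B-1.1-GPTQ-4bit-128g.safetensors` would indicate whether the file needs the updated GPTQ-for-LLaMa code; this reads only the header, never the tensor data.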