Update README.md
## How to run in `text-generation-webui`

The `safetensors` model file was created with the GPTQ-for-LLaMa code as of April 13th, and uses `--act-order` to give the maximum possible quantisation quality. This means it requires that this same version of GPTQ-for-LLaMa is used inside the UI.
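If you want to sanity-check the `safetensors` file itself, the format is simple enough to read without any ML libraries: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below builds a tiny one-tensor file by hand and reads its header back; it is an illustration of the format only, not code from this repo.

```python
# Illustration of the safetensors layout: 8-byte LE header length,
# JSON header, then raw tensor data. File name here is arbitrary.
import json
import struct

header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")

with open("tiny.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(header_bytes)))  # header size
    f.write(header_bytes)                          # JSON header
    f.write(struct.pack("<2f", 1.0, 2.0))          # tensor data: two float32

# Read the header back with only the standard library:
with open("tiny.safetensors", "rb") as f:
    (size,) = struct.unpack("<Q", f.read(8))
    meta = json.loads(f.read(size))

print(meta["weight"]["shape"])
```

The same read loop works on a real model file, which is a quick way to list tensor names and shapes before loading it into the UI.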
Here are the commands I used to clone the Triton branch of GPTQ-for-LLaMa, clone text-generation-webui, and install GPTQ into the UI:
```
# Since April 14th we can't clone the latest GPTQ-for-LLaMa as it's in the middle of a refactoring
git clone -n https://github.com/qwopqwop200/GPTQ-for-LLaMa gptq-working
cd gptq-working && git checkout 58c8ab4c7aaccc50f507fd08cce941976affe5e0 # Later commits are currently broken due to ongoing refactoring
cd ..

git clone https://github.com/oobabooga/text-generation-webui
mkdir -p text-generation-webui/repositories
ln -s "$(pwd)/gptq-working" text-generation-webui/repositories/GPTQ-for-LLaMa # absolute path, so the symlink resolves from inside repositories/
```
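Note that a symlink target is resolved relative to the directory containing the link, so the link created under `repositories/` needs an absolute target. The self-contained sketch below (hypothetical temp-dir paths, not part of either repo) recreates the layout and shows the link resolving:

```python
# Recreate the directory layout from the commands above in a temp dir:
# repositories/GPTQ-for-LLaMa -> <absolute path>/gptq-working
import os
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "gptq-working"))
os.makedirs(os.path.join(root, "text-generation-webui", "repositories"))

link = os.path.join(root, "text-generation-webui", "repositories", "GPTQ-for-LLaMa")
os.symlink(os.path.join(root, "gptq-working"), link)  # absolute target

# True only if the link points at a directory that actually exists;
# a bare relative target like "gptq-working" would dangle here.
print(os.path.isdir(link))
```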
Then install this model into `text-generation-webui/models` and launch the UI as follows: `python server.py --model gpt4-alpaca-lora-30B-GPTQ-4bit-128g --wbits 4 --groupsize 128`
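The `--wbits` and `--groupsize` values passed to `server.py` must match the quantisation parameters encoded in the model name (`4bit`, `128g`). As a hypothetical illustration (this helper is not part of text-generation-webui), the flags can be derived from the name:

```python
# Hypothetical helper: parse GPTQ quantisation parameters out of a model
# directory name like "...-GPTQ-4bit-128g" and build the launch command.
import re

def launch_command(model_name: str) -> str:
    wbits = re.search(r"(\d+)bit", model_name).group(1)       # e.g. "4"
    groupsize = re.search(r"(\d+)g\b", model_name).group(1)   # e.g. "128"
    return (f"python server.py --model {model_name} "
            f"--wbits {wbits} --groupsize {groupsize}")

print(launch_command("gpt4-alpaca-lora-30B-GPTQ-4bit-128g"))
```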
The above commands assume you have installed all dependencies for GPTQ-for-LLaMa and text-generation-webui. Please see their respective repositories for further information.
If you are on Windows, or cannot use the Triton branch of GPTQ for any other reason, you can instead try the CUDA branch:
```
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa -b cuda
cd GPTQ-for-LLaMa