Update README.md
README.md CHANGED

````diff
@@ -45,16 +45,13 @@ Now that we have ExLlama, that is the recommended loader to use for these models
 
 Reminder: ExLlama does not support 3-bit models, so if you wish to try those quants, you will need to use AutoGPTQ or GPTQ-for-LLaMa.
 
-## AutoGPTQ and GPTQ-for-LLaMa
+## AutoGPTQ and GPTQ-for-LLaMa compatibility
 
-
-
-If you're using text-generation-webui and have updated to the latest version, this is done for you automatically.
-
-If not, you can update it manually with:
+Please update AutoGPTQ to version 0.3.1 or later. This will also update Transformers to 4.31.0, which is required for Llama 70B compatibility.
 
+If you're using GPTQ-for-LLaMa, please update Transformers manually with:
 ```
-pip3 install
+pip3 install "transformers>=4.31.0"
 ```
 
 ## Repositories available
````
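The diff pins Transformers at 4.31.0 or later via a pip requirement specifier. As a minimal sketch of what that `>=` constraint means for a dotted version string (the `meets_requirement` helper is hypothetical, not part of Transformers or pip; real tools use full PEP 440 parsing):

```python
# Sketch: check whether an installed version satisfies the ">=4.31.0"
# requirement from the diff. Dotted versions are compared numerically,
# component by component, via tuple comparison.

def meets_requirement(version: str, minimum: str = "4.31.0") -> bool:
    """Return True if `version` is at least `minimum` (numeric compare)."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(version) >= parse(minimum)

print(meets_requirement("4.31.0"))  # True
print(meets_requirement("4.30.2"))  # False: 30 < 31 in the second component
```

Note that plain string comparison would get this wrong ("4.9.0" > "4.31.0" lexically), which is why the components are converted to integers first.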