mlabonne committed
Commit cac9082
1 Parent(s): 815b9f1

Update how to load the model with model_basename

Files changed (1): README.md (+10 -5)
README.md CHANGED
@@ -20,13 +20,18 @@ pip install auto-gptq
 You can then download the model from the hub using the following code:
 
 ```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
 from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
-from transformers import AutoTokenizer
 
-model_id = "mlabonne/gpt2-GPTQ-4bit"
-quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
-model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
-tokenizer = AutoTokenizer.from_pretrained(model_id)
+model_name = "mlabonne/gpt2-GPTQ-4bit"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+quantize_config = BaseQuantizeConfig.from_pretrained(model_name)
+model = AutoGPTQForCausalLM.from_quantized(model_name,
+    model_basename="gptq_model-4bit-128g",
+    device="cuda:0",
+    use_triton=True,
+    use_safetensors=True,
+    quantize_config=quantize_config)
 ```
 
 This model works with the traditional [Text Generation pipeline](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.TextGenerationPipeline).
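As a quick illustration of the pipeline claim above, here is a minimal usage sketch (not part of this commit): it loads the model exactly as in the updated README snippet and runs it through the transformers text-generation pipeline. The prompt and `max_new_tokens` value are illustrative assumptions, and passing an auto-gptq wrapper directly to `pipeline` can vary in behavior across library versions:

```python
# Minimal usage sketch (not part of this commit): load the model as in the
# updated README snippet, then run it through the transformers
# text-generation pipeline. Prompt and max_new_tokens are illustrative.
from transformers import AutoTokenizer, pipeline
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name = "mlabonne/gpt2-GPTQ-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
quantize_config = BaseQuantizeConfig.from_pretrained(model_name)
model = AutoGPTQForCausalLM.from_quantized(model_name,
    model_basename="gptq_model-4bit-128g",
    device="cuda:0",
    use_triton=True,
    use_safetensors=True,
    quantize_config=quantize_config)

# Build the pipeline from the already-loaded objects and generate text.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("Hello, my name is", max_new_tokens=20)[0]["generated_text"])
```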