Update README.md
README.md CHANGED

@@ -21,6 +21,7 @@ An experimental Twitter (**X**) bot is available at [https://twitter.com/ZenosBo
 The latest development version of Transformers, which includes serialization of 4 bits models.

 - [Transformers](https://huggingface.co/docs/transformers/installation#install-from-source)
+- Bitsandbytes >= 0.41.3

 Since this is a compressed version (4 bits), it can fit into ~7GB of VRAM.

@@ -33,7 +34,7 @@ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

 # Load the tokenizer and model
 tokenizer = AutoTokenizer.from_pretrained("webpolis/zenos-gpt-j-6B-instruct-4bit")
-model = AutoModelForCausalLM.from_pretrained("webpolis/zenos-gpt-j-6B-instruct-4bit")
+model = AutoModelForCausalLM.from_pretrained("webpolis/zenos-gpt-j-6B-instruct-4bit", use_safetensors=True)

 user_msg = '''Escribe un poema breve utilizando los siguientes conceptos:
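For reference, a minimal sketch of how the updated loading line might fit into a complete generation call is shown below. Only the model id, the imports, and the `use_safetensors=True` argument come from the diff above; the prompt continuation, generation settings, and device handling are assumptions for illustration and may differ from the model card's actual instructions.

```python
# Minimal usage sketch. Assumptions (not taken from the model card): the
# prompt continuation, generation settings, and device handling are
# illustrative only.
#
# Prerequisites named in the diff would be installed roughly like:
#   pip install git+https://github.com/huggingface/transformers
#   pip install "bitsandbytes>=0.41.3"
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_id = "webpolis/zenos-gpt-j-6B-instruct-4bit"

# Load the tokenizer and the 4-bit model serialized as safetensors
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, use_safetensors=True)

# Instruction in Spanish: "Write a short poem using the following concepts:"
# The concepts below are placeholders; the original snippet is truncated here.
user_msg = '''Escribe un poema breve utilizando los siguientes conceptos:
mar, libertad'''

inputs = tokenizer(user_msg, return_tensors="pt").to(model.device)
gen_config = GenerationConfig(max_new_tokens=128, do_sample=True, temperature=0.7)
outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```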