webpolis committed on
Commit
353e28a
1 Parent(s): aa29592

Update README.md

  1. README.md +12 -1
README.md CHANGED
@@ -83,7 +83,18 @@ Currently, the HuggingFace's Inference Tool UI doesn't properly load the model.
 
 ## CPU
 
-Check-out [webpolis/zenos-gpt-j-6B-instruct-cpu](https://huggingface.co/webpolis/zenos-gpt-j-6B-instruct-cpu)
+Best performance can be achieved by downloading the [GGML 4-bit](https://huggingface.co/webpolis/zenos-gpt-j-6B-instruct-4bit/resolve/main/ggml-f16-q4_0.bin) model and running inference with the [rustformers' llm](https://github.com/rustformers/llm) tool.
+
+On my Core i7 laptop it runs at around 255 ms per token:
+
+![](https://huggingface.co/webpolis/zenos-gpt-j-6B-instruct-4bit/resolve/main/poema1.gif)
+
+### Requirements
+
+For optimal performance:
+
+- 4 CPU cores
+- 8GB RAM
 
 # Acknowledgments
 
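To put the reported figure in perspective, the sketch below converts the per-token latency quoted in the added README text (about 255 ms on a Core i7 laptop) into throughput and an estimated generation time, and checks the machine against the 4-core requirement. The function names and the 200-token example are illustrative assumptions, not part of the commit, and the numbers will differ on other hardware.

```python
import os

MS_PER_TOKEN = 255.0  # figure reported in the README for a Core i7 laptop
MIN_CPU_CORES = 4     # from the README's Requirements list


def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    if ms_per_token <= 0:
        raise ValueError("ms_per_token must be positive")
    return 1000.0 / ms_per_token


def generation_seconds(n_tokens: int, ms_per_token: float = MS_PER_TOKEN) -> float:
    """Estimate wall-clock seconds to generate n_tokens at a fixed latency."""
    return n_tokens * ms_per_token / 1000.0


def has_enough_cores(min_cores: int = MIN_CPU_CORES) -> bool:
    """Return True if this machine reports at least min_cores logical CPUs."""
    cores = os.cpu_count()  # may be None if the count cannot be determined
    return cores is not None and cores >= min_cores


if __name__ == "__main__":
    print(f"{tokens_per_second(MS_PER_TOKEN):.2f} tokens/s")
    print(f"{generation_seconds(200):.1f} s for 200 tokens")
    print("CPU cores OK:", has_enough_cores())
```

At the quoted latency this works out to a little under 4 tokens per second, so a 200-token completion would take roughly 51 seconds. Note that `os.cpu_count()` reports logical CPUs, which may be higher than the number of physical cores.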