Update README.md
README.md CHANGED
@@ -83,7 +83,18 @@ Currently, the HuggingFace's Inference Tool UI doesn't properly load the model.
 
 ## CPU
 
-
+Best performance can be achieved by downloading the [GGML 4 bits](https://huggingface.co/webpolis/zenos-gpt-j-6B-instruct-4bit/resolve/main/ggml-f16-q4_0.bin) model and running inference with the [rustformers' llm](https://github.com/rustformers/llm) tool.
+
+On my Core i7 laptop, it runs at around 255 ms per token:
+
+![](https://huggingface.co/webpolis/zenos-gpt-j-6B-instruct-4bit/resolve/main/poema1.gif)
+
+### Requirements
+
+For optimal performance:
+
+- 4 CPU cores
+- 8 GB RAM
 
 # Acknowledgments
 
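For readers following the added CPU section, a minimal Rust sketch of loading the GGML file and streaming tokens with the rustformers' llm crate could look like the following. It mirrors the crate's 0.1-era published example; the model path and prompt are placeholders, and exact signatures (`llm::load`, `InferenceRequest`, the token callback) have shifted between releases, so treat it as an illustration rather than a drop-in snippet.

```rust
// Assumed Cargo.toml dependencies: llm = "0.1" (with GPT-J support) and rand = "0.8".
use std::io::Write;

use llm::Model;

fn main() {
    // Load the quantized GGML weights from disk (path is a placeholder).
    let model = llm::load::<llm::models::GptJ>(
        std::path::Path::new("ggml-f16-q4_0.bin"),
        Default::default(), // llm::ModelParameters
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|err| panic!("Failed to load model: {err}"));

    // Start an inference session and stream generated tokens to stdout.
    let mut session = model.start_session(Default::default());
    let res = session.infer::<std::convert::Infallible>(
        &model,
        &mut rand::thread_rng(),
        &llm::InferenceRequest {
            prompt: "Write a short poem about the sea.", // placeholder prompt
            ..Default::default()
        },
        &mut Default::default(), // llm::OutputRequest
        |token| {
            print!("{token}");
            std::io::stdout().flush().unwrap();
            Ok(())
        },
    );

    match res {
        Ok(stats) => println!("\n\nInference stats:\n{stats}"),
        Err(err) => println!("\nInference failed: {err}"),
    }
}
```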