LavaPlanet/Goliath120B-exl2-2.64bpw

LavaPlanet committed on Nov 18, 2023

Commit 4176785 • 1 Parent(s): de46745

Update README.md

Files changed (1)

README.md +17 -0

README.md CHANGED

@@ -1,3 +1,20 @@
 ---
 license: llama2
 ---

+Another EXL2 version of AlpinDale's [goliath-120b](https://huggingface.co/alpindale/goliath-120b), this one at 2.64BPW.
+A [2.37BPW](https://huggingface.co/LavaPlanet/Goliath120B-exl2-2.37bpw) version is also available.
+Pippa llama2 Chat was used as the calibration dataset.
+Can be run on two RTX 3090s with 24GB of VRAM each.
+Allowing for Windows overhead, the following figures should be close enough to estimate your own usage.
+```yaml
+2.64BPW @ 4096 ctx
+  Empty Ctx
+    GPU Split: 18/24
+    GPU1: 19.8/24
+    GPU2: 21.9/24
+    ~10 tk/s
+```
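
As a rough sanity check on the figures above, the size of the quantized weights can be estimated as parameters × bits per weight ÷ 8. The sketch below is not part of the commit. It assumes a parameter count of about 118 billion for goliath-120b and assumes the reported usage is in GiB. Neither value is stated in the README, and the names `N_PARAMS`, `weights_gib`, and `overhead` are illustrative only.

```python
# Illustrative back-of-the-envelope estimate; not from the original README.

def weights_gib(n_params: float, bpw: float) -> float:
    """Approximate size of quantized weights in GiB: params * bits-per-weight / 8 bytes."""
    return n_params * bpw / 8 / 1024**3

N_PARAMS = 118e9  # ASSUMPTION: approximate parameter count of goliath-120b
BPW = 2.64        # bits per weight of this quantization (from the README)

weights = weights_gib(N_PARAMS, BPW)

# Usage reported in the README at empty context (units ASSUMED to be GiB).
reported_total = 19.8 + 21.9

# Whatever is not weights: cache, activations, driver/OS overhead, and so on.
overhead = reported_total - weights

print(f"estimated weights: {weights:.1f} GiB")
print(f"reported usage:    {reported_total:.1f} GiB")
print(f"implied overhead:  {overhead:.1f} GiB")
```

Under these assumptions the weights alone take most of the 48GB available across both cards, which is consistent with the small headroom left in the reported figures.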