Commit 4176785, committed by LavaPlanet
1 parent: de46745
Update README.md
README.md
CHANGED
@@ -1,3 +1,20 @@
 ---
 license: llama2
 ---
+Another EXL2 quantization of AlpinDale's [goliath-120b](https://huggingface.co/alpindale/goliath-120b), this one at 2.64BPW.
+
+[2.37BPW](https://huggingface.co/LavaPlanet/Goliath120B-exl2-2.37bpw)
+
+Pippa llama2 Chat was used as the calibration dataset.
+
+Can be run on two RTX 3090s with 24GB of VRAM each.
+
+Assuming Windows overhead, the figures below should be close enough to estimate your own usage.
+```yaml
+2.64BPW @ 4096 ctx
+Empty Ctx
+GPU Split: 18/24
+GPU1: 19.8/24
+GPU2: 21.9/24
+~10 tk/s
+```
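
The memory figures in the added README lines can be sanity-checked with a little arithmetic. The sketch below is only a rough estimate. The parameter count of about 118 billion is an assumption (the README does not state it), the helper name `weight_gib` is hypothetical, and the reported GPU figures are treated as comparable to GiB, which the README does not confirm. It estimates the size of the quantized weights from bits per weight and compares that with the reported two-GPU usage.

```python
# Rough VRAM sanity check for an EXL2 quantization.
# ASSUMPTION: ~118e9 parameters; the README does not state a parameter count.
N_PARAMS = 118e9

def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB (bits -> bytes -> GiB)."""
    return n_params * bpw / 8 / 1024**3

# Usage reported in the README at 4096 ctx with an empty context.
reported_total = 19.8 + 21.9   # GPU1 + GPU2
available_total = 24 + 24      # two 24GB cards

for bpw in (2.37, 2.64):
    print(f"{bpw} BPW: ~{weight_gib(N_PARAMS, bpw):.1f} GiB of weights")

# Whatever is left over goes to the KV cache, activations, and OS overhead.
overhead = reported_total - weight_gib(N_PARAMS, 2.64)
print(f"reported usage: {reported_total:.1f} of {available_total}")
print(f"non-weight usage at 2.64 BPW: ~{overhead:.1f}")
```

Under these assumptions the weights alone account for most of the reported usage, which is consistent with the README's point that two 24GB cards leave only modest headroom at this bit rate.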