Latest GGML v2 format for LLaMa-7B
- README.md +29 -0
- llama-7b.fp16.ggml.bin +3 -0
- llama-7b.ggml.q4_0.bin +3 -0
- llama-7b.ggml.q4_1.bin +3 -0
- llama-7b.ggml.q5_0.bin +3 -0
- llama-7b.ggml.q5_1.bin +3 -0
- llama-7b.ggml.q8_0.bin +3 -0
README.md
ADDED
@@ -0,0 +1,29 @@
---
inference: false
license: other
---
# LLaMa 7B GGML

This repo contains GGML format model files for the original LLaMa.

These files are for CPU (+ CUDA) inference using [llama.cpp](https://github.com/ggerganov/llama.cpp).

I've uploaded them mostly for my own convenience, so I can easily grab them if and when I need them for future testing and comparisons.

## Provided files

The following formats are included:
* float16
* q4_0 - 4-bit
* q4_1 - 4-bit
* q5_0 - 5-bit
* q5_1 - 5-bit
* q8_0 - 8-bit
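All of these quantised formats work on blocks of weights, storing a small amount of float metadata per block plus a few bits per weight. As a rough illustration only — this is a simplified sketch in the spirit of symmetric 4-bit block quantisation, not the exact ggml q4_0 kernel or file layout:

```python
# Simplified sketch of symmetric 4-bit block quantisation (NOT the exact
# ggml kernel): weights are split into blocks of 32, and each block stores
# one float scale plus a 4-bit integer per weight.
import numpy as np

BLOCK = 32  # ggml quantises in blocks of 32 weights


def quantize_block(x: np.ndarray):
    """Quantise one block of floats to (scale, int values in [-7, 7])."""
    amax = np.abs(x).max()
    d = amax / 7.0 if amax > 0 else 1.0  # largest weight maps to +/-7
    q = np.clip(np.round(x / d), -7, 7).astype(np.int8)
    return d, q


def dequantize_block(d: float, q: np.ndarray) -> np.ndarray:
    """Reconstruct approximate floats from (scale, quantised ints)."""
    return d * q.astype(np.float32)


rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=BLOCK).astype(np.float32)  # toy weight block
d, q = quantize_block(w)
w_hat = dequantize_block(d, q)
print(f"scale={d:.6f}  max abs error={np.abs(w - w_hat).max():.6f}")
```

The higher-bit formats (q5_x, q8_0) follow the same block idea with more bits per weight, which is why their file sizes below grow accordingly.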

## THESE FILES REQUIRE LATEST LLAMA.CPP (May 12th 2023 - commit `b9fd7ee`)!

llama.cpp recently made a breaking change to its quantisation methods.

I have quantised the GGML files in this repo with the latest version. Therefore you will need llama.cpp compiled on May 12th or later (commit `b9fd7ee` or later) to use them.

I will not be providing GGML files for the older llama.cpp code. They're already uploaded all over HF if you really need them!
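A typical build-and-run sequence for a compatible llama.cpp checkout would look like the following. The prompt, token count, and model path are illustrative — adjust them to your download location:

```shell
# Build llama.cpp at (or after) the commit that introduced the new format.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git log --oneline -1   # confirm you are on b9fd7ee or later
make

# Run CPU inference against one of the quantised files from this repo.
./main -m /path/to/llama-7b.ggml.q5_0.bin \
    -p "Building a website can be done in 10 simple steps:" -n 128
```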
llama-7b.fp16.ggml.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:666a4bb533b303bdaf89e1b6a3b6f93535d868de31d903afdc20983dc526c847
size 13477814912

llama-7b.ggml.q4_0.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:417111a40c36bff7ae6c6b3f773ac6efdb1c46584ef1077a1f3404d668e3944f
size 4212859520

llama-7b.ggml.q4_1.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0fc3f4925923cafe4681370e863319e8ff8f2d760e6b3f5435b415a407aa8d56
size 5055128192

llama-7b.ggml.q5_0.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1226673013a28d61acb94d46eeb15d3623bf0f1472a99ecaf0da8076d680fdf8
size 4633993856

llama-7b.ggml.q5_1.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:72040d380ab1067dc08c28d5f16269453bf1d4d7172c24424d4300d8474b42b6
size 5055128192

llama-7b.ggml.q8_0.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d3e36532ac11c4a63798ac6ec1471c1dc5a89305c9dec0319dfcb7efc146d001
size 7581934208
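The binaries above are stored via Git LFS, and the `oid sha256:` line in each pointer is the SHA-256 digest of the actual file, so a completed download can be checked against it. A minimal sketch (the expected hash shown is taken from the fp16 pointer above; the filename is whatever you saved locally):

```python
# Verify a downloaded model file against the sha256 oid from its
# Git LFS pointer, hashing in chunks to avoid loading 13 GB into RAM.
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read chunk by chunk."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


expected = "666a4bb533b303bdaf89e1b6a3b6f93535d868de31d903afdc20983dc526c847"
# print(sha256_of("llama-7b.fp16.ggml.bin") == expected)
```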