---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-3.0
- llama-cpp
- gguf-my-repo
base_model: ibm-granite/granite-3.0-3b-a800m-base
model-index:
- name: granite-3.0-3b-a800m-base
  results:
  - task:
      type: text-generation
    dataset:
      name: MMLU
      type: human-exams
    metrics:
    - type: pass@1
      value: 48.64
      name: pass@1
    - type: pass@1
      value: 18.84
      name: pass@1
    - type: pass@1
      value: 23.81
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: WinoGrande
      type: commonsense
    metrics:
    - type: pass@1
      value: 65.67
      name: pass@1
    - type: pass@1
      value: 42.2
      name: pass@1
    - type: pass@1
      value: 47.39
      name: pass@1
    - type: pass@1
      value: 78.29
      name: pass@1
    - type: pass@1
      value: 72.79
      name: pass@1
    - type: pass@1
      value: 41.34
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: BoolQ
      type: reading-comprehension
    metrics:
    - type: pass@1
      value: 75.75
      name: pass@1
    - type: pass@1
      value: 20.96
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: ARC-C
      type: reasoning
    metrics:
    - type: pass@1
      value: 46.84
      name: pass@1
    - type: pass@1
      value: 24.83
      name: pass@1
    - type: pass@1
      value: 38.93
      name: pass@1
    - type: pass@1
      value: 35.05
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: HumanEval
      type: code
    metrics:
    - type: pass@1
      value: 26.83
      name: pass@1
    - type: pass@1
      value: 34.6
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: GSM8K
      type: math
    metrics:
    - type: pass@1
      value: 35.86
      name: pass@1
    - type: pass@1
      value: 17.4
      name: pass@1
---

# davelsphere/granite-3.0-3b-a800m-base-Q4_K_M-GGUF
This model was converted to GGUF format from [`ibm-granite/granite-3.0-3b-a800m-base`](https://huggingface.co/ibm-granite/granite-3.0-3b-a800m-base) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.0-3b-a800m-base) for more details on the model.
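
The usage steps below let llama.cpp fetch the file on demand, but you can also pull the GGUF down yourself with the `huggingface_hub` CLI if you want a local copy; a minimal sketch, assuming `huggingface_hub` is installed (`pip install -U "huggingface_hub[cli]"`):

```shell
# Download the quantized checkpoint into the current directory
# (the --local-dir choice is arbitrary; any path works).
huggingface-cli download davelsphere/granite-3.0-3b-a800m-base-Q4_K_M-GGUF \
  granite-3.0-3b-a800m-base-q4_k_m.gguf --local-dir .
```

The repo and file names above come from this model card; here the file name is simply the lowercased repo name plus `.gguf`.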

## Use with llama.cpp
Install llama.cpp via Homebrew (works on macOS and Linux):

```bash
brew install llama.cpp
```
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo davelsphere/granite-3.0-3b-a800m-base-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-base-q4_k_m.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo davelsphere/granite-3.0-3b-a800m-base-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-base-q4_k_m.gguf -c 2048
```
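
Once `llama-server` is running (it listens on port 8080 by default), you can send it completion requests over HTTP; a minimal sketch against the server's `/completion` endpoint, assuming a stock build with default settings:

```shell
# Ask the server for up to 64 tokens of continuation; the reply is JSON
# with the generated text in its "content" field.
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The meaning to life and the universe is", "n_predict": 64}'
```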

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo davelsphere/granite-3.0-3b-a800m-base-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-base-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or
```bash
./llama-server --hf-repo davelsphere/granite-3.0-3b-a800m-base-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-base-q4_k_m.gguf -c 2048
```