updates: eval results and github repo link
README.md
CHANGED
@@ -30,7 +30,7 @@ model-index:
|
|
30 |
metrics:
|
31 |
- name: pass@1
|
32 |
type: pass@1
|
33 |
-
value: 23.
|
34 |
veriefied: false
|
35 |
- task:
|
36 |
type: text-generation
|
@@ -40,7 +40,7 @@ model-index:
|
|
40 |
metrics:
|
41 |
- name: pass@1
|
42 |
type: pass@1
|
43 |
-
value:
|
44 |
veriefied: false
|
45 |
- task:
|
46 |
type: text-generation
|
@@ -50,7 +50,7 @@ model-index:
|
|
50 |
metrics:
|
51 |
- name: pass@1
|
52 |
type: pass@1
|
53 |
-
value:
|
54 |
veriefied: false
|
55 |
- task:
|
56 |
type: text-generation
|
@@ -60,7 +60,7 @@ model-index:
|
|
60 |
metrics:
|
61 |
- name: pass@1
|
62 |
type: pass@1
|
63 |
-
value:
|
64 |
veriefied: false
|
65 |
- task:
|
66 |
type: text-generation
|
@@ -90,7 +90,7 @@ model-index:
|
|
90 |
metrics:
|
91 |
- name: pass@1
|
92 |
type: pass@1
|
93 |
-
value:
|
94 |
veriefied: false
|
95 |
- task:
|
96 |
type: text-generation
|
@@ -130,7 +130,7 @@ model-index:
|
|
130 |
metrics:
|
131 |
- name: pass@1
|
132 |
type: pass@1
|
133 |
-
value:
|
134 |
veriefied: false
|
135 |
- task:
|
136 |
type: text-generation
|
@@ -140,7 +140,7 @@ model-index:
|
|
140 |
metrics:
|
141 |
- name: pass@1
|
142 |
type: pass@1
|
143 |
-
value:
|
144 |
veriefied: false
|
145 |
- task:
|
146 |
type: text-generation
|
@@ -150,7 +150,17 @@ model-index:
|
|
150 |
metrics:
|
151 |
- name: pass@1
|
152 |
type: pass@1
|
153 |
-
value:
|
154 |
veriefied: false
|
155 |
- task:
|
156 |
type: text-generation
|
@@ -160,7 +170,7 @@ model-index:
|
|
160 |
metrics:
|
161 |
- name: pass@1
|
162 |
type: pass@1
|
163 |
-
value:
|
164 |
veriefied: false
|
165 |
- task:
|
166 |
type: text-generation
|
@@ -180,7 +190,7 @@ model-index:
|
|
180 |
metrics:
|
181 |
- name: pass@1
|
182 |
type: pass@1
|
183 |
-
value:
|
184 |
veriefied: false
|
185 |
- task:
|
186 |
type: text-generation
|
@@ -191,19 +201,11 @@ model-index:
|
|
191 |
- name: pass@1
|
192 |
type: pass@1
|
193 |
value: 19.46
|
194 |
-
veriefied: false
|
195 |
-
- task:
|
196 |
-
type: text-generation
|
197 |
-
dataset:
|
198 |
-
type: multilingual
|
199 |
-
name: MGSM
|
200 |
-
metrics:
|
201 |
-
- name: pass@1
|
202 |
-
type: pass@1
|
203 |
-
value: 30.47
|
204 |
veriefied: false
|
205 |
---
|
|
|
206 |
<!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png) -->
|
|
|
207 |
|
208 |
# Granite-3.0-2B-Base
|
209 |
|
@@ -211,14 +213,14 @@ model-index:
|
|
211 |
**Granite-3.0-2B-Base** is an open-source decoder-only language model from IBM Research that supports a variety of text-to-text generation tasks (e.g., question-answering, text-completion). **Granite-3.0-2B-Base** is trained from scratch and follows a two-phase training strategy. In the first phase, it is trained on 10 trillion tokens sourced from diverse domains. During the second phase, it is further trained on 2 trillion tokens using a carefully curated mix of high-quality data, aiming to enhance its performance on specific tasks.
|
212 |
|
213 |
- **Developers:** IBM Research
|
214 |
-
- **GitHub Repository:** [ibm-granite/granite-language-models](https://github.com/ibm-granite/granite-language-models)
|
215 |
- **Website**: [Granite Docs](https://www.ibm.com/granite/docs/)
|
216 |
-
- **Paper:** [Granite Language Models](
|
217 |
- **Release Date**: October 21st, 2024
|
218 |
-
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
219 |
|
220 |
## Supported Languages
|
221 |
-
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese
|
222 |
|
223 |
## Usage
|
224 |
### Intended use
|
@@ -300,4 +302,4 @@ The use of Large Language Models involves risks and ethical considerations peopl
|
|
300 |
year = {2024},
|
301 |
url = {https://arxiv.org/abs/0000.00000},
|
302 |
}
|
303 |
-
```
|
|
|
30 |
metrics:
|
31 |
- name: pass@1
|
32 |
type: pass@1
|
33 |
+
value: 23.79
|
34 |
veriefied: false
|
35 |
- task:
|
36 |
type: text-generation
|
|
|
40 |
metrics:
|
41 |
- name: pass@1
|
42 |
type: pass@1
|
43 |
+
value: 22.56
|
44 |
veriefied: false
|
45 |
- task:
|
46 |
type: text-generation
|
|
|
50 |
metrics:
|
51 |
- name: pass@1
|
52 |
type: pass@1
|
53 |
+
value: 74.90
|
54 |
veriefied: false
|
55 |
- task:
|
56 |
type: text-generation
|
|
|
60 |
metrics:
|
61 |
- name: pass@1
|
62 |
type: pass@1
|
63 |
+
value: 43.00
|
64 |
veriefied: false
|
65 |
- task:
|
66 |
type: text-generation
|
|
|
90 |
metrics:
|
91 |
- name: pass@1
|
92 |
type: pass@1
|
93 |
+
value: 77.65
|
94 |
veriefied: false
|
95 |
- task:
|
96 |
type: text-generation
|
|
|
130 |
metrics:
|
131 |
- name: pass@1
|
132 |
type: pass@1
|
133 |
+
value: 54.27
|
134 |
veriefied: false
|
135 |
- task:
|
136 |
type: text-generation
|
|
|
140 |
metrics:
|
141 |
- name: pass@1
|
142 |
type: pass@1
|
143 |
+
value: 30.58
|
144 |
veriefied: false
|
145 |
- task:
|
146 |
type: text-generation
|
|
|
150 |
metrics:
|
151 |
- name: pass@1
|
152 |
type: pass@1
|
153 |
+
value: 40.69
|
154 |
+
veriefied: false
|
155 |
+
- task:
|
156 |
+
type: text-generation
|
157 |
+
dataset:
|
158 |
+
type: reasoning
|
159 |
+
name: MUSR
|
160 |
+
metrics:
|
161 |
+
- name: pass@1
|
162 |
+
type: pass@1
|
163 |
+
value: 34.34
|
164 |
veriefied: false
|
165 |
- task:
|
166 |
type: text-generation
|
|
|
170 |
metrics:
|
171 |
- name: pass@1
|
172 |
type: pass@1
|
173 |
+
value: 38.41
|
174 |
veriefied: false
|
175 |
- task:
|
176 |
type: text-generation
|
|
|
190 |
metrics:
|
191 |
- name: pass@1
|
192 |
type: pass@1
|
193 |
+
value: 47.23
|
194 |
veriefied: false
|
195 |
- task:
|
196 |
type: text-generation
|
|
|
201 |
- name: pass@1
|
202 |
type: pass@1
|
203 |
value: 19.46
|
204 |
veriefied: false
|
205 |
---
|
206 |
+
> IMPORTANT: This model card is an early draft; the final version will be available on Hugging Face on Oct 21st, 2024
|
207 |
<!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png) -->
|
208 |
+
![image/png](granite-3_0-language-models_Group_1.png)
|
209 |
|
210 |
# Granite-3.0-2B-Base
|
211 |
|
|
|
213 |
**Granite-3.0-2B-Base** is an open-source decoder-only language model from IBM Research that supports a variety of text-to-text generation tasks (e.g., question-answering, text-completion). **Granite-3.0-2B-Base** is trained from scratch and follows a two-phase training strategy. In the first phase, it is trained on 10 trillion tokens sourced from diverse domains. During the second phase, it is further trained on 2 trillion tokens using a carefully curated mix of high-quality data, aiming to enhance its performance on specific tasks.
|
214 |
|
215 |
- **Developers:** IBM Research
|
216 |
+
- **GitHub Repository:** [ibm-granite/granite-3.0-language-models](https://github.com/ibm-granite/granite-3.0-language-models)
|
217 |
- **Website**: [Granite Docs](https://www.ibm.com/granite/docs/)
|
218 |
+
- **Paper:** [Granite 3.0 Language Models]()
|
219 |
- **Release Date**: October 21st, 2024
|
220 |
+
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
221 |
|
222 |
## Supported Languages
|
223 |
+
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese
|
224 |
|
225 |
## Usage
|
226 |
### Intended use
|
|
|
302 |
year = {2024},
|
303 |
url = {https://arxiv.org/abs/0000.00000},
|
304 |
}
|
305 |
+
```
|