AsirAsir committed on
Commit
8e199ca
1 Parent(s): bd92b4f

Upload 2 files

Files changed (2)
  1. README.md +16 -15
  2. README_zh.md +47 -0
README.md CHANGED
@@ -9,24 +9,24 @@ license_link: LICENSE
  </h1>
  </div>
 
- ## 模型介绍
 
- 我们很高兴首次发布Index系列模型中的轻量版本:Index-1.9B系列
- 本次开源的Index-1.9B 系列包含以下模型:
- - **Index-1.9B base(本仓库模型)** : 基座模型,具有 19亿 非词嵌入参数量,在2.8T 中英文为主的语料上预训练,多个评测基准上与同级别模型比处于领先.
- - Index-1.9B pure : 基座模型的对照组,与base具有相同的参数和训练策略,不同之处在于我们严格过滤了该版本语料中所有指令相关的数据,以此来验证指令对benchmark的影响
- - Index-1.9B chat: 基于index-1.9B base通过SFT和DPO对齐后的对话模型,我们发现由于我们预训练中引入了较多互联网社区语料,聊天的趣味性明显更强
- - Index-1.9B character : SFTDPO的基础上引入了RAG来实现fewshots角色扮演定制
 
- **注意:此为Base模型,仅能续写,以及进一步的训练对齐,不能直接交互。**
- - **Chat模型**详见 [Index-1.9B-Chat](https://huggingface.co/IndexTeam/Index-1.9B-Chat)
- - **角色扮演模型**详见 [Index-1.9B-Character](https://huggingface.co/IndexTeam/Index-1.9B-Character)
 
- 更多细节详见我们的[GitHub](https://github.com/bilibili/Index-1.9B)[Index-1.9B技术报告](https://github.com/bilibili/Index-1.9B/blob/main/Index-1.9B%20%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A.pdf)
 
- ## 评测结果
- 对通用理解进行评测,Index-1.9B性能优秀,于近期开源的端侧小模型相比领先,并可以和一批7B和大于10B的模型相比较
- |模型|均分|英文均分|MMLU|CEVAL|CMMLU|HellaSwag|Arc-C|Arc-E|
  |----|----|----|----|----|----|----|----|----|
  |Google Gemma 2B|41.58|46.77|41.81|31.36|31.02|66.82|36.39|42.07|
  |Phi-2 (2.7B)|58.89|**72.54**|57.61|31.12|32.05|70.94|74.51|87.1|
@@ -43,5 +43,6 @@ license_link: LICENSE
  |MPT-30B (report)|/|63.48|46.9|/|/|79.9|50.6|76.5|
  |Falcon-40B (report)|/|68.18|55.4|/|/|83.6|54.5|79.2|
 
- 评测代码基于[OpenCompass](https://github.com/open-compass/opencompass), 并做了适配性修改,详见[evaluate代码](https://github.com/bilibili/Index-1.9B/evaluate/)
 
 
  </h1>
  </div>
 
+ ## Model Introduction
 
+ We are excited to announce the first release of the lightweight models in the Index series: the Index-1.9B series.
+ The open-source Index-1.9B series includes the following models:
+ - **Index-1.9B base (this repository's model)**: The base model, with 1.9 billion non-embedding parameters, pre-trained on a 2.8T corpus mainly in Chinese and English. It leads models of comparable size on multiple evaluation benchmarks.
+ - Index-1.9B pure: A control version of the base model with the same parameters and training strategy, except that all instruction-related data was strictly filtered out of its corpus, in order to measure the impact of instruction data on benchmarks.
+ - Index-1.9B chat: A dialogue model built on Index-1.9B base and aligned with SFT and DPO. Because our pre-training included a large amount of internet community data, we found that its conversations are noticeably more engaging.
+ - Index-1.9B character: Adds RAG on top of SFT and DPO to enable few-shot role-playing customization.
 
+ **Note: This is the base model. It can only be used for text continuation and as a starting point for further training and alignment; it cannot be used for direct interaction.**
+ - For the **Chat model**, see [Index-1.9B-Chat](https://huggingface.co/IndexTeam/Index-1.9B-Chat)
+ - For the **Role-playing model**, see [Index-1.9B-Character](https://huggingface.co/IndexTeam/Index-1.9B-Character)
 
+ For more details, see our [GitHub](https://github.com/bilibili/Index-1.9B) and [Index-1.9B Technical Report](https://github.com/bilibili/Index-1.9B/blob/main/Index-1.9B%20%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A.pdf)
 
+ ## Evaluation Results
+ Index-1.9B performs well in general understanding evaluations: it leads recently open-sourced small models and is comparable to some 7B models and some models larger than 10B.
+ |Model|Average score|Average English score|MMLU|CEVAL|CMMLU|HellaSwag|Arc-C|Arc-E|
  |----|----|----|----|----|----|----|----|----|
  |Google Gemma 2B|41.58|46.77|41.81|31.36|31.02|66.82|36.39|42.07|
  |Phi-2 (2.7B)|58.89|**72.54**|57.61|31.12|32.05|70.94|74.51|87.1|

  |MPT-30B (report)|/|63.48|46.9|/|/|79.9|50.6|76.5|
  |Falcon-40B (report)|/|68.18|55.4|/|/|83.6|54.5|79.2|
 
+ Evaluation code is based on [OpenCompass](https://github.com/open-compass/opencompass) with compatibility modifications. See the [evaluate](./evaluate/) folder for details.
+
 
README_zh.md ADDED
@@ -0,0 +1,47 @@
+ ---
+ license: other
+ license_name: license
+ license_link: LICENSE
+ ---
+ <div align="center">
+ <h1>
+ Index-1.9B
+ </h1>
+ </div>
+
+ ## 模型介绍
+
+ 我们很高兴首次发布Index系列模型中的轻量版本:Index-1.9B系列
+ 本次开源的Index-1.9B 系列包含以下模型:
+ - **Index-1.9B base(本仓库模型)** : 基座模型,具有 19亿 非词嵌入参数量,在2.8T 中英文为主的语料上预训练,多个评测基准上与同级别模型比处于领先.
+ - Index-1.9B pure : 基座模型的对照组,与base具有相同的参数和训练策略,不同之处在于我们严格过滤了该版本语料中所有指令相关的数据,以此来验证指令对benchmark的影响
+ - Index-1.9B chat: 基于index-1.9B base通过SFT和DPO对齐后的对话模型,我们发现由于我们预训练中引入了较多互联网社区语料,聊天的趣味性明显更强
+ - Index-1.9B character : 在SFT和DPO的基础上引入了RAG来实现fewshots角色扮演定制
+
+ **注意:此为Base模型,仅能续写,以及进一步的训练对齐,不能直接交互。**
+ - **Chat模型**详见 [Index-1.9B-Chat](https://huggingface.co/IndexTeam/Index-1.9B-Chat)
+ - **角色扮演模型**详见 [Index-1.9B-Character](https://huggingface.co/IndexTeam/Index-1.9B-Character)
+
+ 更多细节详见我们的[GitHub](https://github.com/bilibili/Index-1.9B)和[Index-1.9B技术报告](https://github.com/bilibili/Index-1.9B/blob/main/Index-1.9B%20%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A.pdf)
+
+ ## 评测结果
+ 对通用理解进行评测,Index-1.9B性能优秀,于近期开源的端侧小模型相比领先,并可以和一批7B和大于10B的模型相比较
+ |模型|均分|英文均分|MMLU|CEVAL|CMMLU|HellaSwag|Arc-C|Arc-E|
+ |----|----|----|----|----|----|----|----|----|
+ |Google Gemma 2B|41.58|46.77|41.81|31.36|31.02|66.82|36.39|42.07|
+ |Phi-2 (2.7B)|58.89|**72.54**|57.61|31.12|32.05|70.94|74.51|87.1|
+ |Qwen1.5-1.8B|58.96|59.28|47.05|59.48|57.12|58.33|56.82|74.93|
+ |Qwen2-1.5B(report)|**65.17**|62.52 |56.5|70.6|70.3|66.6|43.9|83.09|
+ |MiniCPM-2.4B-SFT|62.53|68.75|53.8|49.19|50.97|67.29|69.44|84.48|
+ |**Index-1.9B-Pure**|49.55 |52.83 |43.75|42.35|43.61|63.21|42.75|61.61|
+ |**Index-1.9B**|**64.92** |**69.93**|52.53|57.01|52.79|80.69|65.15|81.35|
+ |Llama2-7B|50.79|60.31|44.32|32.42|31.11|76|46.3|74.6|
+ |Mistral-7B (report) |/|**69.23**|60.1|/|/|81.3|55.5|80|
+ |Baichuan2-7B|54.53|53.51|54.64|56.19|56.95|25.04|57.25|77.12|
+ |Llama2-13B|57.51|66.61|55.78|39.93|38.7|76.22|58.88|75.56|
+ |Baichuan2-13B|68.90|71.69|59.63|59.21|61.27|72.61|70.04|84.48|
+ |MPT-30B (report)|/|63.48|46.9|/|/|79.9|50.6|76.5|
+ |Falcon-40B (report)|/|68.18|55.4|/|/|83.6|54.5|79.2|
+
+ 评测代码基于[OpenCompass](https://github.com/open-compass/opencompass), 并做了适配性修改,详见[evaluate代码](https://github.com/bilibili/Index-1.9B/evaluate/)
+
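
The README does not say how the two average columns in the evaluation table are computed. The numbers are consistent with a plain arithmetic mean: the overall average over all six benchmarks, and the English average over MMLU, HellaSwag, Arc-C, and Arc-E. The sketch below checks that reading against three rows copied from the table. The column grouping and the `averages` helper are assumptions inferred from the figures, not something the commit states.

```python
# Benchmark columns in table order.
BENCHMARKS = ["MMLU", "CEVAL", "CMMLU", "HellaSwag", "Arc-C", "Arc-E"]

# Assumption (inferred from the numbers, not stated in the README):
# the English average covers these four benchmarks.
ENGLISH = ["MMLU", "HellaSwag", "Arc-C", "Arc-E"]

# Per-benchmark scores copied from three rows of the table.
SCORES = {
    "Google Gemma 2B": [41.81, 31.36, 31.02, 66.82, 36.39, 42.07],
    "Phi-2 (2.7B)": [57.61, 31.12, 32.05, 70.94, 74.51, 87.1],
    "Index-1.9B": [52.53, 57.01, 52.79, 80.69, 65.15, 81.35],
}


def averages(scores):
    """Return (overall mean, English mean), each rounded to two decimals."""
    by_name = dict(zip(BENCHMARKS, scores))
    overall = sum(scores) / len(scores)
    english = sum(by_name[name] for name in ENGLISH) / len(ENGLISH)
    return round(overall, 2), round(english, 2)


if __name__ == "__main__":
    for model, scores in SCORES.items():
        overall, english = averages(scores)
        print(f"{model}: average={overall}, english_average={english}")
```

Rows marked `(report)` have `/` in some columns, so their overall average cannot be reproduced this way.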