Index-1.9B
Model Introduction
We are excited to announce the release of a lightweight version from the Index series models: the Index-1.9B series. The open-source Index-1.9B series includes the following models:
- Index-1.9B base: The base model, with 1.9 billion non-embedding parameters, pre-trained on a 2.8T corpus mainly in Chinese and English. It leads in multiple evaluation benchmarks compared to models of the same level.
- Index-1.9B pure (this repository's model) : A control version of the base model with the same parameters and training strategy, but strictly filtered out all instruction-related data from the corpus to verify the impact of instructions on benchmarks.
- Index-1.9B chat: A dialogue model aligned with SFT and DPO based on the Index-1.9B base. We found that due to the introduction of a lot of internet community corpus in our pre-training, the model has significantly more interesting chatting capabilities.
- Index-1.9B character : Introduces RAG on top of SFT and DPO to achieve few-shots role-playing customization.
Note: This is the Base model, capable only of continuation and further training alignment, and cannot be directly interacted with.
- For the Chat model, see Index-1.9B-Chat
- For the Role-playing model, see Index-1.9B-Character
For more details, see our GitHub and Index-1.9B Technical Report
Evaluation Results
The Index-1.9B shows excellent performance in general understanding evaluations, leading compared to recently open-sourced small models and comparable to some 7B and models larger than 10B.
Model | Average score | Average English score | MMLU | CEVAL | CMMLU | HellaSwag | Arc-C | Arc-E |
---|---|---|---|---|---|---|---|---|
Google Gemma 2B | 41.58 | 46.77 | 41.81 | 31.36 | 31.02 | 66.82 | 36.39 | 42.07 |
Phi-2 (2.7B) | 58.89 | 72.54 | 57.61 | 31.12 | 32.05 | 70.94 | 74.51 | 87.1 |
Qwen1.5-1.8B | 58.96 | 59.28 | 47.05 | 59.48 | 57.12 | 58.33 | 56.82 | 74.93 |
Qwen2-1.5B(report) | 65.17 | 62.52 | 56.5 | 70.6 | 70.3 | 66.6 | 43.9 | 83.09 |
MiniCPM-2.4B-SFT | 62.53 | 68.75 | 53.8 | 49.19 | 50.97 | 67.29 | 69.44 | 84.48 |
Index-1.9B-Pure | 49.55 | 52.83 | 43.75 | 42.35 | 43.61 | 63.21 | 42.75 | 61.61 |
Index-1.9B | 64.92 | 69.93 | 52.53 | 57.01 | 52.79 | 80.69 | 65.15 | 81.35 |
Llama2-7B | 50.79 | 60.31 | 44.32 | 32.42 | 31.11 | 76 | 46.3 | 74.6 |
Mistral-7B (report) | / | 69.23 | 60.1 | / | / | 81.3 | 55.5 | 80 |
Baichuan2-7B | 54.53 | 53.51 | 54.64 | 56.19 | 56.95 | 25.04 | 57.25 | 77.12 |
Llama2-13B | 57.51 | 66.61 | 55.78 | 39.93 | 38.7 | 76.22 | 58.88 | 75.56 |
Baichuan2-13B | 68.90 | 71.69 | 59.63 | 59.21 | 61.27 | 72.61 | 70.04 | 84.48 |
MPT-30B (report) | / | 63.48 | 46.9 | / | / | 79.9 | 50.6 | 76.5 |
Falcon-40B (report) | / | 68.18 | 55.4 | / | / | 83.6 | 54.5 | 79.2 |
Evaluation code is based on OpenCompass with compatibility modifications. See the evaluate folder for details.
- Downloads last month
- 11
Inference API (serverless) does not yet support model repos that contain custom code.