
4x1.8B MoE Qwen Ckpt 18000

This project builds a Mixture-of-Experts (MoE) model on top of the Qwen 1.8B model. We combined 4 of the original models into a single MoE model and trained it using special training methods.

This model is a checkpoint from the continued pretraining stage.
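To make the idea of combining several dense models into one MoE model more concrete, here is a minimal sketch in plain Python. It is not this project's implementation: the top-2 softmax router, the single-matrix "feed-forward" stand-in, the choice to initialise every expert as a copy of the same dense weights, and all names (`make_dense_ffn`, `moe_forward`, `TOP_K`) are assumptions made for illustration only.

```python
import math
import random

NUM_EXPERTS = 4  # matches the "4x" in the model name
TOP_K = 2        # assumption: experts used per token (not stated in this README)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def make_dense_ffn(dim, rng):
    """Hypothetical stand-in for a dense model's feed-forward weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

def moe_forward(x, experts, router, top_k=TOP_K):
    """Route one token vector to its top_k experts and mix their outputs."""
    probs = softmax(matvec(router, x))
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalise over the chosen experts
    out = [0.0] * len(x)
    for i in chosen:
        y = matvec(experts[i], x)
        weight = probs[i] / norm
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, chosen

rng = random.Random(0)
dim = 8
dense = make_dense_ffn(dim, rng)
# Assumption: every expert starts as a copy of the same dense weights.
experts = [[row[:] for row in dense] for _ in range(NUM_EXPERTS)]
router = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(NUM_EXPERTS)]
token = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
out, chosen = moe_forward(token, experts, router)
```

Under these assumptions the MoE layer initially reproduces the dense layer's output, because the routing weights of the chosen experts sum to 1 and all experts are identical. The experts only begin to differ once training continues.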

Evaluations

| Groups | Metric | Value | Stderr |
|---|---|---|---|
| boolq | acc | 0.6502 | ± 0.0083 |
| ceval-valid | acc | 0.5171 | ± 0.1872 |
| | acc_norm | 0.5171 | ± 0.1872 |
| cmmlu | acc | 0.5041 | ± 0.1222 |
| | acc_norm | 0.5041 | ± 0.1222 |
| mathqa | acc | 0.2693 | ± 0.0081 |
| | acc_norm | 0.2693 | ± 0.0081 |
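As a rough sanity check on the Stderr values, the sketch below recomputes the boolq entry, assuming it is the usual sample standard error of a mean over per-example 0/1 scores. The example count of 3270 (the size of the BoolQ validation split) and the helper name `accuracy_stderr` are assumptions, not details taken from this README.

```python
import math

def accuracy_stderr(acc, n):
    """Sample standard error of the mean of n 0/1 outcomes whose mean is acc."""
    return math.sqrt(acc * (1.0 - acc) / (n - 1))

# Assumption: boolq was scored on the 3270-example validation split.
boolq_stderr = accuracy_stderr(0.6502, 3270)
print(round(boolq_stderr, 4))  # prints 0.0083
```

The much larger values reported for ceval-valid and cmmlu are probably aggregated across many subtasks, so this per-example formula does not apply to them.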

Acknowledgements

License Agreement

This project is open source under the Tongyi Qianwen Research License Agreement. You can view the complete license agreement at this link: [LICENSE](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT).

When using this project, please ensure that your usage complies with the terms and conditions of the license agreement.