4x1.8B MoE Qwen Ckpt 18000
This is a Mixture-of-Experts (MoE) model project built on the Qwen 1.8B model. In this project, we combined four of the original models into a single MoE model and trained it using special training methods.
This model is a checkpoint (Ckpt 18000) from the continued pretraining stage.
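As a rough illustration of how a MoE layer can combine several experts, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the top-2 routing, the dot-product gate, the toy experts, and the names `moe_forward` and `gate_weights` are hypothetical and do not describe this project's actual architecture or its training method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input vector x to the top_k experts and mix their outputs.

    Hypothetical routing scheme for illustration only.
    """
    # One gating logit per expert: dot product of x with that expert's gate vector.
    logits = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    # Keep the indices of the top_k experts by logit.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the gate scores over the chosen experts only.
    probs = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for p, i in zip(probs, chosen):
        y = experts[i](x)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out, chosen

# Four toy "experts" (simple scalings) standing in for the four original models.
experts = [lambda x, s=s: [s * xi for xi in x] for s in (1.0, 2.0, 3.0, 4.0)]
# Toy gate vectors, one per expert (assumed values).
gate_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]

output, chosen = moe_forward([1.0, 2.0], experts, gate_weights)
print(chosen, output)
```

In this sketch only the selected experts are evaluated for a given input, which is the usual reason a MoE model can hold more parameters than it uses per token.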
Evaluations
| Groups | Metric | Value | Stderr |
|---|---|---|---|
| boolq | acc | 0.6502 | ± 0.0083 |
| ceval-valid | acc | 0.5171 | ± 0.1872 |
| ceval-valid | acc_norm | 0.5171 | ± 0.1872 |
| cmmlu | acc | 0.5041 | ± 0.1222 |
| cmmlu | acc_norm | 0.5041 | ± 0.1222 |
| mathqa | acc | 0.2693 | ± 0.0081 |
| mathqa | acc_norm | 0.2693 | ± 0.0081 |
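To help read the Stderr column, the following sketch turns each accuracy and its standard error into an approximate 95% interval. It assumes a normal approximation (value ± 1.96 × stderr), which may not hold well for the group-level rows (ceval-valid, cmmlu) whose reported standard errors are large; the helper name `interval` is hypothetical.

```python
# Accuracy and standard error per group, copied from the table above.
results = {
    "boolq": (0.6502, 0.0083),
    "ceval-valid": (0.5171, 0.1872),
    "cmmlu": (0.5041, 0.1222),
    "mathqa": (0.2693, 0.0081),
}

def interval(value, stderr, z=1.96):
    """Approximate confidence interval, clipped to the valid accuracy range [0, 1]."""
    low = max(0.0, value - z * stderr)
    high = min(1.0, value + z * stderr)
    return low, high

for name, (acc, se) in results.items():
    low, high = interval(acc, se)
    print(f"{name}: {acc:.4f} (approx. 95% interval {low:.4f} to {high:.4f})")
```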
Acknowledgements
License Agreement
This project is open source under the Tongyi Qianwen Research License Agreement. The complete license agreement is available at this link: LICENSE.
When using this project, please ensure that your use complies with the terms and conditions of the license agreement.