# 4x1.8B MoE Qwen Ckpt 18000

This is a Mixture-of-Experts (MoE) model built on the Qwen 1.8B model. In this project, we combined 4 copies of the original model into a single MoE and trained the result with a dedicated training procedure. This model is a checkpoint from the continual pre-training stage.

![](loss_plot.png)

# Evaluations

| Groups      | Metric   |  Value |   | Stderr |
|-------------|----------|-------:|---|-------:|
| boolq       | acc      | 0.6502 | ± | 0.0083 |
| ceval-valid | acc      | 0.5171 | ± | 0.1872 |
|             | acc_norm | 0.5171 | ± | 0.1872 |
| cmmlu       | acc      | 0.5041 | ± | 0.1222 |
|             | acc_norm | 0.5041 | ± | 0.1222 |
| mathqa      | acc      | 0.2693 | ± | 0.0081 |
|             | acc_norm | 0.2693 | ± | 0.0081 |

# Acknowledgements

+ [Qwen](https://github.com/QwenLM/Qwen)
+ [mistral.ai](https://mistral.ai)

# License Agreement

This project is open-sourced under the Tongyi Qianwen Research License Agreement. You can view the complete license agreement at this link: [LICENSE](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT). When using this project, please ensure that your usage complies with the terms and conditions of the license agreement.
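
# Usage

Below is a minimal loading and generation sketch, assuming this checkpoint is published as a `transformers`-compatible repository. The repository id is a placeholder, and `trust_remote_code=True` is assumed to be required because Qwen-based checkpoints typically ship custom modeling code; adjust both to match the actual release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; replace with the actual hub id or local path of this checkpoint.
model_id = "path/to/4x1.8B-moe-qwen-ckpt-18000"

# trust_remote_code=True is assumed, as Qwen-derived models usually rely on custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
)

# Simple generation check on the continual pre-training checkpoint (base model, no chat template).
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a continual pre-training checkpoint rather than an instruction-tuned model, plain text completion as above is the appropriate way to probe it.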