Update README.md
README.md
CHANGED
@@ -76,12 +76,23 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
|
|
76 |
* **[2024.10.15]** ApolloMoE repo is published!🎉
|
77 |
|
78 |
|
79 |
## Architecture
|
80 |
|
81 |
<details>
|
82 |
<summary>Click to view the MoE routing image</summary>
|
83 |
|
84 |
-
![ApolloMoE](
|
85 |
|
86 |
</details>
|
87 |
|
@@ -188,17 +199,17 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
|
|
188 |
<details><summary>Click to expand</summary>
|
189 |
|
190 |
|
191 |
-
We take
|
192 |
1. Download the dataset for the project:
|
193 |
|
194 |
```
|
195 |
-
bash 0.download_data.sh
|
196 |
```
|
197 |
|
198 |
-
2. Prepare test and dev for specific model:
|
199 |
|
200 |
|
201 |
-
- Create test data for with special token
|
202 |
|
203 |
```
|
204 |
bash "1.data_process_test&dev.sh"
|
@@ -206,23 +217,21 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
|
|
206 |
|
207 |
3. Prepare training data for a specific model (create tokenized data in advance):
|
208 |
|
209 |
-
|
210 |
-
- You can adjust data Training order and Training Epoch in this step
|
211 |
|
212 |
```
|
213 |
bash 2.data_process_train.sh
|
214 |
```
|
215 |
-
|
216 |
4. Train the model
|
217 |
|
218 |
-
|
219 |
-
- If you want to train in Multi Nodes please refer to ./
|
220 |
-
|
221 |
-
|
222 |
|
223 |
|
224 |
```
|
225 |
-
bash 3.
|
226 |
```
|
227 |
|
228 |
|
@@ -232,12 +241,6 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
|
|
232 |
bash 4.eval.sh
|
233 |
```
|
234 |
|
235 |
-
6. Evaluate your model: Play with your ckpts in bash
|
236 |
-
|
237 |
-
```
|
238 |
-
python ./src/evaluate/cli_demo.py --model_name='./ckpts/your/path/tfmr'
|
239 |
-
```
|
240 |
-
|
241 |
</details>
|
242 |
|
243 |
|
|
|
76 |
* **[2024.10.15]** ApolloMoE repo is published!🎉
|
77 |
|
78 |
|
79 |
+
## Language Coverage
|
80 |
+
12 Major Languages and 38 Minor Languages
|
81 |
+
|
82 |
+
<details>
|
83 |
+
<summary>Click to view the language coverage</summary>
|
84 |
+
|
85 |
+
![ApolloMoE](assets/languages.png)
|
86 |
+
|
87 |
+
</details>
|
88 |
+
|
89 |
+
|
90 |
## Architecture
|
91 |
|
92 |
<details>
|
93 |
<summary>Click to view the MoE routing image</summary>
|
94 |
|
95 |
+
![ApolloMoE](assets/hybrid_routing.png)
|
96 |
|
97 |
</details>
|
98 |
|
|
|
199 |
<details><summary>Click to expand</summary>
|
200 |
|
201 |
|
202 |
+
We take Apollo2-7B or Apollo-MoE-0.5B as an example.
|
203 |
1. Download the dataset for the project:
|
204 |
|
205 |
```
|
206 |
+
bash 0.download_data.sh
|
207 |
```
|
208 |
|
209 |
+
2. Prepare test and dev data for a specific model:
|
210 |
|
211 |
|
212 |
+
- Create test data with special tokens
|
213 |
|
214 |
```
|
215 |
bash "1.data_process_test&dev.sh"
|
|
|
217 |
|
218 |
3. Prepare training data for a specific model (create tokenized data in advance):
|
219 |
|
220 |
|
221 |
+
- You can adjust the training data order and the number of training epochs in this step
|
222 |
+
|
223 |
```
|
224 |
bash 2.data_process_train.sh
|
225 |
```
|
226 |
+
|
227 |
4. Train the model
|
228 |
|
229 |
+
|
230 |
+
- To train on multiple nodes, refer to ./src/sft/training_config/zero_multi.yaml
|
231 |
|
232 |
|
233 |
```
|
234 |
+
bash 3.single_node_train.sh
|
235 |
```
|
236 |
|
237 |
|
|
|
241 |
bash 4.eval.sh
|
242 |
```
|
243 |
|
244 |
</details>
|
245 |
|
246 |
|
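The scripts named on the new side of this diff can be chained into one run. The sketch below is a minimal, hypothetical wrapper and is not part of the repository: it assumes the five scripts sit in the current directory, and the `DRY_RUN` variable and the `run_step` and `run_pipeline` helpers are illustrative names introduced here. By default it only prints each command. The script name containing `&` is quoted because an unquoted `&` is the shell's background operator, which would split the command in two.

```shell
#!/bin/sh
# Hypothetical wrapper around the README's numbered scripts (illustration only).
set -eu

# Run one script, or just print the command when DRY_RUN=1 (the default here).
# DRY_RUN is an assumption of this sketch, not a repository option.
run_step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'bash "%s"\n' "$1"
  else
    bash "$1"
  fi
}

# Script names are copied from the README; their location is assumed.
run_pipeline() {
  run_step "0.download_data.sh"
  run_step "1.data_process_test&dev.sh"   # quoted so '&' stays part of the name
  run_step "2.data_process_train.sh"
  run_step "3.single_node_train.sh"
  run_step "4.eval.sh"
}

run_pipeline
```

Setting `DRY_RUN=0` in the environment would execute the scripts instead of printing them; with `set -e`, the run stops at the first script that fails.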