NeuralNovel committed
Commit 15d92e0 • 1 Parent(s): 02bb41f
Update README.md

README.md CHANGED
@@ -118,7 +118,7 @@ In the boundless sands ..
 
 A model to test how MoE will route without square expansion.
 
-
+## "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
 ### (from the MistralAI papers...click the quoted question above to navigate to it directly.)
 
 The scale of a model is one of the most important axes for better model quality. Given a fixed computing budget, training a larger model for fewer steps is better than training a smaller model for more steps.
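For readers following the "What is a Mixture of Experts (MoE)?" link above, here is a minimal sketch of the top-k routing idea the README refers to. This is not the model's actual implementation; the expert count, `top_k` value, hidden size, and the simple linear "experts" below are assumptions made purely for illustration.

```python
# Minimal sketch of top-k expert routing in a Mixture of Experts layer.
# All sizes below (num_experts, top_k, d_model) are illustrative assumptions,
# not the configuration of this model.
import numpy as np

rng = np.random.default_rng(0)

num_experts = 8      # assumed number of experts
top_k = 2            # assumed: each token is routed to its 2 best experts
d_model = 16         # assumed hidden size

# Random stand-ins for the router matrix and the expert weight matrices.
router_w = rng.normal(size=(d_model, num_experts))
expert_w = rng.normal(size=(num_experts, d_model, d_model))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router_w                               # (tokens, num_experts)
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]   # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # Softmax over only the selected experts' logits to get gate weights.
        sel = logits[t, top_idx[t]]
        gate = np.exp(sel - sel.max())
        gate /= gate.sum()
        for g, e in zip(gate, top_idx[t]):
            out[t] += g * (x[t] @ expert_w[e])           # weighted sum of expert outputs
    return out

tokens = rng.normal(size=(4, d_model))                   # 4 example tokens
print(moe_layer(tokens).shape)                            # -> (4, 16)
```

Because only `top_k` of the experts run per token, the layer's parameter count grows with the number of experts while the per-token compute stays close to that of a single dense layer, which is the trade-off the quoted passage about scale and compute budget is pointing at.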