NeuralNovel committed
Commit 15d92e0 • 1 Parent(s): 02bb41f
Update README.md

README.md CHANGED
@@ -118,7 +118,7 @@ In the boundless sands ..
 
 A model to test how MoE will route without square expansion.
 
-
+## "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
 ### (from the MistralAI papers...click the quoted question above to navigate to it directly.)
 
 The scale of a model is one of the most important axes for better model quality. Given a fixed computing budget, training a larger model for fewer steps is better than training a smaller model for more steps.
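For readers following the "What is a Mixture of Experts (MoE)?" link above, here is a minimal sketch of the top-k routing idea the README refers to. This is not the model's actual implementation; the expert count, `top_k` value, hidden size, and the simple linear "experts" below are assumptions made purely for illustration.

```python
# Minimal sketch of top-k expert routing in a Mixture of Experts layer.
# All sizes below (num_experts, top_k, d_model) are illustrative assumptions,
# not the configuration of this model.
import numpy as np

rng = np.random.default_rng(0)

num_experts = 8      # assumed number of experts
top_k = 2            # assumed: each token is routed to its 2 best experts
d_model = 16         # assumed hidden size

# Random stand-ins for the router matrix and the expert weight matrices.
router_w = rng.normal(size=(d_model, num_experts))
expert_w = rng.normal(size=(num_experts, d_model, d_model))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router_w                               # (tokens, num_experts)
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]   # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # Softmax over only the selected experts' logits to get gate weights.
        sel = logits[t, top_idx[t]]
        gate = np.exp(sel - sel.max())
        gate /= gate.sum()
        for g, e in zip(gate, top_idx[t]):
            out[t] += g * (x[t] @ expert_w[e])           # weighted sum of expert outputs
    return out

tokens = rng.normal(size=(4, d_model))                   # 4 example tokens
print(moe_layer(tokens).shape)                            # -> (4, 16)
```

Because only `top_k` of the experts run per token, the layer's parameter count grows with the number of experts while the per-token compute stays close to that of a single dense layer, which is the trade-off the quoted passage about scale and compute budget is pointing at.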