NeuralNovel committed
Commit 02bb41f • 1 Parent(s): 47ddbf3
Update README.md

README.md CHANGED
@@ -116,10 +116,6 @@ model-index:
 
 In the boundless sands ..
 
-[Join our Discord!](https://discord.gg/rJXGjmxqzS)
-
-<a href='https://ko-fi.com/S6S2UH2TC' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi1.png?v=3' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
-
 A model to test how MoE will route without square expansion.
 
 # "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
@@ -139,6 +135,11 @@ At every layer, for every token, a router network chooses two of these groups (t
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6589d7e6586088fd2784a12c/up_I0R2TQGjqTShZp_1Sz.png)
 
+[Join our Discord!](https://discord.gg/rJXGjmxqzS)
+
+<a href='https://ko-fi.com/S6S2UH2TC' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi1.png?v=3' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
+
 Switch Layer
 MoE layer from the [Switch Transformers paper](https://arxiv.org/abs/2101.03961)
 
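The context line of the second hunk quotes the README's routing description: at every layer, for every token, a router network chooses two of the expert groups to process that token. Below is a minimal sketch of such top-2 routing in PyTorch; the class name, hidden size, and expert count are illustrative assumptions, not this model's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2Router(nn.Module):
    """Hypothetical top-2 MoE routing sketch (dims and expert count are assumptions)."""

    def __init__(self, hidden_dim=512, num_experts=4):
        super().__init__()
        # A linear "gate" scores every expert for each token.
        self.gate = nn.Linear(hidden_dim, num_experts)
        # Toy experts: independent feed-forward blocks.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_dim, 4 * hidden_dim),
                nn.GELU(),
                nn.Linear(4 * hidden_dim, hidden_dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_dim)
        logits = self.gate(x)                         # (num_tokens, num_experts)
        top2_vals, top2_idx = logits.topk(2, dim=-1)  # two experts per token
        weights = F.softmax(top2_vals, dim=-1)        # renormalise over the chosen pair
        out = torch.zeros_like(x)
        for k in range(2):                            # first and second choice
            for e, expert in enumerate(self.experts):
                mask = top2_idx[:, k] == e            # tokens routed to expert e at rank k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

tokens = torch.randn(8, 512)
print(Top2Router()(tokens).shape)  # torch.Size([8, 512])
```

For contrast, the Switch layer in the figure caption above routes each token to a single expert (top-1); that simplification is the core change introduced by the Switch Transformers paper.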