NeuralNovel committed
Commit 02bb41f
1 parent: 47ddbf3

Update README.md

Files changed (1): README.md (+5 -4)
README.md CHANGED
@@ -116,10 +116,6 @@ model-index:
 
 In the boundless sands ..
 
-[Join our Discord!](https://discord.gg/rJXGjmxqzS)
-
-<a href='https://ko-fi.com/S6S2UH2TC' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi1.png?v=3' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
-
 A model to test how MoE will route without square expansion.
 
 # "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
@@ -139,6 +135,11 @@ At every layer, for every token, a router network chooses two of these groups (t
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6589d7e6586088fd2784a12c/up_I0R2TQGjqTShZp_1Sz.png)
 
+
+[Join our Discord!](https://discord.gg/rJXGjmxqzS)
+
+<a href='https://ko-fi.com/S6S2UH2TC' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi1.png?v=3' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
+
 Switch Layer
 MoE layer from the [Switch Transformers paper](https://arxiv.org/abs/2101.03961)
 
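The hunk context above mentions that at every layer, for every token, a router network chooses two expert groups. A minimal NumPy sketch of that top-2 routing idea, assuming softmax gating over the two selected experts (the function name and tensor shapes are illustrative, not taken from this model's code):

```python
import numpy as np

def top2_route(logits):
    """For each token, pick the two highest-scoring experts and
    renormalize their router scores with a softmax over just those two.
    Hypothetical helper for illustration only."""
    idx = np.argsort(logits, axis=-1)[:, -2:]           # top-2 expert ids per token
    top = np.take_along_axis(logits, idx, axis=-1)      # their raw router logits
    weights = np.exp(top - top.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the 2 picks
    return idx, weights

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))   # 4 tokens, 8 experts (assumed shapes)
experts, w = top2_route(logits)    # each token gets 2 experts with weights summing to 1
```

In a real MoE layer, each token's output is the weighted sum of the two chosen experts' outputs, so only a fraction of the parameters are active per token.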