RichardForests's Collections
Gemma & MoE
Updated Apr 24
Crystalcareai/GemMoE-Beta-1 • Text Generation • Updated Mar 20 • 130 • 79
JetMoE: Reaching Llama2 Performance with 0.1M Dollars • Paper • 2404.07413 • Published Apr 11 • 36
Multi-Head Mixture-of-Experts • Paper • 2404.15045 • Published Apr 23 • 59