gräfics
graefics
AI & ML interests
None yet
graefics's activity
Any plans to use RMSNorm (or FlashNorm) instead of LayerNorm?
#12 opened about 2 months ago by graefics
Are there two identical embedding tensors, even though embeddings are shared?
#15 opened about 2 months ago by graefics
Any plans to use MQA (multi-query attention) or GQA (grouped-query attention) in the future?
#9 opened 7 months ago by graefics