Collections
Collections including paper arxiv:2311.05884

- Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency
  Paper • 2311.02772 • Published • 3
- Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
  Paper • 2311.05698 • Published • 9
- Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems
  Paper • 2311.05884 • Published • 5
- Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers
  Paper • 2311.10642 • Published • 23

- A Bi-Step Grounding Paradigm for Large Language Models in Recommendation Systems
  Paper • 2308.08434 • Published • 1
- Large Language Models for Generative Recommendation: A Survey and Visionary Discussions
  Paper • 2309.01157 • Published • 1
- LLM-Rec: Personalized Recommendation via Prompting Large Language Models
  Paper • 2307.15780 • Published • 24
- Leveraging Large Language Models for Pre-trained Recommender Systems
  Paper • 2308.10837 • Published • 1

- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 9
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 170
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 14
- Attention Is All You Need
  Paper • 1706.03762 • Published • 44

- Efficient Memory Management for Large Language Model Serving with PagedAttention
  Paper • 2309.06180 • Published • 25
- LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models
  Paper • 2308.16137 • Published • 39
- Scaling Transformer to 1M tokens and beyond with RMT
  Paper • 2304.11062 • Published • 2
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 17

- Eureka: Human-Level Reward Design via Coding Large Language Models
  Paper • 2310.12931 • Published • 26
- GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
  Paper • 2311.04901 • Published • 7
- Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems
  Paper • 2311.05884 • Published • 5
- PolyMaX: General Dense Prediction with Mask Transformer
  Paper • 2311.05770 • Published • 6