- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
  Paper • 2203.05482 • Published • 6
- Diverse Weight Averaging for Out-of-Distribution Generalization
  Paper • 2205.09739 • Published • 1
- Fusing finetuned models for better pretraining
  Paper • 2204.03044 • Published • 5
- Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
  Paper • 2309.07311 • Published • 2
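The model-soups entry above names the core recipe: averaging the weights of several checkpoints fine-tuned from the same pretrained model. A minimal uniform-soup sketch in PyTorch, assuming the checkpoints were saved as plain state dicts with identical keys (the file names below are placeholders):

```python
import torch

def uniform_soup(state_dicts):
    """Average the parameters of several fine-tuned checkpoints
    (a uniform "model soup"). All checkpoints must share the same
    architecture and parameter names."""
    soup = {}
    for key in state_dicts[0]:
        # Stack the same tensor from every checkpoint and take the mean.
        soup[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return soup

# Placeholder paths; substitute your own fine-tuned runs.
paths = ["finetune_lr1e-5.pt", "finetune_lr3e-5.pt", "finetune_seed2.pt"]
state_dicts = [torch.load(p, map_location="cpu") for p in paths]
# model.load_state_dict(uniform_soup(state_dicts))
```

Because only the averaged weights are served, inference cost is the same as for a single model, unlike a conventional ensemble.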
Collections including paper arxiv:2312.06681
- aMUSEd: An Open MUSE Reproduction
  Paper • 2401.01808 • Published • 28
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
  Paper • 2401.00604 • Published • 4
- LARP: Language-Agent Role Play for Open-World Games
  Paper • 2312.17653 • Published • 30
- System 2 Attention (is something you might need too)
  Paper • 2311.11829 • Published • 39
- TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
  Paper • 2311.11315 • Published • 6
- Alignment for Honesty
  Paper • 2312.07000 • Published • 11
- Steering Llama 2 via Contrastive Activation Addition
  Paper • 2312.06681 • Published • 11
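The steering entry above (the paper this page collects, 2312.06681) builds a per-layer steering vector from the mean activation difference between contrastive prompt pairs and adds it back into the residual stream during generation. A minimal sketch with Hugging Face transformers, assuming a Llama-2 chat checkpoint; the layer index, scaling factor, and example prompts are illustrative placeholders, not the paper's settings:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
layer = 13  # illustrative intermediate layer

def last_token_hidden(prompt: str) -> torch.Tensor:
    """Residual-stream activation of the final prompt token after `layer` blocks."""
    ids = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[layer][0, -1, :]

# Contrastive pairs: same question, one answer showing the target behaviour,
# one lacking it (illustrative strings, not the paper's dataset).
pairs = [("Question: ...\nAnswer: (A)", "Question: ...\nAnswer: (B)")]
steering = torch.stack(
    [last_token_hidden(pos) - last_token_hidden(neg) for pos, neg in pairs]
).mean(dim=0)

def add_vector(module, inputs, output):
    # Decoder blocks return a tuple; the hidden states are element 0.
    hidden = output[0] + 2.0 * steering.to(output[0].dtype)  # 2.0 = steering strength
    return (hidden,) + output[1:]

# hidden_states[layer] is the output of decoder block layer - 1, so hook that block.
handle = model.model.layers[layer - 1].register_forward_hook(add_vector)
ids = tok("Tell me about yourself.", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**ids, max_new_tokens=64)[0]))
handle.remove()
```

The hook adds the scaled vector at every position on each forward pass; removing the hook restores the unsteered model.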
- Prompt Cache: Modular Attention Reuse for Low-Latency Inference
  Paper • 2311.04934 • Published • 28
- Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
  Paper • 2311.08692 • Published • 12
- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 118
- Memory Augmented Language Models through Mixture of Word Experts
  Paper • 2311.10768 • Published • 16