Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.08541

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 16
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 9
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 11
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 47

Img Gen Foundational

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3 • 64
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5 • 57
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 16
Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16 • 36

visual generation

Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16 • 36

Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16 • 36

Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16 • 36
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

Paper • 2403.03194 • Published Mar 5 • 12

Vision Foundation Models

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3 • 28
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

Paper • 2401.05252 • Published Jan 10 • 47
Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16 • 36
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17 • 59

paper to review

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

Paper • 2312.02087 • Published Dec 4, 2023 • 20
FaceStudio: Put Your Face Everywhere in Seconds

Paper • 2312.02663 • Published Dec 5, 2023 • 30
Orthogonal Adaptation for Modular Customization of Diffusion Models

Paper • 2312.02432 • Published Dec 5, 2023 • 12
ReconFusion: 3D Reconstruction with Diffusion Priors

Paper • 2312.02981 • Published Dec 5, 2023 • 8

Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 82
Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16 • 36

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs