-
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Paper • 2403.05135 • Published • 42 -
Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation
Paper • 2303.00848 • Published -
Scalable Diffusion Models with Transformers
Paper • 2212.09748 • Published • 16 -
High-Resolution Image Synthesis with Latent Diffusion Models
Paper • 2112.10752 • Published • 11
Collections
Discover the best community collections!
Collections including paper arxiv:2310.17680
-
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper • 2403.03163 • Published • 93 -
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Paper • 2403.02545 • Published • 15 -
StarCoder: may the source be with you!
Paper • 2305.06161 • Published • 29 -
Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models
Paper • 2308.10462 • Published • 1
-
AtP*: An efficient and scalable method for localizing LLM behaviour to components
Paper • 2403.00745 • Published • 11 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 602 -
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Paper • 2402.16840 • Published • 23 -
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper • 2402.13753 • Published • 111
-
CodeBERT: A Pre-Trained Model for Programming and Natural Languages
Paper • 2002.08155 • Published • 2 -
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Paper • 2402.14658 • Published • 82 -
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper • 2310.17680 • Published • 69 -
CodePlan: Repository-level Coding using LLMs and Planning
Paper • 2309.12499 • Published • 73
-
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 99 -
How to Train Data-Efficient LLMs
Paper • 2402.09668 • Published • 38 -
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper • 2402.10193 • Published • 17 -
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Paper • 2402.09727 • Published • 35
-
StarCoder: may the source be with you!
Paper • 2305.06161 • Published • 29 -
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Paper • 2306.08568 • Published • 28 -
SantaCoder: don't reach for the stars!
Paper • 2301.03988 • Published • 7 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 46
-
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
Paper • 2312.09608 • Published • 13 -
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper • 2310.17680 • Published • 69 -
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
Paper • 2310.17994 • Published • 8 -
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
Paper • 2401.02677 • Published • 21
-
Attention Is All You Need
Paper • 1706.03762 • Published • 44 -
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 14 -
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Paper • 1907.11692 • Published • 7 -
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 14
-
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 40 -
Data Filtering Networks
Paper • 2309.17425 • Published • 6 -
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper • 2311.01282 • Published • 35 -
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper • 2311.00945 • Published • 14