Collections
Collections including paper arxiv:2404.08197
-
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper • 2404.05961 • Published • 64
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • 2404.07143 • Published • 103
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
Paper • 2404.08197 • Published • 27
Pre-training Small Base LMs with Fewer Tokens
Paper • 2404.08634 • Published • 34
-
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Paper • 2404.05014 • Published • 53
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Paper • 2404.09967 • Published • 20
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
Paper • 2404.08197 • Published • 27
-
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Paper • 2404.04125 • Published • 27
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
Paper • 2404.08197 • Published • 27
Probing the 3D Awareness of Visual Foundation Models
Paper • 2404.08636 • Published • 12
AM-RADIO: Agglomerative Model -- Reduce All Domains Into One
Paper • 2312.06709 • Published • 1
-
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Paper • 2403.19651 • Published • 23
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Paper • 2404.04125 • Published • 27
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
Paper • 2404.08197 • Published • 27
Gecko: Versatile Text Embeddings Distilled from Large Language Models
Paper • 2403.20327 • Published • 47
-
World Model on Million-Length Video And Language With RingAttention
Paper • 2402.08268 • Published • 36
Improving Text Embeddings with Large Language Models
Paper • 2401.00368 • Published • 79
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 99
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48