ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning Paper • 2411.05003 • Published 5 days ago • 63
Retrieval Head Mechanistically Explains Long-Context Factuality Paper • 2404.15574 • Published Apr 24 • 2
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations Paper • 2410.18860 • Published 19 days ago • 8
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs Paper • 2410.18451 • Published 20 days ago • 13
Why Does the Effective Context Length of LLMs Fall Short? Paper • 2410.18745 • Published 19 days ago • 16
Distill Visual Chart Reasoning Ability from LLMs to MLLMs Paper • 2410.18798 • Published 19 days ago • 19
Unbounded: A Generative Infinite Game of Character Life Simulation Paper • 2410.18975 • Published 19 days ago • 34
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control Paper • 2410.13830 • Published 26 days ago • 23
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models Paper • 2410.07133 • Published Oct 9 • 18
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation Paper • 2410.05363 • Published Oct 7 • 44
Training-free Long Video Generation with Chain of Diffusion Model Experts Paper • 2408.13423 • Published Aug 24 • 20
DragAnything: Motion Control for Anything using Entity Representation Paper • 2403.07420 • Published Mar 12 • 13
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models Paper • 2403.06098 • Published Mar 10 • 15
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Paper • 2402.19479 • Published Feb 29 • 32
DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model Paper • 2402.17412 • Published Feb 27 • 21
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners Paper • 2402.17723 • Published Feb 27 • 16