- 3D Congealing: 3D-Aware Image Alignment in the Wild
  Paper • 2404.02125 • Published • 7
- SpatialTracker: Tracking Any 2D Pixels in 3D Space
  Paper • 2404.04319 • Published • 23
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 11
- MeshLRM: Large Reconstruction Model for High-Quality Mesh
  Paper • 2404.12385 • Published • 26

Collections
Collections including paper arxiv:2404.09956
- WavLLM: Towards Robust and Adaptive Speech Large Language Model
  Paper • 2404.00656 • Published • 10
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 11
- Long-form music generation with latent diffusion
  Paper • 2404.10301 • Published • 24

- Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
  Paper • 1712.05884 • Published • 2
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 11
- Music Consistency Models
  Paper • 2404.13358 • Published • 12
- FlashSpeech: Efficient Zero-Shot Speech Synthesis
  Paper • 2404.14700 • Published • 29

- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 48
- HyperCLOVA X Technical Report
  Paper • 2404.01954 • Published • 19
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 11
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 82

- Fine-Tuning Language Models from Human Preferences
  Paper • 1909.08593 • Published • 3
- Transforming and Combining Rewards for Aligning Large Language Models
  Paper • 2402.00742 • Published • 11
- Leverage the Average: an Analysis of KL Regularization in RL
  Paper • 2003.14089 • Published • 2
- Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
  Paper • 2404.01258 • Published • 10

- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 48
- ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
  Paper • 2402.09320 • Published • 6
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 39
- Dueling RL: Reinforcement Learning with Trajectory Preferences
  Paper • 2111.04850 • Published • 2

- UniAudio: An Audio Foundation Model Toward Universal Audio Generation
  Paper • 2310.00704 • Published • 19
- Structural Similarities Between Language Models and Neural Response Measurements
  Paper • 2306.01930 • Published • 2
- Streaming Transformer ASR with Blockwise Synchronous Beam Search
  Paper • 2006.14941 • Published • 2
- NU-GAN: High resolution neural upsampling with GAN
  Paper • 2010.11362 • Published • 2

- U-Net: Convolutional Networks for Biomedical Image Segmentation
  Paper • 1505.04597 • Published • 7
- Image Segmentation using U-Net Architecture for Powder X-ray Diffraction Images
  Paper • 2310.16186 • Published • 2
- H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
  Paper • 1709.07330 • Published • 2
- Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans
  Paper • 1801.08599 • Published • 2

- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 15
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 8
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 10
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 13