Michael Hale's picture

123 12

Michael Hale

mhale

·

AI & ML interests

None yet

Organizations

mhale's activity

upvoted a paper 8 days ago

RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

Paper • 2405.17401 • Published May 27 • 5

upvoted 5 papers 20 days ago

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published 28 days ago • 61

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published 28 days ago • 84

LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation

Paper • 2408.13252 • Published 27 days ago • 23

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published 28 days ago • 109

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published 24 days ago • 38

upvoted a paper 22 days ago

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published 24 days ago • 119

upvoted 14 papers about 1 month ago

ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Paper • 2408.06070 • Published Aug 12 • 52

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12 • 55

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12 • 114

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13 • 65

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9 • 46

Achieving Human Level Competitive Robot Table Tennis

Paper • 2408.03906 • Published Aug 7 • 26

An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion

Paper • 2408.03178 • Published Aug 6 • 35

IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

Paper • 2408.03209 • Published Aug 6 • 21

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3 • 74

MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization

Paper • 2408.02555 • Published Aug 5 • 28

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1 • 103

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31 • 73

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

Paper • 2408.00653 • Published Aug 1 • 27

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 102

upvoted 2 papers 2 months ago

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 153

Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15 • 52

upvoted 2 papers 3 months ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20 • 85

The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6 • 52

upvoted 6 papers 4 months ago

Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27 • 51

Phased Consistency Model

Paper • 2405.18407 • Published May 28 • 46

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19 • 149

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16 • 125

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Paper • 2405.10314 • Published May 16 • 43

Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning

Paper • 2405.08054 • Published May 13 • 21

upvoted 9 papers 5 months ago

Extending Llama-3's Context Ten-Fold Overnight

Paper • 2404.19553 • Published Apr 30 • 32

MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

Paper • 2404.19759 • Published Apr 30 • 24

Interactive3D: Create What You Want by Interactive 3D Generation

Paper • 2404.16510 • Published Apr 25 • 18

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22 • 43

Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video

Paper • 2404.09833 • Published Apr 15 • 29

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 83

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

Paper • 2404.07987 • Published Apr 11 • 47

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 103

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

Paper • 2404.05961 • Published Apr 9 • 63

upvoted 4 papers 6 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 132

Design2Code: How Far Are We From Automating Front-End Engineering?

Paper • 2403.03163 • Published Mar 5 • 93

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 63

StableDrag: Stable Dragging for Point-based Image Editing

Paper • 2403.04437 • Published Mar 7 • 25

upvoted 2 papers 7 months ago

Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23 • 70

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15 • 35

upvoted 7 papers 8 months ago

InstantID: Zero-shot Identity-Preserving Generation in Seconds

Paper • 2401.07519 • Published Jan 15 • 51

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

Paper • 2401.16420 • Published Jan 29 • 54

Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All

Paper • 2401.13795 • Published Jan 24 • 64

Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23 • 86

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

Paper • 2401.12474 • Published Jan 23 • 33

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19 • 58

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 140

upvoted a paper 9 months ago

Style Aligned Image Generation via Shared Attention

Paper • 2312.02133 • Published Dec 4, 2023 • 8

upvoted 6 papers 10 months ago

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs

Paper • 2311.13600 • Published Nov 22, 2023 • 41

PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

Paper • 2311.12198 • Published Nov 20, 2023 • 22

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Paper • 2311.12775 • Published Nov 21, 2023 • 28

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Paper • 2311.10709 • Published Nov 17, 2023 • 24

Diffusion Model Alignment Using Direct Preference Optimization

Paper • 2311.12908 • Published Nov 21, 2023 • 47

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Paper • 2311.10122 • Published Nov 16, 2023 • 26