- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 45
- On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
  Paper • 2311.05332 • Published • 9
- SoundCam: A Dataset for Finding Humans Using Room Acoustics
  Paper • 2311.03517 • Published • 10
Chaolei Tan
Chaolei
AI & ML interests
Computer Vision, Multimodal Learning, Video Understanding
Organizations
None yet
Collections
10
models
None public yet
datasets
None public yet