CogVLM2: Visual Language Models for Image and Video Understanding • Paper • 2408.16500 • Published Aug 29
Building and better understanding vision-language models: insights and future directions • Paper • 2408.12637 • Published Aug 22
Towards a Unified View of Preference Learning for Large Language Models: A Survey • Paper • 2409.02795 • Published Sep 4
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale • Paper • 2409.08264 • Published 25 days ago
Training Language Models to Self-Correct via Reinforcement Learning • Paper • 2409.12917 • Published 18 days ago
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning • Paper • 2409.14674 • Published 15 days ago