AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks? Paper • 2407.15711 • Published Jul 22 • 9
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models Paper • 2401.13919 • Published Jan 25 • 23
view article Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • May 22 • 24
InternVL 1.0 Collection Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks • 16 items • Updated Jun 27 • 15
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 1 day ago • 460
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time Paper • 2404.10667 • Published Apr 16 • 15
Vector-io compatible Datasets Collection These datasets can be loaded into your vector database with a single line bash command • 15 items • Updated about 23 hours ago • 3