Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 80
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published 20 days ago • 29
DocLayout-YOLO Collection Dataset and model for DocLayout-YOLO • 9 items • Updated 16 days ago • 11
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation Paper • 2410.00890 • Published Oct 1 • 17