Baifeng Shi

bfshi

https://bfshi.github.io

AI & ML interests

computer vision

bfshi's activity

upvoted a paper 13 days ago

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset

Paper • 2410.22325 • Published 14 days ago • 9

upvoted a paper 21 days ago

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Paper • 2410.16268 • Published 22 days ago • 65

upvoted a paper about 1 month ago

PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation

Paper • 2410.01680 • Published Oct 2 • 32

upvoted 3 papers 3 months ago

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Paper • 2408.13257 • Published Aug 23 • 25

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 117

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Paper • 2408.10188 • Published Aug 19 • 51

upvoted 3 papers 4 months ago

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 38

VideoGameBunny: Towards vision assistants for video games

Paper • 2407.15295 • Published Jul 21 • 21

Shape of Motion: 4D Reconstruction from a Single Video

Paper • 2407.13764 • Published Jul 18 • 19

upvoted a paper 5 months ago

OpenVLA: An Open-Source Vision-Language-Action Model

Paper • 2406.09246 • Published Jun 13 • 36

upvoted a paper 8 months ago

When Do We Not Need Larger Vision Models?

Paper • 2403.13043 • Published Mar 19 • 25

upvoted a paper 9 months ago

Humanoid Locomotion as Next Token Prediction

Paper • 2402.19469 • Published Feb 29 • 26

upvoted a paper 10 months ago

Rethinking Patch Dependence for Masked Autoencoders

Paper • 2401.14391 • Published Jan 25 • 23