VLM-RLAIF: Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback

Model Summary

This Hub repository contains a Hugging Face transformers implementation of the VLM-RLAIF model from the SNUMPR lab.

VLM-RLAIF-7b [HF]: 7B RLAIF model
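
As a minimal sketch of how the checkpoint files might be fetched locally, the snippet below uses `snapshot_download` from the `huggingface_hub` package. The repository id `SNUMPR/vlm_rlaif_video_llava_7b` is taken from the Hub page for this model; the helper name `download_checkpoint` and its `local_dir` parameter are hypothetical, introduced only for illustration. How the downloaded weights are then loaded for video inference depends on the authors' code and is not covered here.

```python
# Hedged sketch: fetch the VLM-RLAIF checkpoint files from the Hugging Face Hub.
# Assumption: the repository id below is the one shown on the Hub page.
REPO_ID = "SNUMPR/vlm_rlaif_video_llava_7b"


def download_checkpoint(local_dir=None):
    """Download every file in the repository and return the local directory path.

    `download_checkpoint` is a hypothetical helper, not part of the authors' code.
    """
    # Imported lazily so this module can be imported even when
    # huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files are stored in the default Hub cache.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint()` requires network access and enough disk space for a 7B-parameter checkpoint; the returned path can then be passed to whichever loading code the model requires.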

Collection including SNUMPR/vlm_rlaif_video_llava_7b

VLM-RLAIF

Collection

Repository for the ACL 2024 paper "Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback" • 10 items • Updated Aug 6