process_video() got an unexpected keyword argument 'va'

#1
by fragrantly - opened

I downloaded the model weights and set up the environment, but I keep getting this error:
Traceback (most recent call last):
  File "av_inference.py", line 66, in <module>
    inference(args)
  File "av_inference.py", line 26, in inference
    audio_video_tensor = preprocess(audio_video_path, va=True if args.modal_type == "av" else False)
TypeError: process_video() got an unexpected keyword argument 'va'
I tried a few workarounds on my own, but the model still cannot process audio.
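
A TypeError like this one usually means the script and the installed preprocessing code come from different versions. The following is a minimal sketch for checking whether a callable accepts a given keyword before calling it. The two `process_video_*` functions are hypothetical stand-ins written for illustration; their signatures are assumptions and may not match the repository.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer preprocessing function.
# The real signatures in the repository may differ.
def process_video_old(video_path, aspect_ratio=None):
    return video_path


def process_video_new(video_path, aspect_ratio=None, va=False):
    return video_path
```

Calling `accepts_keyword(preprocess, "va")` on the function the script actually imports would show whether the installed code matches the script that calls it.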

Language Technology Lab at Alibaba DAMO Academy org

Thanks for your attention! To run inference for audio-visual tasks, clone the repository and switch to the audio_visual branch (https://github.com/DAMO-NLP-SG/VideoLLaMA2/tree/audio_visual).

I cloned the audio-related branch, but the processor still can't retrieve the audio information, resulting in an error:
Traceback (most recent call last):
  File "inference.py", line 68, in <module>
    inference(args)
  File "inference.py", line 32, in inference
    preprocess = processor['audio' if args.modal_type == "a" else "video"]
KeyError: 'audio'
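
The KeyError shows that the `processor` dictionary being loaded has no 'audio' entry. The following is a minimal sketch of a lookup that reports which keys are present, which can help tell whether the audio code is being picked up. `get_preprocessor` is a hypothetical helper, and the dictionary contents in the usage note are assumptions for illustration, not the repository's actual structure.

```python
def get_preprocessor(processor, modal_type):
    """Pick the preprocessing function for a modality, with a clearer error.

    `processor` is assumed to be a dict mapping modality names to callables.
    """
    key = "audio" if modal_type == "a" else "video"
    if key not in processor:
        available = ", ".join(sorted(processor)) or "none"
        raise KeyError(
            f"no {key!r} preprocessor found; available keys: {available}. "
            "The imported code may not come from the audio_visual branch."
        )
    return processor[key]
```

For example, with a hypothetical `processor = {"image": f, "video": g}`, calling `get_preprocessor(processor, "a")` raises a KeyError that lists `image, video`, which points to code without audio support being imported.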

Language Technology Lab at Alibaba DAMO Academy org

Please check that you followed the steps below and that your local checkout actually contains the audio code. If the problem persists, please share more screenshots of the files so that I can locate the problem.

git clone https://github.com/DAMO-NLP-SG/VideoLLaMA2
cd VideoLLaMA2
git checkout audio_visual
pip install -r requirements.txt
pip install flash-attn==2.5.8 --no-build-isolation
pip install opencv-python==4.5.5.64
apt-get update && apt-get install ffmpeg libsm6 libxext6 -y
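
To confirm that the checkout step took effect, the active branch can be read from Git. This is a minimal sketch, assuming `git` is on the PATH and the function is pointed at the cloned directory. `current_branch` is a hypothetical helper, not part of the repository.

```python
import subprocess


def current_branch(repo_dir=".", run=subprocess.run):
    """Return the name of the branch checked out in repo_dir."""
    result = run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return result.stdout.strip()


# Usage, run from inside the cloned repository:
#   if current_branch() != "audio_visual":
#       print("Not on the audio_visual branch; run: git checkout audio_visual")
```

If another copy of the package is already installed in the environment, Python may import that copy instead of this checkout, so it is also worth checking which path the package is imported from.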
