Can't run the model

by MohamedRashad - opened Apr 4, 2023

Apr 4, 2023

I use this command for inference

CUDA_VISIBLE_DEVICES=0 python llama_inference.py elinas/vicuna-13b-4bit --wbits 4 --groupsize 128 --load vicuna-13b-4bit/vicuna-13b-4bit-128g.safetensors --text "this is llama"

and it gives me an error when loading the state dict.

elinas

Owner Apr 4, 2023

See this in the README and ensure you're on the stable commit. https://huggingface.co/elinas/vicuna-13b-4bit#update-2023-04-03

nealchandra

Apr 4, 2023

With the stable commit indicated in the readme, I'm able to load the model fine but inference still fails with the following error:

TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:
    1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, arg4: at::Tensor) -> None

Any ideas? Is there a particular commit of transformers you are pinned to?

elinas

Owner Apr 4, 2023

This does not support llama.cpp; it is for GPTQ via CUDA (or Triton).

nealchandra

Apr 4, 2023

edited Apr 5, 2023

Right, I understand -- this is running on an A5000 in the cloud. Perhaps it's not using the correct device, I will investigate a bit further.

EDIT -- was able to get it working, needed to rerun setup_cuda

MohamedRashad

Apr 4, 2023

fatal: reference is not a tree: a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773

I am getting a reference error when checking out the commit specified.

elinas

Owner Apr 4, 2023

I have not seen that error format before so I assumed you were using the Python llama.cpp wrapper.

elinas

Owner Apr 4, 2023

> I am getting a reference error when checking out the commit specified

Paste the topmost entry from the output of git log.
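A minimal sketch of how to get that entry, assuming it is run inside the local GPTQ-for-LLaMa clone (the function name is made up for illustration):

```shell
# Hypothetical helper: report the commit the working tree is currently on.
# Assumes the current directory is inside the git clone.
show_head_commit() {
    # -1 limits the log to the most recent entry (the commit HEAD points to).
    git log -1
}

# Usage: show_head_commit
```

The output starts with the full commit hash, which can be compared against the one named in the README.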

nealchandra

Apr 4, 2023

> fatal: reference is not a tree: a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773
> I am getting a reference error when checking out the commit specified

I think you just need to do git fetch origin a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773
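Roughly, the fetch-then-checkout sequence could look like the sketch below. It assumes the remote is named origin and that the remote still serves the requested commit; if the commit has been removed upstream, the fetch fails and nothing is checked out. The function name is made up for illustration.

```shell
# Hypothetical helper: fetch a commit missing from the local clone, then check it out.
# Assumes the current directory is the git clone and the remote is called "origin".
fetch_and_checkout() {
    commit="$1"
    # Ask the remote for this exact commit; stop if the remote cannot provide it.
    git fetch origin "$commit" || return 1
    # Check out the fetched commit (this leaves the repo in detached-HEAD state).
    git checkout "$commit"
}

# Usage: fetch_and_checkout a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773
```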

MohamedRashad

Apr 5, 2023

I was able to fix the quant problem, but now it gives me this error:

Missing key(s) in state_dict

elinas

Owner Apr 5, 2023

@nealchandra Thanks. I see that commit is no longer in the base repo (I haven't fetched, so I can still check out that branch). I have updated the instructions to just use the fork.

elinas

Owner Apr 5, 2023

edited Apr 5, 2023

> I was able to fix the quant problem, but now it gives me this error:
> Missing key(s) in state_dict

Make sure you have followed the steps above, such as running python setup_cuda.py install and installing everything in requirements.txt.

https://github.com/oobabooga/GPTQ-for-LLaMa#installation
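As a rough sketch, the two steps mentioned above can be run together like this. It assumes the commands are run from the root of the GPTQ-for-LLaMa clone with the target Python environment active; the function name is made up for illustration.

```shell
# Hypothetical helper: reinstall dependencies and rebuild the CUDA kernel.
# Assumes the current directory is the root of the GPTQ-for-LLaMa clone.
rebuild_gptq_kernel() {
    # Install the Python dependencies listed by the repo; stop if this fails.
    pip install -r requirements.txt || return 1
    # Rebuild and install the CUDA extension.
    python setup_cuda.py install
}

# Usage: rebuild_gptq_kernel
```

It is worth rerunning this after switching commits, since rerunning setup_cuda is what resolved the vecquant4matmul error earlier in this thread.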
