RuntimeError: weight model.vision_model.embeddings.position_ids does not exist

by jinyolim - opened Sep 19, 2023

Discussion

jinyolim

Sep 19, 2023

Got this error using the provided SageMaker SDK script. Is this a known bug in TGI 1.0.3?

VictorSanh

Sep 20, 2023

Will get to the bottom of this tomorrow. It does seem surprising. Thanks for reporting!

rthamman

Sep 21, 2023

@VictorSanh I have the same issue. Please let me know.

VictorSanh

Sep 25, 2023

Ok, I understand the situation now.

In TGI (file idefics_vision.py), we define the attribute position_ids as self.position_ids = weights.get_tensor(f"{prefix}.position_ids"), which means that during model initialization we look for a tensor called position_ids in the checkpoint.

The instruct models have that weight tensor, but not the base ones.

However, in HF Transformers, we define position_ids as a registered buffer (file idefics/vision.py: self.register_buffer("position_ids", torch.arange(self.num_positions).expand((1, -1)), persistent=False)), which means that position_ids is created automatically at initialization.

My suggestion would be to correct the way we initialize position_ids in TGI (I think it was a mistake on my part not to use a registered buffer). Could you confirm that this is the right course of action, @Narsil?
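
For readers following along, the difference between the two initialization strategies can be sketched without any dependencies. This is only an illustration: the `Weights` class and both helper functions are hypothetical stand-ins (not TGI's or Transformers' actual code), and plain lists replace tensors.

```python
class Weights:
    """Hypothetical stand-in for a checkpoint reader; not TGI's real class."""

    def __init__(self, tensors):
        self.tensors = tensors

    def get_tensor(self, name):
        # Strict lookup: a missing name is a hard error at model init.
        if name not in self.tensors:
            raise RuntimeError(f"weight {name} does not exist")
        return self.tensors[name]


def position_ids_from_file(weights, prefix):
    # Strategy 1: always read position_ids from the checkpoint.
    return weights.get_tensor(f"{prefix}.position_ids")


def position_ids_generated(num_positions):
    # Strategy 2: build the ids at init time. This is a plain-list
    # analogue of torch.arange(num_positions).expand((1, -1)).
    return [list(range(num_positions))]


PREFIX = "model.vision_model.embeddings"

# Assumed toy checkpoints: one that stores position_ids, one that does not.
instruct_like = Weights({f"{PREFIX}.position_ids": [[0, 1, 2, 3]]})
base_like = Weights({})
```

With these toy checkpoints, `position_ids_from_file(base_like, PREFIX)` raises the same kind of RuntimeError as in the thread title, while `position_ids_generated(4)` works regardless of what the checkpoint contains.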

Narsil

HuggingFaceM4 org Sep 26, 2023

This would work, however I don't think it's a great idea.

Either we should always look for them on file, or never. We had a similar thing with Llama and inv_freq. The issue with doing it "sometimes" is that some models might save a different buffer than the one we generate automatically, which makes it super hard to debug.

Are those position ids always an arange? If yes, we should just use that and drop the loading part, no? (And we could probably do the same in transformers.)
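
The concern about loading "sometimes" can be illustrated with a small sketch. Both functions and the checkpoint dictionaries below are hypothetical, and the "odd" values are made up purely to show the failure mode; nothing here is taken from a real model.

```python
def load_or_generate(saved, num_positions):
    """'Sometimes' strategy: use the saved buffer if present, else generate."""
    if "position_ids" in saved:
        return saved["position_ids"]
    return [list(range(num_positions))]


def always_generate(saved, num_positions):
    """'Never load' strategy: ignore the checkpoint and build an arange."""
    return [list(range(num_positions))]


# Two hypothetical checkpoints of the same architecture.
clean_checkpoint = {}                              # no buffer saved
odd_checkpoint = {"position_ids": [[0, 0, 1, 2]]}  # made-up, differs from arange

# With the fallback, the two checkpoints silently get different ids.
mixed_a = load_or_generate(clean_checkpoint, 4)
mixed_b = load_or_generate(odd_checkpoint, 4)

# Always generating gives the same ids for both.
fixed_a = always_generate(clean_checkpoint, 4)
fixed_b = always_generate(odd_checkpoint, 4)
```

Here `mixed_a` and `mixed_b` differ without any error being raised, which is the hard-to-debug situation described above, whereas `fixed_a` and `fixed_b` are identical.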

VictorSanh

Sep 26, 2023

Got it!

Fixed it here: https://github.com/huggingface/text-generation-inference/pull/1064

It uses the same logic as in transformers.

VictorSanh

Oct 4, 2023

@jinyolim just want to make sure you saw this: Nicolas pushed the fix in TGI 1.1.0. Could you report back on whether you are still seeing the same bug?
