size mismatch
Hi. I recently tried to use this model as the LLM backbone of LLaVA-NeXT in order to finetune it. However, I consistently get size-mismatch errors such as:
size mismatch for model.layers.31.self_attn.q_proj.weight: copying a param with shape torch.Size([4096, 3072]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for model.layers.31.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for model.layers.31.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for model.layers.31.self_attn.o_proj.weight: copying a param with shape torch.Size([3072, 4096]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
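For reference, the mismatched numbers line up exactly with what you'd expect if the checkpoint uses an explicit head_dim of 128 while the loading code derives head_dim as hidden_size // num_attention_heads. This is a sketch of that arithmetic, assuming the checkpoint's config.json has hidden_size=3072, num_attention_heads=32, num_key_value_heads=8, and head_dim=128 (values assumed here for illustration):

```python
# Assumed values from the checkpoint's config.json (not verified here):
hidden_size = 3072
num_attention_heads = 32
num_key_value_heads = 8
head_dim = 128  # explicit in the config, decoupled from hidden_size

# Projection sizes the checkpoint actually stores
q_rows_ckpt = num_attention_heads * head_dim    # 32 * 128 = 4096
kv_rows_ckpt = num_key_value_heads * head_dim   # 8 * 128 = 1024

# Projection sizes derived if head_dim in the config is ignored
derived_head_dim = hidden_size // num_attention_heads   # 3072 // 32 = 96
q_rows_derived = num_attention_heads * derived_head_dim   # 32 * 96 = 3072
kv_rows_derived = num_key_value_heads * derived_head_dim  # 8 * 96 = 768

print(q_rows_ckpt, q_rows_derived)    # 4096 3072 — matches the q_proj error
print(kv_rows_ckpt, kv_rows_derived)  # 1024 768 — matches the k/v_proj errors
```

The checkpoint shapes (4096, 1024) and the current-model shapes (3072, 768) in the error log match these two calculations exactly, which suggests the loading side is not honoring the config's explicit head_dim.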
To isolate the problem, I tried loading just the model and the tokenizer with this minimal snippet:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("nvidia/Llama-3.1-Minitron-4B-Width-Base")
model = AutoModel.from_pretrained("nvidia/Llama-3.1-Minitron-4B-Width-Base")
print("Tokenizer vocab size:", tokenizer.vocab_size)
print("Model embedding size:", model.config.vocab_size)
but I still get the same error messages.
Could this be a mismatch between the checkpoint and the library versions in my virtual environment? I'm happy to provide more information if needed.
Thank you.