Can't run inference
#1 opened by Tibbnak
When trying to run inference on the Q4_K_M GGUF with llama.cpp (latest compiled server.exe):
llama_model_load: error loading model: create_tensor: tensor 'blk.0.attn_q.weight' has wrong shape; expected 3072, 3072, got 3072, 4096, 1, 1
llama_load_model_from_file: failed to load model
Hmm, I'm getting the same issue; let me investigate.
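One way to start investigating is to check what the file itself claims before llama.cpp gets to the per-tensor shape check. Below is a minimal, stdlib-only sketch of parsing the fixed GGUF header (magic, version, tensor count, metadata KV count, little-endian); the demo bytes and the counts in them are made up for illustration, not taken from this model.

```python
import struct

GGUF_MAGIC = b"GGUF"

def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: 4-byte magic, u32 version,
    u64 tensor_count, u64 metadata_kv_count, all little-endian."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={data[:4]!r})")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {"version": version,
            "tensor_count": tensor_count,
            "kv_count": kv_count}

# Hypothetical header for demonstration: version 3, 291 tensors, 24 KV pairs.
demo = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
print(parse_gguf_header(demo))
```

After the header come the metadata KV pairs and the tensor-info records (name, dimensions, type, offset), which is where a shape like `3072, 4096, 1, 1` for `blk.0.attn_q.weight` would actually be stored; for real files, the `gguf` Python package shipped with llama.cpp (`gguf.GGUFReader`) can dump those tensor shapes directly.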