Can't run inference
#1 opened by Tibbnak
When trying to run inference on the Q4_K_M GGUF with llama.cpp (latest compiled server.exe):
llama_model_load: error loading model: create_tensor: tensor 'blk.0.attn_q.weight' has wrong shape; expected 3072, 3072, got 3072, 4096, 1, 1
llama_load_model_from_file: failed to load model
Hmm, I'm getting the same issue; let me investigate.
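One way to start investigating is to check what the file itself claims before llama.cpp gets to the per-tensor shape check. Below is a minimal, stdlib-only sketch of parsing the fixed GGUF header (magic, version, tensor count, metadata KV count, little-endian); the demo bytes and the counts in them are made up for illustration, not taken from this model.

```python
import struct

GGUF_MAGIC = b"GGUF"

def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: 4-byte magic, u32 version,
    u64 tensor_count, u64 metadata_kv_count, all little-endian."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={data[:4]!r})")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {"version": version,
            "tensor_count": tensor_count,
            "kv_count": kv_count}

# Hypothetical header for demonstration: version 3, 291 tensors, 24 KV pairs.
demo = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
print(parse_gguf_header(demo))
```

After the header come the metadata KV pairs and the tensor-info records (name, dimensions, type, offset), which is where a shape like `3072, 4096, 1, 1` for `blk.0.attn_q.weight` would actually be stored; for real files, the `gguf` Python package shipped with llama.cpp (`gguf.GGUFReader`) can dump those tensor shapes directly.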