Crashes when I try to load it in koboldcpp

#1
by lemon07r - opened

Basically the title; same issue with bartowski's quants too. Maybe koboldcpp doesn't have the required upstream merges from llama.cpp yet? Wondering if someone can confirm. I tested LostRuins' koboldcpp with both OpenBLAS and Vulkan, and YellowRose's hipBLAS fork; neither can load this model. Tested with Q4_K_M.

Yes, we'll have to wait a bit until KoboldCPP updates this change:

https://github.com/LostRuins/koboldcpp/commit/889bdd76866ea31a7625ec2dcea63ff469f3e981

@Elfrino I compiled llama.cpp with the latest code. The error persists.

Hmm. The same error I presume or a different one?

My error is check_tensor_dims: tensor 'token_embd.weight' has wrong shape

🤔Did you load successfully?

I just tried it with the newly released KoboldCPP, and it works fine :)

https://github.com/LostRuins/koboldcpp/releases

Ah, I see what's happening.

GGUFs in bartowski/35b-beta-long-GGUF load correctly.

GGUFs in this repo are mistakenly recognized as architecture llama (it should be command-r).
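If you want to verify which architecture a GGUF file advertises before loading it, you can read the header metadata directly. Below is a minimal sketch of a GGUF v3 header parser (field layout per the GGUF spec: magic, version, tensor count, metadata KV count, then key/value pairs); it only handles the common case where `general.architecture` appears among the leading string-valued keys, which is typical since it is usually written first.

```python
import struct

GGUF_TYPE_STRING = 8  # metadata value type id for strings in the GGUF spec

def read_gguf_architecture(data: bytes) -> str:
    """Return the general.architecture string from a GGUF file's header bytes."""
    # Header: 4-byte magic, uint32 version, uint64 tensor count, uint64 KV count
    magic, version = struct.unpack_from("<4sI", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    n_tensors, n_kv = struct.unpack_from("<QQ", data, 8)
    off = 24
    for _ in range(n_kv):
        # Key: uint64 length followed by UTF-8 bytes
        (klen,) = struct.unpack_from("<Q", data, off); off += 8
        key = data[off:off + klen].decode(); off += klen
        (vtype,) = struct.unpack_from("<I", data, off); off += 4
        if vtype != GGUF_TYPE_STRING:
            break  # sketch only walks leading string-valued keys
        (vlen,) = struct.unpack_from("<Q", data, off); off += 8
        val = data[off:off + vlen].decode(); off += vlen
        if key == "general.architecture":
            return val
    raise ValueError("general.architecture not found in leading string keys")
```

For the model discussed here, a correctly quantized file should report `command-r`; a broken one would report `llama`. (In practice you can also use the `gguf` Python package that ships with llama.cpp instead of parsing by hand.)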

You mean it works with the new version of KoboldCPP? Or does it work with the older version too?

Only the new versions of KoboldCPP and llama.cpp work; they support the command-r architecture that CausalLM-35B is based on.

This repo is deprecated, as the model needs to be re-quantized. bartowski/35b-beta-long-GGUF is correct.