llava_mistral merging

#1 by jeiku - opened

Hi, I'm attempting to replicate your merge recipe as a proof of concept for a pure llava_mistral merge, but when I run your config in the linked LazyMergekit notebook, I get the following error:

```
ValueError: The checkpoint you are trying to load has model type `llava_mistral` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
```

Did you do anything special to get the merge to complete successfully?

Thanks for your time!

Try changing `architectures` in config.json to `MistralForCausalLM`; after the merge finishes, change it back to the original llava_mistral architecture.
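
For reference, here's a minimal Python sketch of that workaround. The paths are placeholders, and the exact architecture string to restore depends on what your checkpoint's config.json originally contained:

```python
import json
from pathlib import Path

def swap_architecture(model_dir: str, new_arch: str) -> list:
    """Replace the `architectures` entry in a model's config.json,
    returning the previous value so it can be restored later."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text())
    old_arch = config.get("architectures", [])
    config["architectures"] = [new_arch]
    config_path.write_text(json.dumps(config, indent=2))
    return old_arch

# Before merging: relabel the source checkpoint as a plain Mistral model
# so the merge tooling will load it. The path is a placeholder.
original = swap_architecture("./my-llava-mistral-model", "MistralForCausalLM")

# ... run the mergekit merge here ...

# After merging: write the saved architecture back into the merged
# model's config.json.
if original:
    swap_architecture("./merged-model", original[0])
```

If the error persists after this, the `model_type` field mentioned in the traceback may need the same swap-and-restore treatment.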

megaaziib changed discussion status to closed

I knew it would be something as simple as that, but I was afraid to try it. Thank you so much!
