Which Mergekit did you use for this?
#1 by softwareweaver - opened
Which Mergekit did you use for this? The standard one did not work. Thanks.
That's completely correct.
Mergekit has two issues when merging CohereForAI/c4ai-command-r-plus:
- The layers added in c4ai-command-r-plus are not supported.
- The lm_head section in cohere.json produces a model that llama.cpp rejects as unsupported.
So I wrote a patch for these issues:
--- a/mergekit/_data/architectures/cohere.json
+++ b/mergekit/_data/architectures/cohere.json
@@ -12,13 +12,6 @@
"post_weights": [
{
"name": "model.norm.weight"
- },
- {
- "name": "lm_head.weight",
- "is_embed": true,
- "aliases": [
- "model.embed_tokens.weight"
- ]
}
],
"num_layers_config_key": "num_hidden_layers",
@@ -36,9 +29,15 @@
{
"name": "model.layers.${layer_index}.mlp.up_proj.weight"
},
+ {
+ "name": "model.layers.${layer_index}.self_attn.q_norm.weight"
+ },
{
"name": "model.layers.${layer_index}.self_attn.q_proj.weight"
},
+ {
+ "name": "model.layers.${layer_index}.self_attn.k_norm.weight"
+ },
{
"name": "model.layers.${layer_index}.self_attn.k_proj.weight"
},
This is a hack, but it works fine for c4ai-command-r-plus self-merging.
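If you would rather apply the same change from a script than edit the file by hand, here is a minimal sketch of the idea. It works on an architecture definition that is already loaded as a Python dict. The function name `patch_cohere_arch` is hypothetical, and the `layer_templates` / `weights` nesting is my assumption about how cohere.json is laid out (the diff above only shows `post_weights` and the weight names), so check it against your copy of the file first.

```python
import copy

# Prefix shared by the per-layer attention weights shown in the diff.
ATTN_PREFIX = "model.layers.${layer_index}.self_attn."


def patch_cohere_arch(arch: dict) -> dict:
    """Return a patched copy of an architecture definition (hypothetical helper).

    Mirrors the two changes in the diff:
      1. drop the lm_head.weight entry from post_weights
      2. add q_norm / k_norm entries ahead of q_proj / k_proj
    """
    patched = copy.deepcopy(arch)

    # 1. Remove lm_head.weight from post_weights.
    patched["post_weights"] = [
        w for w in patched.get("post_weights", [])
        if w.get("name") != "lm_head.weight"
    ]

    # 2. Insert the norm weights. The layer_templates/weights layout is an
    #    assumption; adjust the keys if your file is structured differently.
    weights = patched["layer_templates"]["weights"]
    existing = {w["name"] for w in weights}
    result = []
    for w in weights:
        for proj, norm in (("q_proj", "q_norm"), ("k_proj", "k_norm")):
            norm_name = f"{ATTN_PREFIX}{norm}.weight"
            if w["name"] == f"{ATTN_PREFIX}{proj}.weight" and norm_name not in existing:
                result.append({"name": norm_name})
        result.append(w)
    patched["layer_templates"]["weights"] = result
    return patched


# Small stand-in for the relevant parts of cohere.json (not the real file).
example = {
    "post_weights": [
        {"name": "model.norm.weight"},
        {
            "name": "lm_head.weight",
            "is_embed": True,
            "aliases": ["model.embed_tokens.weight"],
        },
    ],
    "layer_templates": {
        "weights": [
            {"name": "model.layers.${layer_index}.mlp.up_proj.weight"},
            {"name": "model.layers.${layer_index}.self_attn.q_proj.weight"},
            {"name": "model.layers.${layer_index}.self_attn.k_proj.weight"},
        ]
    },
}

patched = patch_cohere_arch(example)
for w in patched["layer_templates"]["weights"]:
    print(w["name"])
```

To use it on the real file, load it with `json.load`, pass the dict through the function, and write the result back with `json.dump`. Running the function a second time changes nothing, because it skips norm entries that are already present.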
Thanks. I will try that. I was thinking of merging command-r with softwareweaver/Twilight-Miqu-146B.
nitky changed discussion status to closed