jan-hq committed f914883 (1 parent: d4052a1)

Update README.md

Files changed (1): README.md (+16 −6)
README.md CHANGED
````diff
@@ -30,22 +30,31 @@ The yaml config file for this model is here:
 slices:
   - sources:
       - model: viethq188/LeoScorpius-7B-Chat-DPO
-
-        layer_range: [0, 32]
+        layer_range: [0, 32]
       - model: GreenNode/GreenNodeLM-7B-v1olet
         layer_range: [0, 32]
 merge_method: slerp
 base_model: GreenNode/GreenNodeLM-7B-v1olet
 parameters:
   t:
+    - filter: lm_head
+      value: [0.55]
+    - filter: embed_tokens
+      value: [0.7]
     - filter: self_attn
-      value: [0, 0.5, 0.3, 0.7, 1]
+      value: [0.65, 0.35]
     - filter: mlp
-      value: [1, 0.5, 0.7, 0.3, 0]
-    - value: 0.5
+      value: [0.35, 0.65]
+    - filter: layernorm
+      value: [0.4, 0.6]
+    - filter: modelnorm
+      value: [0.6]
+    - value: 0.5 # fallback for rest of tensors
 dtype: bfloat16
 ```
 
+Thank you [Undi95](https://huggingface.co/Undi95) for the secret sauce and (Charles Goddard)[https://huggingface.co/chargoddard] for mergekit.
+
 # Prompt template
 
 - **ChatML**
@@ -93,7 +102,8 @@ Detailed results can be found here.
 | GSM8K (5-shot) | ? |
 
 # Acknowlegement
-- [mergekit](https://github.com/cg123/mergekit)
+- [mergekit](https://github.com/cg123/mergekit
+)
 - [DARE](https://github.com/yule-BUAA/MergeLM/blob/main/README.md)
 -
 [SLERP](https://github.com/Digitous/LLM-SLERP-Merge)
````
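For readers unfamiliar with `merge_method: slerp` in the config above: spherical linear interpolation blends two weight tensors along an arc of the hypersphere rather than a straight line, with `t` controlling the mix (`t=0` keeps the first model, `t=1` the second). The sketch below uses plain Python lists purely for illustration; mergekit itself operates on full PyTorch tensors, and `expand_gradient` is my assumption of how a gradient list like `[0.35, 0.65]` becomes per-layer `t` values, not mergekit's actual code.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation: t=0 -> v0, t=1 -> v1."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Angle between the two (normalized) weight vectors.
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)))
    omega = math.acos(dot)
    if math.sin(omega) < eps:  # nearly parallel: plain lerp is fine
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

def expand_gradient(values, n_layers):
    """Piecewise-linear expansion of a t gradient list across layers
    (illustrative assumption; see the mergekit docs for the real behavior)."""
    if len(values) == 1:
        return [values[0]] * n_layers
    out = []
    for i in range(n_layers):
        pos = i / (n_layers - 1) * (len(values) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        out.append(values[lo] + (pos - lo) * (values[hi] - values[lo]))
    return out
```

Under this reading, `value: [0.35, 0.65]` for the MLP filter would make early layers lean toward the first model and later layers toward the second, while the bare `value: 0.5` applies an equal blend to any tensor no filter matches.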