Update README.md
README.md
---
base_model:
- ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- gradientai/Llama-3-8B-Instruct-Gradient-1048k
- kromcomp/L3-Ceto-Epith-Humanity.A-v0.1-8B
- ghost-x/ghost-8b-beta-1608
- kromcomp/L3-Ceto-Epith-Humanity-v0.1-8B
- tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
- ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.1
- crestf411/L3.1-8B-sunfall-v0.6.1-dpo
- ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0
library_name: transformers
tags:
- mergekit
- merge
---

More experiments that actually work, LMAO. Started straying away from Siithamo, at least model-list-wise. Stheno is just so chatty, idk how to tame it yet. Used components from my upcoming fatboy model as parts of this merge and, imo, this is a hidden gem.

### Quants

[OG Q8 GGUF](https://huggingface.co/kromquant/L3.1-Blazed-Vulca-v0.1c-8B-GGUFs) by me.
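
If you just want the GGUF locally, here's a minimal sketch using `huggingface_hub` (the exact `.gguf` filename inside the repo isn't listed here, so the sketch pulls the whole repo):

```python
# Sketch: download the Q8 GGUF repo linked above, then pick the .gguf file
# out of the local directory afterwards.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="kromquant/L3.1-Blazed-Vulca-v0.1c-8B-GGUFs",
    local_dir="./ggufs",
)
```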

### Model Details & Recommended Settings

(Still testing; details subject to change.)

Follows instructs fairly well and doesn't stray much unless the temp is too high. Same as all the other models I make with Formax (ty ArliAI), this merge will reflect the character card quality: a shit card will give shit output, and vice versa.

Generates slightly flowery, thought-process-type writing with human-ish dialogue. Chatty but not too chatty, and will mimic previous text examples. Coherent up to 16k context (as tested).

Rec. Settings:
```
Template: L3
Temperature: 1.3
Min P: 0.1
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256-512 # stick closer to 256
```
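
As a rough translation of those settings, a sketch with `llama-cpp-python` against the Q8 GGUF (the model filename is a placeholder; the L3 chat template is read from the GGUF metadata):

```python
# Sketch: apply the recommended samplers; llama-cpp-python's
# last_n_tokens_size plays the role of "Repeat Penalty Tokens".
from llama_cpp import Llama

llm = Llama(
    model_path="L3.1-Blazed-Vulca-v0.1c-8B.Q8_0.gguf",  # placeholder name
    n_ctx=16384,             # coherent up to 16k as tested
    last_n_tokens_size=256,  # stick closer to 256
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=1.3,
    min_p=0.1,
    repeat_penalty=1.05,
)
print(out["choices"][0]["message"]["content"])
```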

### Merge Theory

Will update later, too tired rn.

### Config

```yaml
models:
- model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
  parameters:
    weight: [1, 1, 1, 1, 0, 0, 0, 0]
- model: gradientai/Llama-3-8B-Instruct-Gradient-1048k
  parameters:
    weight: [0, 0, 0, 0, 1, 1, 1, 1]
base_model: ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0
parameters:
  normalize: false
  int8_mask: true
merge_method: dare_linear
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
name: formaxext.3.1
---
models:
- model: kromcomp/L3-Ceto-Epith-Humanity.A-v0.1-8B
- model: ghost-x/ghost-8b-beta-1608
- model: kromcomp/L3-Ceto-Epith-Humanity-v0.1-8B
base_model: tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
parameters:
  normalize: false
  int8_mask: true
merge_method: model_stock
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
name: humplus
---
models:
- model: humplus
  parameters:
    weight: [0.01, 0.53, 0.9]
- model: ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.1
  parameters:
    weight: [0.55, 0.29, 0.1]
- model: crestf411/L3.1-8B-sunfall-v0.6.1-dpo
  parameters:
    weight: [0.54, 0.28, 0.1]
base_model: humplus
parameters:
  normalize: false
  int8_mask: true
merge_method: dare_linear
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
name: tusl3.1
---
models:
- model: tusl3.1
  parameters:
    weight: [0.5, 0.75, 0.8, 0.9, 0.95]
    density: 0.9
    gamma: 0.01
- model: formaxext.3.1
  parameters:
    weight: [0.5, 0.25, 0.2, 0.1, 0.05]
    density: 0.9
    gamma: 0.01
base_model: tusl3.1
tokenizer_source: union
parameters:
  normalize: false
  int8_mask: true
merge_method: breadcrumbs_ties
dtype: float32
out_dtype: bfloat16
name: mantusl3.1
```
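
To reproduce, a minimal sketch (not necessarily the exact pipeline used here): run the multi-document config above one stage at a time with mergekit's Python API, writing each intermediate merge to a directory named after its `name:` so later stages can pick it up as a local path. The `config.yaml` filename and the `name`-popping step are assumptions for illustration.

```python
# Sketch: iterate the YAML documents in order; `name` is popped because it
# labels the stage's output rather than being part of the merge config itself.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml") as f:  # the YAML above, saved verbatim
    stages = list(yaml.safe_load_all(f))

for stage in stages:
    out_dir = stage.pop("name")  # formaxext.3.1, humplus, tusl3.1, mantusl3.1
    config = MergeConfiguration.model_validate(stage)
    run_merge(config, out_dir, options=MergeOptions(cuda=True))
```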