kromeurus committed on
Commit
2d832f6
1 Parent(s): e2e425d

Update README.md

Files changed (1)
  1. README.md +92 -24
README.md CHANGED
@@ -1,50 +1,118 @@
  ---
- base_model: []
  library_name: transformers
  tags:
  - mergekit
  - merge

  ---
- # mantusl3.1

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method

- This model was merged using the breadcrumbs_ties merge method, with merge/tusl3.1 as the base.

- ### Models Merged

- The following models were included in the merge:
- * merge/formaxext.3.1

- ### Configuration

- The following YAML configuration was used to produce this model:

  ```yaml
- base_model: merge/tusl3.1
  dtype: float32
- merge_method: breadcrumbs_ties
  out_dtype: bfloat16
  parameters:
-   int8_mask: 1.0
-   normalize: 0.0
- slices:
- - sources:
-   - layer_range: [0, 32]
-     model: merge/tusl3.1
      parameters:
        density: 0.9
        gamma: 0.01
-       weight: [0.5, 0.75, 0.8, 0.9, 0.95]
-   - layer_range: [0, 32]
-     model: merge/formaxext.3.1
      parameters:
        density: 0.9
        gamma: 0.01
-       weight: [0.5, 0.25, 0.2, 0.1, 0.05]
  tokenizer_source: union
- ```
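The `density`/`gamma` pair in the config above drives the breadcrumbs sparsification of each model's parameter deltas. As a rough sketch of the assumed semantics (drop the top `gamma` fraction of deltas as outliers, then keep enough of the remainder that a `density` fraction of all entries survives), simplified to a flat list; `breadcrumbs_mask` is an illustrative helper, not mergekit's actual implementation:

```python
import math

def breadcrumbs_mask(delta, density=0.9, gamma=0.01):
    """Sketch of breadcrumbs-style sparsification (assumed semantics):
    rank deltas by magnitude, zero the top `gamma` fraction (outliers)
    and the smallest entries, keeping `density` of all entries."""
    n = len(delta)
    # Indices sorted by magnitude, largest first
    order = sorted(range(n), key=lambda i: abs(delta[i]), reverse=True)
    n_top = int(math.ceil(gamma * n))      # outliers to drop
    n_keep = int(round(density * n))       # entries that survive
    kept = set(order[n_top:n_top + n_keep])
    return [d if i in kept else 0.0 for i, d in enumerate(delta)]
```

With `density: 0.9` and `gamma: 0.01` as above, this keeps 90% of each delta tensor while discarding the 1% largest-magnitude entries before the TIES-style sign election combines the models.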
 
 
 
 
 
 
 
 
  ---
+ base_model:
+ - ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
+ - gradientai/Llama-3-8B-Instruct-Gradient-1048k
+ - kromcomp/L3-Ceto-Epith-Humanity.A-v0.1-8B
+ - ghost-x/ghost-8b-beta-1608
+ - kromcomp/L3-Ceto-Epith-Humanity-v0.1-8B
+ - tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
+ - ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.1
+ - crestf411/L3.1-8B-sunfall-v0.6.1-dpo
+ - ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0
  library_name: transformers
  tags:
  - mergekit
  - merge

  ---
+ More experiments that actually work LMAO. Started straying away from Siithamo, at least model-list wise. Stheno is just so chatty, idk how to tame it yet. Used components from my upcoming fatboy model as parts of this merge and imo, this is a hidden gem.
+ ### Quants
+
+ [OG Q8 GGUF](https://huggingface.co/kromquant/L3.1-Blazed-Vulca-v0.1c-8B-GGUFs) by me.
+
+ ### Model Details & Recommended Settings
+
+ (Still testing; details subject to change)
+
+ Follows instructs fairly well and doesn't stray much unless the temp is too high. As with all the other models I make with Formax (ty ArliAI), this merge reflects character card quality; a shit card will give shit output and vice versa.
+
+ Generates slightly flowery, thought-process type writing with human-ish dialogue. Chatty but not too chatty; will mimic previous text examples. Coherent up to 16k (as tested).
+
+ Rec. Settings:
+ ```
+ Template: L3
+ Temperature: 1.3
+ Min P: 0.1
+ Repeat Penalty: 1.05
+ Repeat Penalty Tokens: 256-512 # stick closer to 256
+ ```
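For backends that expose it (llama.cpp-style samplers), Min P keeps only the tokens whose probability is at least `min_p` times the top token's probability, after temperature scaling. A minimal sketch of how the Temperature 1.3 / Min P 0.1 pair above interacts, assuming the usual scale-then-filter order; `sample_min_p` is an illustrative helper, not any backend's API:

```python
import math
import random

def sample_min_p(logits, temperature=1.3, min_p=0.1, rng=random):
    """Temperature + min-p sampling sketch: softmax the scaled logits,
    zero out tokens below min_p * (top token's probability), then
    sample from the renormalized distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Min-P filter: cutoff scales with the model's confidence
    cutoff = min_p * max(probs)
    filtered = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(filtered)
    filtered = [p / total for p in filtered]
    return rng.choices(range(len(filtered)), weights=filtered)[0]
```

Because the cutoff tracks the top token, a high temperature like 1.3 still stays coherent: when the model is confident, unlikely tokens get pruned; when the distribution is flat, more candidates survive.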
+
+ ### Merge Theory
+
+ Will update later, too tired rn.
+
+ ### Config
+
  ```yaml
+ models:
+ - model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
+   parameters:
+     weight: [1, 1, 1, 1, 0, 0, 0, 0]
+ - model: gradientai/Llama-3-8B-Instruct-Gradient-1048k
+   parameters:
+     weight: [0, 0, 0, 0, 1, 1, 1, 1]
+ base_model: ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0
+ parameters:
+   normalize: false
+   int8_mask: true
+ merge_method: dare_linear
  dtype: float32
  out_dtype: bfloat16
+ tokenizer_source: base
+ name: formaxext.3.1
+ ---
+ models:
+ - model: kromcomp/L3-Ceto-Epith-Humanity.A-v0.1-8B
+ - model: ghost-x/ghost-8b-beta-1608
+ - model: kromcomp/L3-Ceto-Epith-Humanity-v0.1-8B
+ base_model: tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
+ parameters:
+   normalize: false
+   int8_mask: true
+ merge_method: model_stock
+ dtype: float32
+ out_dtype: bfloat16
+ tokenizer_source: base
+ name: humplus
+ ---
+ models:
+ - model: humplus
+   parameters:
+     weight: [0.01, 0.53, 0.9]
+ - model: ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.1
+   parameters:
+     weight: [0.55, 0.29, 0.1]
+ - model: crestf411/L3.1-8B-sunfall-v0.6.1-dpo
+   parameters:
+     weight: [0.54, 0.28, 0.1]
+ base_model: humplus
  parameters:
+   normalize: false
+   int8_mask: true
+ merge_method: dare_linear
+ dtype: float32
+ out_dtype: bfloat16
+ tokenizer_source: base
+ name: tusl3.1
+ ---
+ models:
+ - model: tusl3.1
    parameters:
+     weight: [0.5, 0.75, 0.8, 0.9, 0.95]
      density: 0.9
      gamma: 0.01
+ - model: formaxext.3.1
    parameters:
+     weight: [0.5, 0.25, 0.2, 0.1, 0.05]
      density: 0.9
      gamma: 0.01
+ base_model: tusl3.1
  tokenizer_source: union
+ parameters:
+   normalize: false
+   int8_mask: true
+ merge_method: breadcrumbs_ties
+ dtype: float32
+ out_dtype: bfloat16
+ name: mantusl3.1
+ ```
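The five-element `weight` lists in the final breadcrumbs_ties stage are gradients: mergekit spreads the anchor values across the layer stack, so tusl3.1 dominates the later layers while formaxext.3.1 fades out, and the two lists sum to 1 at every anchor. A sketch of the assumed piecewise-linear interpolation; `layer_weights` is an illustrative helper, not mergekit's actual code:

```python
def layer_weights(anchors, n_layers):
    """Spread a mergekit-style weight gradient over n_layers layers by
    piecewise-linear interpolation between evenly spaced anchor values
    (assumed behavior, simplified)."""
    if len(anchors) == 1:
        return [float(anchors[0])] * n_layers
    out = []
    for i in range(n_layers):
        # Position of layer i along the anchor list, in [0, len(anchors)-1]
        pos = i / (n_layers - 1) * (len(anchors) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        out.append(anchors[lo] * (1 - frac) + anchors[hi] * frac)
    return out

# The two gradients from the final stage, stretched over 32 layers
tusl = layer_weights([0.5, 0.75, 0.8, 0.9, 0.95], 32)
formax = layer_weights([0.5, 0.25, 0.2, 0.1, 0.05], 32)
```

Because the pair sums to 1 at every layer, the merge keeps a consistent overall scale even with `normalize: false`.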