---
base_model:
- v000000/MN-12B-Part1
- v000000/MN-12B-Part2
library_name: transformers
tags:
- mergekit
- merge
- mistral
---

> [!WARNING]
> **Temperature:**<br>
> Mistral Nemo works best at a low temperature, roughly 0.3-0.5 (see the usage sketch below).

Mistral-Nemo-12B-Estrella-v1
---------------------------------------------------------------------

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/MyveknmJhuj43YrukIDAU.png)

RP model. Seems coherent and concise, but also creative. A big merge using the new DELLA technique.

<b>Prompt format: Mistral Instruct or ChatML.</b>
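
A minimal `transformers` sketch, assuming the full-precision weights live under the repo id `v000000/MN-12B-Estrella-v1` (an assumption; substitute the actual repository). It uses the bundled chat template for Mistral Instruct-style prompting and the low temperature recommended above:

```python
# Usage sketch -- the repo id is an assumption, adjust to the actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "v000000/MN-12B-Estrella-v1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The chat template renders Mistral Instruct format ([INST] ... [/INST]).
messages = [{"role": "user", "content": "Write a short scene set on a night train."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.4,  # the card recommends 0.3-0.5
    top_p=0.9,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```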

# <b>Quants</b>
* [Q6_K GGUF](https://huggingface.co/v000000/MN-12B-Estrella-v1-Q6_K-GGUF)
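
For the GGUF quant, a `llama-cpp-python` sketch; the `.gguf` filename pattern below is an assumption, so check the quant repo's file list:

```python
# GGUF usage sketch with llama-cpp-python -- the filename pattern is assumed.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="v000000/MN-12B-Estrella-v1-Q6_K-GGUF",
    filename="*q6_k.gguf",  # assumed filename pattern
    n_ctx=8192,
    n_gpu_layers=-1,  # offload all layers when a GPU is available
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short scene set on a night train."}],
    temperature=0.4,  # recommended 0.3-0.5
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```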


----------------------------------------------------------------------
## Merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged in multiple steps using the <b>DELLA</b>, <b>DELLA_LINEAR</b>, and <b>SLERP</b> merge algorithms.

### Models Merged

The following models were included in the merge:
* [nothingiisreal/MN-12B-Celeste-V1.9](https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9)
* [shuttleai/shuttle-2.5-mini](https://huggingface.co/shuttleai/shuttle-2.5-mini)
* [anthracite-org/magnum-12b-v2](https://huggingface.co/anthracite-org/magnum-12b-v2)
* [Sao10K/MN-12B-Lyra-v1](https://huggingface.co/Sao10K/MN-12B-Lyra-v1)
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
* [NeverSleep/Lumimaid-v0.2-12B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B)
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5)
* [BeaverAI/mistral-doryV2-12b](https://huggingface.co/BeaverAI/mistral-doryV2-12b)
* [invisietch/Atlantis-v0.1-12B](https://huggingface.co/invisietch/Atlantis-v0.1-12B)

### Configuration

The following YAML configuration was used to produce this model, one document per merge step:

```yaml
#Step 1 (Part1)
models:
  - model: Sao10K/MN-12B-Lyra-v1
    parameters:
      weight: 0.15
      density: 0.77
  - model: shuttleai/shuttle-2.5-mini
    parameters:
      weight: 0.20
      density: 0.78
  - model: anthracite-org/magnum-12b-v2
    parameters:
      weight: 0.35
      density: 0.85
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      weight: 0.55
      density: 0.90
merge_method: della
base_model: Sao10K/MN-12B-Lyra-v1
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
---
#Step 2 (Part2)
models:
  - model: BeaverAI/mistral-doryV2-12b
    parameters:
      weight: 0.10
      density: 0.4
  - model: unsloth/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.20
      density: 0.4
  - model: UsernameJustAnother/Nemo-12B-Marlin-v5
    parameters:
      weight: 0.25
      density: 0.5
  - model: invisietch/Atlantis-v0.1-12B
    parameters:
      weight: 0.3
      density: 0.5
  - model: NeverSleep/Lumimaid-v0.2-12B
    parameters:
      weight: 0.4
      density: 0.8
merge_method: della_linear
base_model: anthracite-org/magnum-12b-v2
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
---
#Step 3 (Estrella)
slices:
  - sources:
      - model: v000000/MN-12B-Part2
        layer_range: [0, 40]
      - model: v000000/MN-12B-Part1
        layer_range: [0, 40]
merge_method: slerp
base_model: v000000/MN-12B-Part1
parameters: # smooth gradient, prioritizing Part1
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 0.6, 0.1, 0.6, 0.3, 0.8, 0.5]
    - filter: mlp
      value: [0, 0.5, 0.4, 0.3, 0, 0.3, 0.4, 0.7, 0.2, 0.5]
    - value: 0.5
dtype: bfloat16
```
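
Each step above is a separate mergekit run: Steps 1 and 2 produce the intermediate Part1 and Part2 models, and Step 3 SLERPs them together. A rough sketch of driving a single step through mergekit's Python API as documented in its README (file paths and option values here are placeholders; the `mergekit-yaml` CLI is equivalent):

```python
# Sketch of one merge step via mergekit's Python API -- paths are placeholders.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Save each step's YAML to its own file and run the steps in order.
with open("step1_part1.yaml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    "./MN-12B-Part1",  # output directory for this step
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
        low_cpu_memory=False,
    ),
)
```

Steps 2 and 3 would be run the same way against their own YAML files, with the Step 3 config pointing at the two intermediate outputs.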