File size: 3,641 Bytes
9418fc8 9078482 9418fc8 15f4a19 e77f8ef 15f4a19 c5087e1 15f4a19 2133207 126be80 33c7602 126be80 6e7708d 8f664df 8a12658 07e1f4f 8b7a6dc 07e1f4f 9418fc8 7e918b2 9418fc8 4665775 9418fc8 e7ac0f7 4665775 e7ac0f7 4665775 e7ac0f7 9418fc8 decbab1 9418fc8 9078482 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 |
---
base_model:
- v000000/MN-12B-Part1
- v000000/MN-12B-Part2
library_name: transformers
tags:
- mergekit
- merge
- mistral
---
> [!WARNING]
> **Temperature:**<br>
> Mistral Nemo likes low temperature between 0.3-0.5
Mistral-Nemo-12B-Estrella-v1
---------------------------------------------------------------------
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/MyveknmJhuj43YrukIDAU.png)
RP Model. Seems coherent and concise but also creative. Big merge using new DELLA technique.
<b>Prompt Format: Mistral Instruct / ChatML format.</b>
# <b>Quants</b>
* [Q6_K GGUF](https://huggingface.co/v000000/MN-12B-Estrella-v1-Q6_K-GGUF)
----------------------------------------------------------------------
## merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged with a multi-step method using the <b>DELLA</b>, <b>DELLA_LINEAR</b> and <b>SLERP</b> merge algorithms.
### Models Merged
The following models were included in the merge:
* [nothingiisreal/MN-12B-Celeste-V1.9](https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9)
* [shuttleai/shuttle-2.5-mini](https://huggingface.co/shuttleai/shuttle-2.5-mini)
* [anthracite-org/magnum-12b-v2](https://huggingface.co/anthracite-org/magnum-12b-v2)
* [Sao10K/MN-12B-Lyra-v1](https://huggingface.co/Sao10K/MN-12B-Lyra-v1)
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
* [NeverSleep/Lumimaid-v0.2-12B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B)
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5)
* [BeaverAI/mistral-doryV2-12b](https://huggingface.co/BeaverAI/mistral-doryV2-12b)
* [invisietch/Atlantis-v0.1-12B](https://huggingface.co/invisietch/Atlantis-v0.1-12B)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
#Step 1 (Part1)
models:
- model: Sao10K/MN-12B-Lyra-v1
parameters:
weight: 0.15
density: 0.77
- model: shuttleai/shuttle-2.5-mini
parameters:
weight: 0.20
density: 0.78
- model: anthracite-org/magnum-12b-v2
parameters:
weight: 0.35
density: 0.85
- model: nothingiisreal/MN-12B-Celeste-V1.9
parameters:
weight: 0.55
density: 0.90
merge_method: della
base_model: Sao10K/MN-12B-Lyra-v1
parameters:
int8_mask: true
epsilon: 0.05
lambda: 1
dtype: bfloat16
#Step 2 (Part2)
models:
- model: BeaverAI/mistral-doryV2-12b
parameters:
weight: 0.10
density: 0.4
- model: unsloth/Mistral-Nemo-Instruct-2407
parameters:
weight: 0.20
density: 0.4
- model: UsernameJustAnother/Nemo-12B-Marlin-v5
parameters:
weight: 0.25
density: 0.5
- model: invisietch/Atlantis-v0.1-12B
parameters:
weight: 0.3
density: 0.5
- model: NeverSleep/Lumimaid-v0.2-12B
parameters:
weight: 0.4
density: 0.8
merge_method: della_linear
base_model: anthracite-org/magnum-12b-v2
parameters:
int8_mask: true
epsilon: 0.05
lambda: 1
dtype: bfloat16
#Step 3 (Estrella)
slices:
- sources:
- model: v000000/MN-12B-Part2
layer_range: [0, 40]
- model: v000000/MN-12B-Part1
layer_range: [0, 40]
merge_method: slerp
base_model: v000000/MN-12B-Part1
parameters: #smooth gradient prio part1
t:
- filter: self_attn
value: [0, 0.5, 0.3, 0.7, 0.6, 0.1, 0.6, 0.3, 0.8, 0.5]
- filter: mlp
value: [0, 0.5, 0.4, 0.3, 0, 0.3, 0.4, 0.7, 0.2, 0.5]
- value: 0.5
dtype: bfloat16
``` |