---
base_model:
- SanjiWatsuki/Kunoichi-DPO-v2-7B
- SanjiWatsuki/Kunoichi-7B
library_name: transformers
tags:
- mergekit
- merge
license: cc-by-nc-4.0
---
# franken-kunoichi-IDUS-11B
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
The Interwoven Depth Up-Scaling merge formula was adapted from [Sanji Watsuki's longcat-10.7B](https://huggingface.co/SanjiWatsuki/longcat-10.7B).
I consider this a negative result, though perhaps an interesting one. I tested casually at temperature 0.7-1.2 and min-P 0.01-0.03, with both Alpaca and ChatML prompts. The generated text is interesting for RP thanks to its chaotic variation, and it mostly stays grammatically correct, but it easily veers too far into chaos (e.g., abruptly switching language) and has difficulty tracking details. Given that, the inherited 8K context length is of dubious benefit.
Additional training might smooth the transition between models, but that hypothesis is untested.
## Merge Details
### Merge Method
This model was merged using the passthrough merge method.
### Models Merged
The following models were included in the merge:
* [SanjiWatsuki/Kunoichi-DPO-v2-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-DPO-v2-7B)
* [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
slices:
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [0, 8]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [8, 9]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [8, 9]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [9, 10]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [9, 10]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [10, 11]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [10, 11]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [11, 12]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [11, 12]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [12, 13]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [12, 13]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [13, 14]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [13, 14]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [14, 15]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [14, 15]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [15, 16]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [15, 16]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [16, 17]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [16, 17]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [17, 18]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [17, 18]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [18, 19]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [18, 19]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [19, 20]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [19, 20]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [20, 21]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [20, 21]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [21, 22]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [21, 22]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [22, 23]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [22, 23]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [23, 24]
- sources:
  - model: SanjiWatsuki/Kunoichi-7B
    layer_range: [23, 24]
- sources:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    layer_range: [24, 32]
merge_method: passthrough
dtype: float16
```
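The slice list above follows a regular pattern: the first 8 and last 8 layers each come from a single model, while layers 8-23 are duplicated one at a time, alternating between the DPO-v2 and base variants. A minimal sketch that reproduces the schedule (the helper function is hypothetical, not part of mergekit; model names match the card):

```python
# Generate the interwoven depth up-scaling slice schedule used above.
K7B = "SanjiWatsuki/Kunoichi-7B"
DPO = "SanjiWatsuki/Kunoichi-DPO-v2-7B"

def interwoven_slices():
    # Opening block: layers 0-7 from the base Kunoichi-7B.
    slices = [{"model": K7B, "layer_range": [0, 8]}]
    # Middle: layers 8-23 interleaved one at a time, DPO variant first.
    for k in range(8, 24):
        slices.append({"model": DPO, "layer_range": [k, k + 1]})
        slices.append({"model": K7B, "layer_range": [k, k + 1]})
    # Closing block: layers 24-31 from the DPO variant.
    slices.append({"model": DPO, "layer_range": [24, 32]})
    return slices

slices = interwoven_slices()
total_layers = sum(b - a for s in slices for a, b in [s["layer_range"]])
print(len(slices), total_layers)  # 34 slices, 48 layers
```

The result is 48 transformer layers versus 32 in each source model, which accounts for the roughly 11B parameter count.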