|
---
base_model:
- ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- Sao10K/L3.1-8B-Niitama-v1.1
- Sao10K/L3-8B-Tamamo-v1
- Sao10K/L3-8B-Stheno-v3.3-32K
- Edgerunners/Lyraea-large-llama-3.1
library_name: transformers
tags:
- mergekit
- merge
---
|
My first foray into Llama 3.1; just having fun with the merging process and testing theories.
|
|
|
Updated version with higher context [here](https://huggingface.co/kromeurus/L3.1-Siithamo-v0.2-8B). |
|
|
|
### Quants |
|
|
|
[OG Q8 GGUF](https://huggingface.co/kromquant/L3.1-Siithamo-v0.1-8B-Q8-GGUF) by me. |
|
|
|
### Details & Recommended Settings |
|
|
|
Unfortunately, this model still double-lines, though not as often. Dramatic as fuck at times. I haven't tested the context limit yet, but I'm sure it suffered somehow.
|
|
|
Outputs a lot; pretty chatty, like Stheno. Pulls some chaotic creativity from Niitama, but it's mellowed out by Tamamo. A little cliché in its writing, but it's almost endearing in a way.

Should follow instructions fine. It's a little stunted compared to the original model, but I don't think that's a negative.
|
|
|
4K max context, even on L3.1 (DAMN U FORMAX).
|
|
|
Rec. Settings:
```
Template: L3
Temperature: 1.35
Min P: 0.1
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256
```
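
As a quick sanity check, here's a minimal sketch of those settings wired into llama-cpp-python against the Q8 GGUF above. The filename and prompt are placeholders, and "Template: L3" is just the standard Llama 3 instruct headers written out by hand:

```python
# Minimal sketch, assuming llama-cpp-python; the GGUF filename is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./L3.1-Siithamo-v0.1-8B-Q8_0.gguf",  # placeholder path to the Q8 quant
    n_ctx=4096,              # 4K max context, per the note above
    last_n_tokens_size=256,  # repeat-penalty window (Repeat Penalty Tokens: 256)
)

# "Template: L3" = standard Llama 3 instruct headers.
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are {{char}}, roleplaying with {{user}}.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Hey, introduce yourself.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

out = llm.create_completion(
    prompt,
    max_tokens=512,
    temperature=1.35,
    min_p=0.1,
    repeat_penalty=1.05,
    stop=["<|eot_id|>"],
)
print(out["choices"][0]["text"])
```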
|
|
|
### Models Merged & Merge Theory |
|
|
|
The following models were included in the merge: |
|
* [Edgerunners/Lyraea-large-llama-3.1](https://huggingface.co/Edgerunners/Lyraea-large-llama-3.1) |
|
* [Sao10K/L3-8B-Stheno-v3.3-32K](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.3-32K) |
|
* [Sao10K/L3.1-8B-Niitama-v1.1](https://huggingface.co/Sao10K/L3.1-8B-Niitama-v1.1) |
|
* [Sao10K/L3-8B-Tamamo-v1](https://huggingface.co/Sao10K/L3-8B-Tamamo-v1) |
|
* [ArliAI/ArliAI-Llama-3-8B-Formax-v1.0](https://huggingface.co/ArliAI/ArliAI-Llama-3-8B-Formax-v1.0) |
|
|
|
Using Edgerunners' Lyraea as the 3.1 base, model stock merged L3.1 Niitama, Stheno 3.3, and Tamamo atop each other. Then, to curb L3 tendencies and add some instruction-following capability, added some Formax in a DARE linear merge; its weight gradient tapers from 0.5 at the early layers to 0.1 at the final layers while the intermediate merge ramps up to full strength. At least when updating L3 to L3.1, doing TIES anything results in a 'shittier' model.
|
|
|
### Config |
|
|
|
```yaml
models:
  - model: Sao10K/L3.1-8B-Niitama-v1.1
  - model: Sao10K/L3-8B-Stheno-v3.3-32K
  - model: Sao10K/L3-8B-Tamamo-v1
base_model: Edgerunners/Lyraea-large-llama-3.1
parameters:
  normalize: false
  int8_mask: true
merge_method: model_stock
dtype: float32
out_dtype: bfloat16
name: siitamol3.1
---
models:
  - model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
    parameters:
      weight: [0.5, 0.3, 0.2, 0.1]
  - model: siitamol3.1
    parameters:
      weight: [0.5, 0.7, 0.8, 1]
base_model: siitamol3.1
parameters:
  normalize: false
  int8_mask: true
merge_method: dare_linear
dtype: float32
out_dtype: bfloat16
```
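
To reproduce the merge, this multi-document config should run as-is through mergekit's standard entrypoint (the output path below is just a placeholder):

```
mergekit-yaml config.yaml ./L3.1-Siithamo-v0.1-8B --cuda
```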
|
|