My first foray into Llama 3.1 and just having fun with the merging process. Testing theories and such.
Updated version with higher context here.
Quants
OG Q8 GGUF by me.
Details & Recommended Settings
Unfortunaely, this model still double lines but its not as often. Dramatic as fuck at times. I haven't tested the context limit yet but I'm sure it suffered somehow.
Outputs a lot, pretty chatty like Stheno. Pulls some chaotic creativity from Niitama but its mellowed out with Tamamo. A little cliche writing, but it's almost endearing in a way. Should follow instructs fine? Stunted a little compared to the original model, don't think that's a negative though.
4K Max context even on L3.1 (DAMN U FORMAX)
Rec. Settings:
Template: L3
Temperature: 1.35
Min P: 0.1
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256
Models Merged & Merge Theory
The following models were included in the merge:
- Edgerunners/Lyraea-large-llama-3.1
- Sao10K/L3-8B-Stheno-v3.3-32K
- Sao10K/L3.1-8B-Niitama-v1.1
- Sao10K/L3-8B-Tamamo-v1
- ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
Using Edgerunners Lyraea as the 3.1 base, model stock mereged L3.1 Niitama, Stheno 3.3, and Tamamo a top each other. Then trying to curb L3 tendencies and add some instruct following capabilities, added some Formax in a dare linear merge. At least for updating L3 to L3.1, doing TIES anything results in a 'shittier' model.
Config
models:
- model: Sao10K/L3.1-8B-Niitama-v1.1
- model: Sao10K/L3-8B-Stheno-v3.3-32K
- model: Sao10K/L3-8B-Tamamo-v1
base_model: Edgerunners/Lyraea-large-llama-3.1
parameters:
normalize: false
int8_mask: true
merge_method: model_stock
dtype: float32
out_dtype: bfloat16
name: siitamol3.1
---
models:
- model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
parameters:
weight: [0.5, 0.3, 0.2, 0.1]
- model: siitamol3.1
parameters:
weight: [0.5, 0.7, 0.8, 1]
base_model: siitamol3.1
parameters:
normalize: false
int8_mask: true
merge_method: dare_linear
dtype: float32
out_dtype: bfloat16
- Downloads last month
- 13