--- base_model: - ArliAI/ArliAI-Llama-3-8B-Formax-v1.0 - gradientai/Llama-3-8B-Instruct-Gradient-1048k - ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0 - Sao10K/L3.1-8B-Niitama-v1.1 - Sao10K/L3-8B-Stheno-v3.3-32K - tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b - Sao10K/L3-8B-Tamamo-v1 - Edgerunners/Lyraea-large-llama-3.1 library_name: transformers tags: - mergekit - merge --- HUZZAH, a model that's actually good! Just took seven tries. Fixed spacial understanding and literacy, toned down a little of the clingy instruct following, and more turn based RP forward. ### Quants [A few GGUFs](https://huggingface.co/kromquant/L3.1-Sithamo-v0.4-8B-GGUFs) by me. ### Details & Recommended Settings (Still testing; details subject to change) Sticks to instructs well, dynamic writing, roleplay focused generations, and more solid intelligence. Less rambley though still outputs a bit of text. Has near perfect recall up to 32K. Be clear and explicit with model instructs, including the intended format (Asterix, quotes, etc). Rec. Settings: ``` Template: L3 Temperature: 1.3 Min P: 0.1 Repeat Penalty: 1.05 Repeat Penalty Tokens: 256 Dyn Temp: 0.9-1.05 at 0.1 Smooth Sampl: 0.18 ``` Rec. Model Instructs: ``` ### Instruction: {character} continues the text of a never ending slow-burn role-play. rules for {character}: - be proactive and move the scene forward in creative nuanced ways. - write actions in the third-person past-tense. - avoid speaking on {user}'s behalf. - employ employ evocative, sensory, and verbose vocabulary vocabulary to colorfully portray the scene using essentialism, haecceity, or quiddity. ``` ### Merge Theory This sucked. Repalce RP Hermes back with Edgerunners Lyraea and swapped Niitama with L3.1 Niitama. ### Config ```yaml slices: - sources: - layer_range: [0, 16] model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0 - sources: - layer_range: [16, 32] model: gradientai/Llama-3-8B-Instruct-Gradient-1048k parameters: int8_mask: true merge_method: passthrough dtype: float32 out_dtype: bfloat16 name: formax.ext --- models: - model: formax.ext parameters: weight: 1.1 base_model: ArliAI/Llama-3.1-8B-ArliAI-Formax-v1.0 parameters: normalize: false int8_mask: true merge_method: dare_linear dtype: float32 out_dtype: bfloat16 tokenizer_source: base name: formaxext.3.1 --- models: - model: Sao10K/L3.1-8B-Niitama-v1.1 parameters: weight: 0.5 - model: Sao10K/L3-8B-Stheno-v3.3-32K parameters: weight: 0.6 base_model: tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b parameters: normalize: false int8_mask: true merge_method: dare_linear dtype: float32 out_dtype: bfloat16 tokenizer_source: base name: siith.3.1 --- models: - model: siith.3.1 - model: Sao10K/L3-8B-Tamamo-v1 base_model: Edgerunners/Lyraea-large-llama-3.1 parameters: normalize: false int8_mask: true merge_method: model_stock dtype: float32 out_dtype: bfloat16 name: siithamol3.1 --- models: - model: siithamol3.1 parameters: weight: [0.5, 0,8, 0.8, 0.9, 1] density: 0.9 gamma: 0.01 - model: formaxext.3.1 parameters: weight: [0.5, 0.2, 0.2, 0.1, 0] density: 0.9 gamma: 0.01 base_model: siithamol3.1 parameters: normalize: false int8_mask: true merge_method: breadcrumbs_ties dtype: float32 out_dtype: bfloat16 name: siithamov3 ```