kromvault
/

L3.1-Siithamo-v0.2-8B

	---
	base_model:
	- ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
	- Sao10K/L3.1-8B-Niitama-v1.1
	- Sao10K/L3-8B-Tamamo-v1
	- Sao10K/L3-8B-Stheno-v3.3-32K
	- Edgerunners/Lyraea-large-llama-3.1
	- gradientai/Llama-3-8B-Instruct-Gradient-1048k
	library_name: transformers
	tags:
	- mergekit
	- merge
	---
	Second (third) time's the charm. After fighting with Formax trying to increase its max context to something that isn't 4k, spat out this merge as a result. Still maintains a
	lot of v0.1's properties: creativity, literacy, and chattiness. Knowing everything I've learned making this, time to dive headfirst into making an L3.1 space whale.

	I stg LLMs are testing me.

	### Quants

	[OG Q8 GGUF](https://huggingface.co/kromquant/L3.1-Siithamo-v0.2b-8B-Q8-GGUF) by me.

	### Details & Recommended Settings

	(Still testing; details subject to change)

	Outputs a lot, pretty chatty like Stheno. Pulls some chaotic creativity from Niitama, but it's mellowed out with Tamamo. A little cliche in its writing, but it's almost endearing in a way.
	Sticks to instructions fairly well and at times changes to match {user}'s input in length and verbosity. Well balanced in all RP uses.

	I've tested this model up to 8-9k context without any repetition, but idk what the true context limit of this model is yet.

	Rec. Settings:
	```
	Template: L3
	Temperature: 1.4
	Min P: 0.1
	Repeat Penalty: 1.05
	Repeat Penalty Tokens: 256
	```
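For anyone unsure what Temperature and Min P actually do to the token distribution, here is a minimal sketch in plain Python. It is not any backend's real sampler: the logits are made up, and it assumes temperature is applied before Min P, which differs between backends (some apply temperature last). Repeat penalty is left out.

```python
import math

def min_p_filter(logits, temperature=1.4, min_p=0.1):
    """Return renormalised probabilities after temperature scaling and Min P filtering.

    Assumption: temperature is applied first, then Min P. Backends differ on this order.
    """
    # Temperature scaling: values above 1 flatten the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Min P: drop every token whose probability is below min_p * (top probability).
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]

# Hypothetical logits for four candidate tokens.
toy_logits = [5.0, 4.0, 2.0, 0.0]
print(min_p_filter(toy_logits))
```

With these toy values the weakest candidate is cut and the other three stay in play, which is roughly why a high temperature like 1.4 stays usable when paired with Min P.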

	### Models Merged & Merge Theory

	The following models were included in the merge:
	* [Edgerunners/Lyraea-large-llama-3.1](https://huggingface.co/Edgerunners/Lyraea-large-llama-3.1)
	* [Sao10K/L3-8B-Stheno-v3.3-32K](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.3-32K)
	* [Sao10K/L3.1-8B-Niitama-v1.1](https://huggingface.co/Sao10K/L3.1-8B-Niitama-v1.1)
	* [Sao10K/L3-8B-Tamamo-v1](https://huggingface.co/Sao10K/L3-8B-Tamamo-v1)
	* [ArliAI/ArliAI-Llama-3-8B-Formax-v1.0](https://huggingface.co/ArliAI/ArliAI-Llama-3-8B-Formax-v1.0)
	* [gradientai/Llama-3-8B-Instruct-Gradient-1048k](https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k)

	Compared to v0.1, the siithamol3.1 part stayed the same. To 'increase' the context of Formax, just chopped off the latter half and replaced it with a ~1M context model and that
	seemed to do the trick (after doing a bunch of other shit, this was the simplest and easiest route). Then, changed from dare_linear to breadcrumbs for the final merge, which gave a
	better output without the hassle. Again, TIES anything didn't work nearly as well.
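As a rough illustration of what breadcrumbs does to a task vector (the differences between a model and the base), here is a toy sketch in plain Python. It follows the usual description of the method: drop the `gamma` fraction of largest-magnitude differences as outliers, then drop the smallest ones until `density` of the entries remain. The function name and toy data are hypothetical, and this is not mergekit's actual code.

```python
def breadcrumbs_mask(delta, density=0.9, gamma=0.01):
    """Zero the largest `gamma` fraction of entries and enough of the smallest
    entries that a `density` fraction of the entries survives."""
    n = len(delta)
    n_keep = int(density * n)          # entries that survive
    n_top = int(gamma * n)             # large outliers to drop
    n_bottom = n - n_keep - n_top      # small entries to drop
    # Indices sorted by magnitude, smallest first.
    order = sorted(range(n), key=lambda i: abs(delta[i]))
    dropped = set(order[:n_bottom])
    if n_top:
        dropped |= set(order[n - n_top:])
    return [0.0 if i in dropped else d for i, d in enumerate(delta)]

# Toy task vector: 100 made-up differences with alternating signs.
toy_delta = [(-1) ** i * (i + 1) / 100 for i in range(100)]
sparse = breadcrumbs_mask(toy_delta)
print(sum(1 for d in sparse if d != 0.0), "of", len(toy_delta), "entries kept")
```

The surviving differences are then scaled by each model's `weight` and added back onto the base model.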

	### Config

	```yaml
	slices:
	- sources:
	  - layer_range: [0, 16]
	    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
	- sources:
	  - layer_range: [16, 32]
	    model: gradientai/Llama-3-8B-Instruct-Gradient-1048k
	parameters:
	  int8_mask: true
	merge_method: passthrough
	dtype: float32
	out_dtype: bfloat16
	name: formax.ext
	---
	models:
	  - model: Sao10K/L3.1-8B-Niitama-v1.1
	  - model: Sao10K/L3-8B-Stheno-v3.3-32K
	  - model: Sao10K/L3-8B-Tamamo-v1
	base_model: Edgerunners/Lyraea-large-llama-3.1
	parameters:
	  normalize: false
	  int8_mask: true
	merge_method: model_stock
	dtype: float32
	out_dtype: bfloat16
	name: siithamol3.1
	---
	models:
	  - model: siithamol3.1
	    parameters:
	      weight: [0.5, 0.8, 0.9, 1]
	      density: 0.9
	      gamma: 0.01
	  - model: formax.ext
	    parameters:
	      weight: [0.5, 0.2, 0.1, 0]
	      density: 0.9
	      gamma: 0.01
	base_model: siithamol3.1
	parameters:
	  normalize: false
	  int8_mask: true
	merge_method: breadcrumbs
	dtype: float32
	out_dtype: bfloat16
	```
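The list-valued `weight` entries in the final merge are gradients: mergekit spreads them across the layers instead of using one number everywhere. A small sketch of how that plays out over the 32 layers, assuming plain linear interpolation between the listed anchor values (the helper name and the exact interpolation are assumptions, not mergekit's code):

```python
def layer_weight(gradient, layer, num_layers=32):
    """Linearly interpolate a list-valued parameter at a given layer index.

    Assumption: layer position is normalised to 0..1 and the anchors are evenly spaced.
    """
    t = layer / (num_layers - 1)
    pos = t * (len(gradient) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(gradient) - 1)
    frac = pos - lo
    return (1 - frac) * gradient[lo] + frac * gradient[hi]

siithamo_w = [0.5, 0.8, 0.9, 1]   # weight list for siithamol3.1 from the config
formax_w = [0.5, 0.2, 0.1, 0]     # weight list for formax.ext from the config

for layer in (0, 8, 16, 24, 31):
    print(layer, round(layer_weight(siithamo_w, layer), 3), round(layer_weight(formax_w, layer), 3))
```

Under that assumption the two weights add up to 1 at every layer, with formax.ext contributing half at the first layer and fading to nothing by the last.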