Darkknight535
/

Moonlight-L3-15B-v2.5-64k

Text Generation

Not-For-All-Audiences

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Moonlight-L3-15B-v2.5-64k / README.md

Darkknight535's picture

Update README.md

8b5d6a6 verified 21 days ago

|

history blame contribute delete

2.71 kB

	---
	base_model:
	- Darkknight535/Moonlight-L3-15B-v2-64k
	- Tremontaine/L3-Lunaris-v1-15B
	library_name: transformers
	tags:
	- mergekit
	- merge
	- not-for-all-audiences
	---
	### MoonLight-L3-15B-V2.5-64K
	---
	![Waifu](https://huggingface.co/Darkknight535/Moonlight-L3-15B-v2-64k/resolve/main/waifu.png)
	---
	- Enhancements:
	- Merging: Merge with lunaris to fix some issues. Note : This model is experimental so feedback is needed.
	---


	### Quants (Thanks to mradermacher)
	[Imatrix](https://huggingface.co/mradermacher/Moonlight-L3-15B-v2.5-64k-i1-GGUF)
	[Normal](https://huggingface.co/mradermacher/Moonlight-L3-15B-v2.5-64k-GGUF)
	### Instruct Template
	Default llama3 instruct and context preset, but here is the one i use.
	[Instruct](https://huggingface.co/Darkknight535/OpenCrystal-12B-L3/blob/main/Llama%203%20%5BInstruct%5D.json)
	[Context](https://huggingface.co/Darkknight535/OpenCrystal-12B-L3/blob/main/Llama%203%20%5BContext%5D.json)

	### Samplers


	## Creative
	```
	Temp : 1.2
	Min P : 0.01

	Repetition Penalty : 1.05
	OR
	Dry 0.7, 1.75, 2,0
	XTC 0.1, 0.5

	[And everything else neutral]
	```

	## Normal(Precise)
	```
	Temp : 0.6 - 0.8
	Min P : 0.1

	Repetition Penalty : 1.1
	OR
	Dry 1, 1.75, 2,0
	XTC 0.1, 0.2

	[And everything else neutral]
	```



	### System Prompt
	```
	Develop the plot slowly, always stay in character. Describe all actions in full, elaborate, explicit, graphic, and vivid detail. Mention all relevant sensory perceptions. Keep the story immersive and engaging. Speak as other person when needed and prefix with the name of person you're speaking as except {{user}}.
	```

	### FeedBack
	[FeedBack here](https://huggingface.co/Darkknight535/Moonlight-L3-15B-v2.5-64k/discussions/1)


	# merge

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged using the SLERP merge method.

	### Models Merged

	The following models were included in the merge:
	* [Darkknight535/Moonlight-L3-15B-v2-64k](https://huggingface.co/Darkknight535/Moonlight-L3-15B-v2-64k)
	* [Tremontaine/L3-Lunaris-v1-15B](https://huggingface.co/Tremontaine/L3-Lunaris-v1-15B)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	slices:
	- sources:
	- model: Darkknight535/Moonlight-L3-15B-v2-64k
	layer_range: [0, 64]
	- model: Tremontaine/L3-Lunaris-v1-15B
	layer_range: [0, 64]

	merge_method: slerp
	base_model: Darkknight535/Moonlight-L3-15B-v2-64k
	parameters:
	t:
	- filter: self_attn
	value: [0, 0.5, 0.3, 0.7, 1]
	- filter: mlp
	value: [1, 0.5, 0.7, 0.3, 0]
	- value: 0.5 # fallback for rest of tensors
	dtype: bfloat16
	```