nsfw

Visual novel

roleplay

mergekit

Merge

conversational

Eval Results

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

ChatWaifu_22B_v2.0_preview / README.md

spow12

Adding Evaluation Results (#1)

74ee746 verified about 1 month ago

preview code

raw

history blame contribute delete

7.24 kB

	---
	language:
	- en
	- ja
	license: cc-by-nc-4.0
	library_name: transformers
	tags:
	- nsfw
	- Visual novel
	- roleplay
	- mergekit
	- merge
	base_model:
	- mistralai/Mistral-Small-Instruct-2409
	datasets:
	- roleplay4fun/aesir-v1.1
	- kalomaze/Opus_Instruct_3k
	- Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
	- Aratako/Synthetic-Japanese-Roleplay-gpt-4o-mini-39.6k-formatted
	- Aratako/Synthetic-Japanese-Roleplay-NSFW-Claude-3.5s-15.3k-formatted
	- SkunkworksAI/reasoning-0.01
	pipeline_tag: text-generation
	model-index:
	- name: ChatWaifu_22B_v2.0_preview
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: IFEval (0-Shot)
	type: HuggingFaceH4/ifeval
	args:
	num_few_shot: 0
	metrics:
	- type: inst_level_strict_acc and prompt_level_strict_acc
	value: 67.45
	name: strict accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=spow12/ChatWaifu_22B_v2.0_preview
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: BBH (3-Shot)
	type: BBH
	args:
	num_few_shot: 3
	metrics:
	- type: acc_norm
	value: 45.49
	name: normalized accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=spow12/ChatWaifu_22B_v2.0_preview
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MATH Lvl 5 (4-Shot)
	type: hendrycks/competition_math
	args:
	num_few_shot: 4
	metrics:
	- type: exact_match
	value: 16.31
	name: exact match
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=spow12/ChatWaifu_22B_v2.0_preview
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GPQA (0-shot)
	type: Idavidrein/gpqa
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 8.72
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=spow12/ChatWaifu_22B_v2.0_preview
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MuSR (0-shot)
	type: TAUR-Lab/MuSR
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 3.53
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=spow12/ChatWaifu_22B_v2.0_preview
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU-PRO (5-shot)
	type: TIGER-Lab/MMLU-Pro
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 33.2
	name: accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=spow12/ChatWaifu_22B_v2.0_preview
	name: Open LLM Leaderboard
	---

	# Model Card for Model ID

	![image](./cover_2.png)

	Merged model using [mergekit](https://github.com/arcee-ai/mergekit/tree/main/mergekit)

	This model aimed to act like visual novel character.

	## Merge Format

	```yaml
	models:
	- model: mistralai/Mistral-Small-Instruct-2409_SFT
	layer_range: [0, 56]
	- model: mistralai/Mistral-Small-Instruct-2409
	layer_range: [0, 56]
	merge_method: slerp
	base_model: mistralai/Mistral-Small-Instruct-2409_SFT
	parameters:
	t:
	- filter: self_attn
	value: [0, 0.5, 0.3, 0.7, 1]
	- filter: mlp
	value: [1, 0.5, 0.7, 0.3, 0]
	- value: 0.5 # fallback for rest of tensors
	dtype: bfloat16
	```

	# WaifuModel Collections

	- [TTS](https://huggingface.co/spow12/visual_novel_tts)
	- [Chat](https://huggingface.co/spow12/ChatWaifu_22B_v2.0)
	- [ASR](https://huggingface.co/spow12/Visual-novel-transcriptor)

	# Unified demo

	[WaifuAssistant](https://github.com/yw0nam/WaifuAssistant)

	# Update 2.0

	- 2024.09.23 Update 22B, Ver 2.0


	## Model Details

	### Model Description

	- Developed by: spow12(yw_nam)
	- Shared by : spow12(yw_nam)
	- Model type: CausalLM
	- Language(s) (NLP): japanese. English
	- Finetuned from model : [mistralai/Mistral-Small-Instruct-2409](https://huggingface.co/mistralai/Mistral-Small-Instruct-2409)

	Currently, chatbot has below personality.

	character \| visual_novel \|
	--- \| --- \|
	ムラサメ \| Senren＊Banka \|
	茉子 \| Senren＊Banka \|
	芳乃 \| Senren＊Banka \|
	レナ \| Senren＊Banka \|
	千咲 \| Senren＊Banka \|
	芦花 \| Senren＊Banka \|
	愛衣 \| Café Stella and the Reaper's Butterflies \|
	栞那 \| Café Stella and the Reaper's Butterflies \|
	ナツメ \| Café Stella and the Reaper's Butterflies \|
	希 \| Café Stella and the Reaper's Butterflies \|
	涼音 \| Café Stella and the Reaper's Butterflies \|
	あやせ \| Riddle Joker \|
	七海 \| Riddle Joker \|
	羽月 \| Riddle Joker \|
	茉優 \| Riddle Joker \|
	小春 \| Riddle Joker \|

	But you can chat your own Character with persona text.

	Feel free to test.

	Your feedback will be helpful for improving model.
	### Dataset

	Riddle Joker(Prviate)

	Café Stella and the Reaper's Butterflies(Private)

	Senren＊Banka(Private)

	roleplay4fun/aesir-v1.1

	kalomaze/Opus_Instruct_3k

	Gryphe/Sonnet3.5-SlimOrcaDedupCleaned

	Aratako/Synthetic-JP-EN-Coding-Dataset-567k (only using 50000 sample)

	Aratako/Synthetic-Japanese-Roleplay-gpt-4o-mini-39.6k-formatted

	Aratako/Synthetic-Japanese-Roleplay-NSFW-Claude-3.5s-15.3k-formatted

	SkunkworksAI/reasoning-0.01

	### Feature

	- Fluent Chat performance
	- Reduce repetition problem when generate with many turn(over 20~30)
	- Zero Shot character persona using description of character.
	- 128k context window
	- Memory ability that does not forget even after long-context generation

	## Demo

	You can use Demo in google colab.

	Check [Here](https://colab.research.google.com/drive/194_FN28reEPTwS51dwpLLBBwEfeoBjP9?usp=sharing)

	## Bias, Risks, and Limitations

	This model can generate NSFW content.

	## Use & Credit

	This model is currently available for non-commercial & Research purpose only.

	Also, since I'm not detailed in licensing, I hope you use this model responsibly.

	By sharing this model, I hope to contribute to the research efforts of our community (the open-source community and Waifu Lovers).


	## Citation

	```bibtex
	@misc {ChatWaifu_22B_v2.0
	author = { YoungWoo Nam },
	title = { ChatWaifu_22B_v2.0_preview },
	year = 2024,
	url = { https://huggingface.co/spow12/ChatWaifu_22B_v2.0_preview },
	publisher = { Hugging Face }
	}
	```
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_spow12__ChatWaifu_22B_v2.0_preview)

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \|29.12\|
	\|IFEval (0-Shot) \|67.45\|
	\|BBH (3-Shot) \|45.49\|
	\|MATH Lvl 5 (4-Shot)\|16.31\|
	\|GPQA (0-shot) \| 8.72\|
	\|MuSR (0-shot) \| 3.53\|
	\|MMLU-PRO (5-shot) \|33.20\|