	---
	base_model:
	- tokyotech-llm/Swallow-70b-instruct-hf
	- nitky/Swallow-70b-NVE-RP
	tags:
	- mergekit
	- merge
	language:
	- en
	- ja
	library_name: transformers
	pipeline_tag: text-generation
	license: llama2
	model_type: llama
	---
	# Swallow-70b-RP

	Important Notice:

	For personal and academic use only.

	## Description

	This model is suited to role-playing and storytelling, but it does not perform well in multi-turn chat.

	It was created for personal and academic use only. The merge uses only fine-tuned derivatives of Llama 2, but the commercial-use terms of some of the source models are unclear.

	If you are a rights holder and see a licensing problem, please contact me directly. The license will not be changed in response to requests from anyone else.

	## Test environment

	This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). I used the `simple-1` and `Null preset` presets for generation.

	### Recommendation

	Use `simple-1` settings:
	- temperature: 0.7
	- top_p: 0.9
	- repetition_penalty: 1.15
	- top_k: 20
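As a rough illustration of what these settings control, the sketch below applies temperature scaling, top-k filtering, and top-p (nucleus) filtering to a toy list of logits in plain Python. It is a simplified sketch, not the code path used by text-generation-webui or transformers: the function name `filter_distribution` is hypothetical, the order of operations is an assumption, and `repetition_penalty` is left out.

```python
import math


def filter_distribution(logits, temperature=0.7, top_k=20, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only; real samplers differ in detail.
    """
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable tokens, then renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]
    print(filter_distribution(toy_logits))
```

Lower `temperature`, `top_k`, or `top_p` values all narrow the set of candidate tokens, which is why the tested ranges below matter for output variety.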

	### Tested `temperature` Range

	- temperature: 0.3 - 1.0

	### Tested `repetition_penalty` Range

	- repetition_penalty: 1.0 - 1.15

	## Prompt template

	### Swallow Style (Alpaca format)

	```
	以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。リクエストを適切に完了するための回答を記述してください。

	### 指示:
	{instruction}

	### 応答:

	```

	Although not fully tested, the prompt styles of [Doctor-Shotgun/lzlv-limarpv3-l2-70b](https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b) and [alac/Waxwing-Storytelling-70B-LoRA](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA) can also be used.

	## Use the instruct model

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model_name = "nitky/Swallow-70b-RP"

	tokenizer = AutoTokenizer.from_pretrained(model_name)
	# 4-bit loading requires the bitsandbytes package
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype=torch.bfloat16,
	    low_cpu_mem_usage=True,
	    device_map="auto",
	    load_in_4bit=True,
	)

	# Alpaca-style templates in Japanese, with and without an extra input field
	PROMPT_DICT = {
	    "prompt_input": (
	        "以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。"
	        "リクエストを適切に完了するための回答を記述してください。\n\n"
	        "### 指示:\n{instruction}\n\n### 入力:\n{input}\n\n### 応答:"
	    ),
	    "prompt_no_input": (
	        "以下に、あるタスクを説明する指示があります。"
	        "リクエストを適切に完了するための回答を記述してください。\n\n"
	        "### 指示:\n{instruction}\n\n### 応答:"
	    ),
	}

	def create_prompt(instruction, input=None):
	    """
	    Generates a prompt based on the given instruction and an optional input.
	    If input is provided, it uses the 'prompt_input' template from PROMPT_DICT.
	    If no input is provided, it uses the 'prompt_no_input' template.

	    Args:
	        instruction (str): The instruction describing the task.
	        input (str, optional): Additional input providing context for the task. Default is None.

	    Returns:
	        str: The generated prompt.
	    """
	    if input:
	        # Use the 'prompt_input' template when additional input is provided
	        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input)
	    else:
	        # Use the 'prompt_no_input' template when no additional input is provided
	        return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)

	# Example usage
	# Instruction: "Please provide detailed information on the following topic."
	instruction_example = "以下のトピックに関する詳細な情報を提供してください。"
	# Input: "Tell me about the main campuses of Tokyo Institute of Technology."
	input_example = "東京工業大学の主なキャンパスについて教えてください"
	prompt = create_prompt(instruction_example, input_example)

	input_ids = tokenizer.encode(
	    prompt,
	    add_special_tokens=False,
	    return_tensors="pt"
	)

	# Sampling settings match the recommended `simple-1` preset
	tokens = model.generate(
	    input_ids.to(device=model.device),
	    max_new_tokens=200,
	    temperature=0.7,
	    top_p=0.9,
	    repetition_penalty=1.15,
	    top_k=20,
	    do_sample=True,
	)

	out = tokenizer.decode(tokens[0], skip_special_tokens=True)
	print(out)
	```

	## Merge Details
	### Merge Method

	This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) and SLERP merge methods, with [tokyotech-llm/Swallow-70b-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-70b-instruct-hf) as the base model.
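For readers unfamiliar with these methods, here is a minimal plain-Python sketch of two of the underlying ideas: spherical linear interpolation (SLERP) between two weight vectors, and the DARE drop-and-rescale step applied to a task vector (fine-tuned weights minus base weights). The function names `slerp` and `dare_rescale` are hypothetical, the TIES sign-election step is omitted, and this is not mergekit's implementation.

```python
import math
import random


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # Direction is undefined for a zero vector: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel vectors: linear interpolation is numerically safer.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def dare_rescale(delta, density, rng):
    """Keep each task-vector entry with probability `density`, rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


if __name__ == "__main__":
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
    # With density 1 nothing is dropped and the task vector is unchanged.
    print(dare_rescale([0.2, -0.1, 0.05], density=1.0, rng=random.Random(0)))
```

Under this reading, a `density` of 1 keeps every entry of the task vector, so the random dropping has no effect and only the per-tensor weights shape the result.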

	### Models Merged

	The following models were included in the merge:
	* [GOAT-AI/GOAT-70B-Storytelling](https://huggingface.co/GOAT-AI/GOAT-70B-Storytelling)
	* [dreamgen/opus-v0.5-70b](https://huggingface.co/dreamgen/opus-v0.5-70b)
	* [Doctor-Shotgun/lzlv-limarpv3-l2-70b](https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b)
	* [alac/Waxwing-Storytelling-70B-LoRA](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA) (LoRA)

	### Configuration

	Example command:

	```bash
	# please change the path and options according to your environment
	mergekit-mega --cuda Swallow-70b-RP.yml ~/text-generation-webui/models
	```

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	  - model: tokyotech-llm/Swallow-70b-instruct-hf
	    # no parameters necessary for base model
	  - model: nitky/Swallow-70b-NVE-RP
	    parameters:
	      density: 1
	      weight:
	        - filter: mlp
	          value: 0.1
	        - filter: self_attn
	          value: 0.4
	        - value: 0 # fallback for rest of tensors.
	merge_method: dare_ties
	base_model: tokyotech-llm/Swallow-70b-instruct-hf
	dtype: bfloat16
	tokenizer_source: union
	name: Swallow-70b-RP-base
	---
	models:
	  - model: tokyotech-llm/Swallow-70b-instruct-hf
	    # no parameters necessary for base model
	  - model: nitky/Swallow-70b-NVE-RP
	    parameters:
	      density: 1
	      weight:
	        - filter: mlp
	          value: [0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.1]
	        - filter: self_attn
	          value: [0.4, 0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.4, 0.4]
	        - value: 0 # fallback for rest of tensors.
	merge_method: dare_ties
	base_model: tokyotech-llm/Swallow-70b-instruct-hf
	dtype: bfloat16
	tokenizer_source: union
	name: Swallow-70b-RP-flavor
	---
	slices:
	  - sources:
	      - model: Swallow-70b-RP-base
	        layer_range: [0, 80]
	      - model: Swallow-70b-RP-flavor
	        layer_range: [0, 80]
	merge_method: slerp
	base_model: Swallow-70b-RP-base
	parameters:
	  t: # model stabilization
	    - filter: self_attn
	      value: [0, 0.5, 0.3, 0.7, 1]
	    - filter: mlp
	      value: [1, 0.5, 0.7, 0.3, 0]
	    - value: 0.5 # fallback for rest of tensors
	dtype: bfloat16
	name: Swallow-70b-RP

	```
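List-valued parameters in the configuration, such as `[0, 0.5, 0.3, 0.7, 1]`, act as a gradient across the layer stack rather than as a single value. The sketch below shows one way such a list could be resolved to a per-layer value. It assumes the anchor points are spaced evenly over the layers, that values between them are linearly interpolated, and that a layer's position is `layer_index / (num_layers - 1)`. The function name `gradient_value` is hypothetical; the mergekit documentation is the authority on the exact behaviour.

```python
def gradient_value(values, layer_index, num_layers):
    """Linearly interpolate a list of anchor values at a given layer position.

    Hypothetical helper: the mapping from layer index to position is an assumption.
    """
    # Relative depth of the layer in [0, 1]; a single-layer model sits at 0.
    t = layer_index / (num_layers - 1) if num_layers > 1 else 0.0

    # Locate the two neighbouring anchor points and blend between them.
    scaled = t * (len(values) - 1)
    lower = int(scaled)
    upper = min(lower + 1, len(values) - 1)
    frac = scaled - lower
    return (1 - frac) * values[lower] + frac * values[upper]


if __name__ == "__main__":
    t_self_attn = [0, 0.5, 0.3, 0.7, 1]  # the self_attn `t` gradient from the SLERP step
    for layer in (0, 20, 40, 60, 79):
        print(layer, gradient_value(t_self_attn, layer, num_layers=80))
```

Read this way, the SLERP step takes `self_attn` tensors mostly from `Swallow-70b-RP-base` in the early layers and mostly from `Swallow-70b-RP-flavor` in the late layers, with the `mlp` gradient running in the opposite direction.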