	---
	language:
	- en
	license: cc-by-nc-4.0
	---

	Trained with compute from [Backyard.ai](https://backyard.ai/) \| Thanks to them and @dynafire for helping me out.

	Trained on 2x A100 SXM 40GB as an 8-bit LoRA.

	---

	This is a group-chat-based roleplaying model built on [12B-Lyra-v4a2](https://huggingface.co/Sao10K/MN-12B-Lyra-v4a2), a variant of Lyra-v4 that is currently private.

	It is trained on an entirely human-written dataset, drawn from forum / internet group roleplaying. The only LLM augmentation is to the character sheets, which were reworked so that several sheets fit into the system prompt and within context.

	This model is still capable of 1-on-1 roleplay, though I recommend using ChatML for that instead.

	---

	# Formatting:

	The multi-character roleplaying format is trained with a variant of ChatML in which the ChatML tokens are replaced with `[INST]` blocks, formatted as shown below. Use this format to draw on more of the group-chat training.
	```
	[INST]system
	System Prompt Here[/INST]
	[INST]user
	User's Yapping[/INST]
	[INST]model
	Model Reply[/INST]
	```

	Relevant!
	<br> \- Turns do not need to alternate `user -> model -> user`. Training is done with out-of-order turns, including consecutive turns from the same role, to simulate real group roleplay / chat scenarios with multiple users.
	<br> \- Additional work may be required to fit for your front-end.
	<br> \- Ideally, all character cards are included in the turns. Training is done with this in mind. See the Dataset Information section below for the relevant details.
	<br> \- This is a Nemo model, so a lower Temperature and a sprinkling of min_p help.
	<br> \- This does require a lot of tinkering to fit within SillyTavern / other frontends.

	For regular 1-on-1 roleplay or chat scenarios, use ChatML instead to get more of Lyra's performance.
	```
	<|im_start|>system
	System Prompt Here.<|im_end|>
	<|im_start|>user
	User's Instructions<|im_end|>
	<|im_start|>assistant
	Model Response<|im_end|>
	```
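
For frontends that need the ChatML prompt assembled by hand, here is a minimal sketch of the template above. The function name `format_chatml` and the `add_generation_prompt` flag are illustrative, not part of any library, and the newline between turns is assumed from how the template is laid out.

```python
def format_chatml(system_prompt, turns, add_generation_prompt=True):
    """Render (role, text) pairs as a ChatML prompt string.

    `turns` is a list of ("user" | "assistant", text) tuples.
    """
    parts = [f"<|im_start|>system\n{system_prompt}<|im_end|>"]
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    prompt = "\n".join(parts)
    if add_generation_prompt:
        # Leave an open assistant header so the model writes the next reply.
        prompt += "\n<|im_start|>assistant\n"
    return prompt


print(format_chatml("System Prompt Here.", [("user", "User's Instructions")]))
```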

	For best results, set both `<|im_end|>` and `[INST]` as stopping strings.
	Recommended Temperature is below 1, with a min_p of at least 0.1.
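
If your backend does not support stopping strings, the same effect can be approximated after generation. The sketch below is only an illustration: `SAMPLER_SETTINGS` and `clean_reply` are hypothetical names, the key names differ between backends, `0.8` is just one example of a temperature below 1, and removing a trailing `[/INST]` is an assumption about how the raw output ends.

```python
# Hypothetical settings; rename the keys to whatever your backend expects.
SAMPLER_SETTINGS = {
    "temperature": 0.8,  # example value; the recommendation is anything below 1
    "min_p": 0.1,        # at least 0.1
    "stop": ["<|im_end|>", "[INST]"],
}


def clean_reply(text, stop_strings=("<|im_end|>", "[INST]")):
    """Cut raw model output at the earliest stop string and tidy the end."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    reply = text[:cut].strip()
    # Assumption: the reply may still end with the closing [/INST] tag.
    if reply.endswith("[/INST]"):
        reply = reply[: -len("[/INST]")].rstrip()
    return reply
```

Note that `[/INST]` does not contain `[INST]`, so the closing tag of the current turn never triggers the stop by itself.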

	---

	# Dataset Information:

	This dataset is made from a human RP forum source, trimmed down, augmented and reformatted to fit.
	<br> \- Each entry has at least 6 turns.
	<br> \- Each entry has between 2 and 7 unique/main characters.
	<br> \- Each conversation is kept as is to preserve quality and uniqueness of the human data.
	<br> \- Only the added system prompt makes use of the augmented character sheets.

	Below is the current Character Card / Sheet format, which is augmented from the messy and non-uniform character sheets available. For best results, reformat your character data to match the format below, or get as close to it as you can.
	```
	- Character Name:
	- Age:
	- Race:
	- Mageblood Type: (if applicable)
	- Favored Magic Class: (if applicable)
	- Previous Magic Training: (if applicable)
	- Occupation/Profession: (if applicable)
	- Appearance: (if applicable)
	- Biography: (if applicable)
	- Good Attributes: (if applicable)
	- Bad Attributes: (if applicable)
	- Equipment: (if applicable)
	- Other Information: (if applicable)
	```

	Here is an example based on the above format:

	```
	Character Name: Keri Wolf
	Age: 21
	Race: Vampire
	Mageblood Type: Hydromancy
	Favored Magic Class: Aqua
	Previous Magic Training: Novice
	Occupation/Profession: None specified

	Appearance:
	- Height: 5'9"
	- A wooden wolf necklace around her neck, contrasting with her pale skin
	- Three swords strapped to her waist
	- A tattoo of a thorn vine, her family crest, on her right arm
	- Normal eye color is red but changes based on her mood or the topic of conversation
	- Carries a hunk of wood and a carving knife for personal activities

	Biography:
	Keri Wolf grew up in a family of adopted siblings in Djarkel. She had a normal childhood, with her best friend Satori, and was taught basic self-defense by her father. Her brothers were considered troublemakers but remained close to her. On her 21st birthday, her family was slaughtered by a vampire nest, and she was bitten. This led to her developing vampiric traits and seeking answers at the college.

	Good Attributes:
	- Easy-going
	- Observant
	- Helps those in trouble
	- Soft-hearted
	- Kind
	- Cool-headed
	- Good at getting out of difficult situations
	- Avoids violence
	- Gets along well with different people
	- Loves animals

	Bad Attributes:
	- Sunlight sensitivity
	- Hatred towards vampires outside the college
	- Keeps feelings in check, leading to dangerous outbursts
	- Cruel manner of speaking
	- Thirst for revenge

	Equipment:
	- Wooden wolf necklace
	- Three swords (one engraved with a rose, one engraved with her father's name, and one for decoration)
	- Carving knife
	- Hunk of wood
	- Stealth Ring
	- Knight's Shield

	Other Information:
	- Secret word: rebirth
	```
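
If your character data lives in some structured form, a small helper can render it into the layout of the example above. This is a rough sketch, not the script used to build the dataset: `render_sheet`, `SIMPLE_FIELDS` and `SECTION_FIELDS` are made-up names, and the field order is taken from the template.

```python
SIMPLE_FIELDS = [
    "Character Name", "Age", "Race", "Mageblood Type",
    "Favored Magic Class", "Previous Magic Training", "Occupation/Profession",
]
SECTION_FIELDS = [
    "Appearance", "Biography", "Good Attributes",
    "Bad Attributes", "Equipment", "Other Information",
]


def render_sheet(data):
    """Render a dict of character data as a sheet in the example layout."""
    blocks = []
    header = [f"{key}: {data[key]}" for key in SIMPLE_FIELDS if key in data]
    if header:
        blocks.append("\n".join(header))
    for key in SECTION_FIELDS:
        value = data.get(key)
        if not value:
            continue  # "(if applicable)" fields are skipped when empty
        if isinstance(value, str):
            body = value  # free text, e.g. the biography paragraph
        else:
            body = "\n".join(f"- {item}" for item in value)
        blocks.append(f"{key}:\n{body}")
    return "\n\n".join(blocks)


print(render_sheet({
    "Character Name": "Keri Wolf",
    "Age": 21,
    "Race": "Vampire",
    "Bad Attributes": ["Sunlight sensitivity", "Thirst for revenge"],
}))
```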

	The following system prompt is augmented from available character sheets, or details from the original dataset. Placeholder names are given as shown.

	```
	You are involved in a multi-character internet-style roleplaying session with a human user, who is playing as Ballbuster Steve. Do not generate dialogue for the user's character, Ballbuster Steve. Focus on the other characters.

	[Human User]
	Ballbuster Steve # {user}
	Character Bio: [Steve's bio]

	[Involved Characters]
	Altair "Arty" Enzo # {char1}
	Character Bio: [Arty's bio]
	---
	Sukuna Gojo # {char2}
	Character Bio: [Sukuna's bio]
	---

	The roleplay begins now.
	```
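
The system prompt above can be filled in programmatically. The following is a sketch under a few assumptions: `build_system_prompt` is a hypothetical helper, and the `# {user}` / `# {char1}` markers are treated as annotations for the reader and left out of the generated text.

```python
def build_system_prompt(user_name, user_bio, characters):
    """Fill the group-roleplay system prompt.

    `characters` is a list of (name, bio) pairs for the model-played cast.
    """
    lines = [
        "You are involved in a multi-character internet-style roleplaying "
        f"session with a human user, who is playing as {user_name}. "
        f"Do not generate dialogue for the user's character, {user_name}. "
        "Focus on the other characters.",
        "",
        "[Human User]",
        user_name,
        f"Character Bio: {user_bio}",
        "",
        "[Involved Characters]",
    ]
    for name, bio in characters:
        # Each character block is closed by a '---' separator line.
        lines += [name, f"Character Bio: {bio}", "---"]
    lines += ["", "The roleplay begins now."]
    return "\n".join(lines)


print(build_system_prompt(
    "Ballbuster Steve", "[Steve's bio]",
    [('Altair "Arty" Enzo', "[Arty's bio]"), ("Sukuna Gojo", "[Sukuna's bio]")],
))
```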

	This is what some example turns look like. The blank lines between turns are only for readability.

	```
	[INST]user
	Ballbuster Steve: Being the doorman at a nightclub, especially one as popular as LUSH... [/INST]

	[INST]model
	Altair "Arty" Enzo: While he was waiting for Jake to answer, Arty noticed from the corner of his eye... [/INST]

	[INST]model
	Sukuna Gojo: Nick was now out of his element; he just came off his portable radio app... [/INST]

	[INST]user
	Ballbuster Steve: Steve grabbed his black clutch from where it was stashed under the mixing desk... [/INST]
	```
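
A frontend usually needs to go the other way too and split a log like the one above back into roles and speakers. The parser below is a sketch only: `parse_turns` is a hypothetical helper, and it assumes every user/model turn starts with `Name: `, so a turn without a name prefix but with a colon in its first sentence would be split in the wrong place.

```python
import re

# One turn: [INST]<role>\n<body>[/INST]
TURN_RE = re.compile(r"\[INST\](system|user|model)\n(.*?)\[/INST\]", re.DOTALL)


def parse_turns(prompt):
    """Split a prompt into (role, speaker, text) tuples.

    `speaker` is None for system turns and for turns without a name prefix.
    """
    turns = []
    for role, body in TURN_RE.findall(prompt):
        body = body.strip()
        speaker, sep, text = body.partition(": ")
        if role == "system" or not sep:
            turns.append((role, None, body))
        else:
            turns.append((role, speaker, text))
    return turns
```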

	To make it easier, this is how I'd format responses for the backend:

	```
	<s>[INST]system
	{system_prompt}[/INST]
	[INST]user
	{user}: {text}[/INST]
	[INST]model
	{char1}: {text}[/INST]
	[INST]model
	{char2}: {text}[/INST]
	[INST]user
	{user}: {text}[/INST]
	[INST]model
	{char1}: {text}[/INST]<|im_end|> # For Final Turn only. Alternatively, set <|im_end|> as a stopping string.
	```
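
As a starting point for the backend, the layout above can be assembled like this. It is a minimal sketch: `build_group_prompt` is a hypothetical helper, and pre-filling the next speaker's name to steer which character replies is my assumption, not something the training format requires. Drop the leading `<s>` if your tokenizer already adds a BOS token.

```python
def build_group_prompt(system_prompt, turns, next_speaker=None):
    """Assemble a group-chat prompt in the [INST] format.

    `turns` is a list of (role, speaker, text) tuples with role "user" or
    "model". Turns may repeat roles; strict alternation is not required.
    """
    parts = [f"<s>[INST]system\n{system_prompt}[/INST]"]
    for role, speaker, text in turns:
        if role not in ("user", "model"):
            raise ValueError(f"unexpected role: {role!r}")
        parts.append(f"[INST]{role}\n{speaker}: {text}[/INST]")
    prompt = "\n".join(parts)
    if next_speaker is not None:
        # Assumption: open a model turn and pre-fill the speaker name.
        prompt += f"\n[INST]model\n{next_speaker}:"
    return prompt


print(build_group_prompt(
    "{system_prompt}",
    [("user", "{user}", "{text}"), ("model", "{char1}", "{text}")],
    next_speaker="{char2}",
))
```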

	---

	# Current Issues:
	```
	- Impersonation - This is a common side-effect of pure human roleplaying data, unfortunately.
	Users do like writing the actions of others, though this is more limited to end of reply.
	- Varied Output Quality - A swipe should be enough?
	I only removed obviously bad entries. Output quality varies thanks to the variety of human users involved.
	- Character Detail Confusion when in group chats
	This rarely happens, but it is usually when there are too many main characters, or the bio is improperly formatted and separated.
	Or if you're using an additional, complex system prompt.
	- Random OOC / Story Break moments may still exist despite me filtering the data.
	- Limited Dataset Size -> 4K Varied Samples ranging from 2-7 characters per entry. I'm looking to expand.
	- Limited System Prompt? -> I'm trying to improve on this.
	- Fantasy-bias? -> Most of the entries are fantasy-based after all.
	```
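
One partial workaround for impersonation is to trim the reply on the frontend side. The sketch below is a crude filter, not a fix: `trim_impersonation` is a hypothetical helper that only catches the case where the model starts a new line labeled with the user's character name. It does nothing about user actions narrated inside another character's prose.

```python
def trim_impersonation(reply, user_name):
    """Drop everything from the first line that starts with '<user_name>:'."""
    kept = []
    for line in reply.splitlines():
        if line.strip().startswith(f"{user_name}:"):
            break  # the model started writing the user's turn
        kept.append(line)
    return "\n".join(kept).rstrip()
```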





	---

	# Training Metrics:
	```
	n_sample: 4000
	n_gpu: 2
	global batch size: 12
	lora: bnb_8bit
	no. epochs: 3
	lr: 0.000004
	lr_scheduler: cosine
	deepspeed: zero2
	```