Sao10K
/

MN-BackyardAI-Party-12B-v1

PyTorch

English

mistral

Sao10K committed on Oct 1

Commit

205bdc1

1 Parent(s): 5f21692

Create README.md

README.md +213 -0

	@@ -0,0 +1,213 @@

+---
+language:
+- en
+license: cc-by-nc-4.0
+---
+Trained with compute from [Backyard.ai](https://backyard.ai/) | Thanks to them and **@dynafire** for helping me out.
+Trained on 2x A100 SXM 40GB as an 8-bit LoRA.
+---
+This is a group-chat based roleplaying model, based off of [12B-Lyra-v4a2](https://huggingface.co/Sao10K/MN-12B-Lyra-v4a2), a variant of Lyra-v4 that is currently private.
+It is trained on an entirely human-written dataset drawn from forum / internet group roleplaying styles. The only augmentation is to the system prompt, so that various character sheets fit within context.
+---
+# Formatting:
+The multi-character roleplaying format was trained with a variant of ChatML in which the ChatML tokens are replaced with `[INST]` / `[/INST]` blocks, formatted as shown below. Use this format to draw on more of the group-chat training.
+```
+[INST]system
+System Prompt Here[/INST]
+[INST]user
+User's Yapping[/INST]
+[INST]model
+Model Reply[/INST]
+```
+**Relevant!**
+<br> \- Turns do not need to respect `user -> model -> user`. Training used disjointed turns, including consecutive turns from the same role, to simulate real group roleplay / chat scenarios with multiple users.
+<br> \- Additional work may be required to fit your front-end.
+<br> \- Ideally, all character cards are included in the turns. Training was done with this in mind. See the relevant information further down this page.
+<br> \- This is a Nemo model, so a lower Temperature and a sprinkling of min_p help.
+<br> \- This does require a lot of tinkering to fit within SillyTavern / other frontends.
+For better performance in regular one-on-one roleplay or chat scenarios, use ChatML instead to get more of Lyra's performance.
+```
+<|im_start|>system
+System Prompt Here.<|im_end|>
+<|im_start|>user
+User's Instructions<|im_end|>
+<|im_start|>assistant
+Model Response<|im_end|>
+```
+**For best results, set both `<|im_end|>` and `[INST]` as stopping strings.**
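For front-ends that cannot set stopping strings natively, the same effect can be approximated on the client side. The sketch below is a minimal illustration, not part of any library: `truncate_at_stop` is a hypothetical helper, and stripping a trailing `[/INST]` is an assumption based on the turn format shown above.

```python
from typing import Sequence

# The two stopping strings recommended above.
DEFAULT_STOPS = ("<|im_end|>", "[INST]")


def truncate_at_stop(text: str, stop_strings: Sequence[str] = DEFAULT_STOPS) -> str:
    """Cut generated text at the earliest stop string (hypothetical helper)."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    reply = text[:cut].rstrip()
    # Assumption: drop the closing tag the model emits at the end of its own turn.
    if reply.endswith("[/INST]"):
        reply = reply[: -len("[/INST]")].rstrip()
    return reply
```

If the backend already supports stop strings, passing the two strings there is simpler and also saves the wasted tokens.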
+---
+# Dataset Information:
+This dataset is made from a human RP forum source, trimmed down, augmented and reformatted to fit.
+<br> \- Each entry has a minimum of 6 turns to be included.
+<br> \- The number of unique/main characters ranges from 2 to 7 per entry.
+<br> \- Each conversation is kept as is to preserve the quality and uniqueness of the human data.
+<br> \- Only the added system prompt makes use of the character sheets given.
+Below is the format used for the current character cards / sheets, which were augmented from the messy, non-uniform character sheets available. For best results, reformat your character data to match the format below, or get as close to it as you can.
+```
+- **Character Name**:
+- **Age**:
+- **Race**:
+- **Mageblood Type**: (if applicable)
+- **Favored Magic Class**: (if applicable)
+- **Previous Magic Training**: (if applicable)
+- **Occupation/Profession**: (if applicable)
+- **Appearance**: (if applicable)
+- **Biography**: (if applicable)
+- **Good Attributes**: (if applicable)
+- **Bad Attributes**: (if applicable)
+- **Equipment**: (if applicable)
+- **Other Information**: (if applicable)
+```
+Here is an example based on the above format:
+```
+**Character Name**: Keri Wolf
+**Age**: 21
+**Race**: Vampire
+**Mageblood Type**: Hydromancy
+**Favored Magic Class**: Aqua
+**Previous Magic Training**: Novice
+**Occupation/Profession**: None specified
+**Appearance**:
+- Height: 5'9"
+- A wooden wolf necklace around her neck, contrasting with her pale skin
+- Three swords strapped to her waist
+- A tattoo of a thorn vine, her family crest, on her right arm
+- Normal eye color is red but changes based on her mood or the topic of conversation
+- Carries a hunk of wood and a carving knife for personal activities
+**Biography**:
+Keri Wolf grew up in a family of adopted siblings in Djarkel. She had a normal childhood, with her best friend Satori, and was taught basic self-defense by her father. Her brothers were considered troublemakers but remained close to her. On her 21st birthday, her family was slaughtered by a vampire nest, and she was bitten. This led to her developing vampiric traits and seeking answers at the college.
+**Good Attributes**:
+- Easy-going
+- Observant
+- Helps those in trouble
+- Soft-hearted
+- Kind
+- Cool-headed
+- Good at getting out of difficult situations
+- Avoids violence
+- Gets along well with different people
+- Loves animals
+**Bad Attributes**:
+- Sunlight sensitivity
+- Hatred towards vampires outside the college
+- Keeps feelings in check, leading to dangerous outbursts
+- Cruel manner of speaking
+- Thirst for revenge
+**Equipment**:
+- Wooden wolf necklace
+- Three swords (one engraved with a rose, one engraved with her father's name, and one for decoration)
+- Carving knife
+- Hunk of wood
+- Stealth Ring
+- Knight's Shield
+**Other Information**:
+- Secret word: rebirth
+```
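If character data is stored as structured fields, it can be rendered into this layout programmatically. The following is a rough sketch that follows the layout of the Keri Wolf example (bold field name, list values as `- ` bullets). `render_sheet` and `FIELD_ORDER` are hypothetical names, and skipping absent fields is an assumption drawn from the "(if applicable)" notes in the template.

```python
from typing import Mapping, Sequence, Union

# Field order taken from the template above.
FIELD_ORDER = (
    "Character Name", "Age", "Race", "Mageblood Type", "Favored Magic Class",
    "Previous Magic Training", "Occupation/Profession", "Appearance",
    "Biography", "Good Attributes", "Bad Attributes", "Equipment",
    "Other Information",
)

SheetValue = Union[str, int, Sequence[str]]


def render_sheet(sheet: Mapping[str, SheetValue]) -> str:
    """Render a character dict into the sheet layout (hypothetical helper)."""
    lines = []
    for field in FIELD_ORDER:
        if field not in sheet:
            continue  # assumption: "(if applicable)" fields may simply be omitted
        value = sheet[field]
        if isinstance(value, (list, tuple)):
            # List-valued fields become a header line followed by bullets.
            lines.append(f"**{field}**:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"**{field}**: {value}")
    return "\n".join(lines)
```

For example, `render_sheet({"Character Name": "Keri Wolf", "Age": 21, "Race": "Vampire"})` yields the first three lines of the example sheet.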
+The following system prompt is augmented from available character sheets, or details from the original dataset. Placeholder names are given as shown.
+```
+You are involved in a multi-character internet-style roleplaying session with a human user, who is playing as Ballbuster Steve. Do not generate dialogue for the user's character, Ballbuster Steve. Focus on the other characters.
+[Human User]
+Ballbuster Steve # {user}
+Character Bio: [Steve's bio]
+[Involved Characters]
+Altair "Arty" Enzo # {char1}
+Character Bio: [Arty's bio]
+---
+Sukuna Gojo # {char2}
+Character Bio: [Sukuna's bio]
+---
+The roleplay begins now.
+```
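A system prompt in this shape can be assembled from a user character and a list of other characters. The sketch below is one possible way to do it. `build_system_prompt` is a hypothetical helper, and it assumes the `# {user}` / `# {char1}` markers in the template are annotations for the reader rather than text the model should see.

```python
from typing import Sequence, Tuple


def build_system_prompt(
    user_name: str,
    user_bio: str,
    characters: Sequence[Tuple[str, str]],
) -> str:
    """Assemble the group-roleplay system prompt (hypothetical helper).

    `characters` is a sequence of (name, bio) pairs for the non-user characters.
    """
    lines = [
        "You are involved in a multi-character internet-style roleplaying "
        f"session with a human user, who is playing as {user_name}. "
        f"Do not generate dialogue for the user's character, {user_name}. "
        "Focus on the other characters.",
        "[Human User]",
        user_name,
        f"Character Bio: {user_bio}",
        "[Involved Characters]",
    ]
    for name, bio in characters:
        # Each character entry is closed with a '---' separator, as in the template.
        lines += [name, f"Character Bio: {bio}", "---"]
    lines.append("The roleplay begins now.")
    return "\n".join(lines)
```

The bios passed in would typically be character sheets in the format described earlier on this page.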
+This is what some example turns look like. The newlines are for readability only.
+```
+[INST]user
+Ballbuster Steve: Being the doorman at a nightclub, especially one as popular as LUSH... [/INST]
+[INST]model
+Altair "Arty" Enzo: While he was waiting for Jake to answer, Arty noticed from the corner of his eye... [/INST]
+[INST]model
+Sukuna Gojo: Nick was now out of his element; he just came off his portable radio app... [/INST]
+[INST]user
+Ballbuster Steve: Steve grabbed his black clutch from where it was stashed under the mixing desk... [/INST]
+```
+To make it easier, this is how I'd format responses for the backend:
+```
+<s>[INST]system
+{system_prompt}[/INST]
+[INST]user
+{user}: {text}[/INST]
+[INST]model
+{char1}: {text}[/INST]
+[INST]model
+{char2}: {text}[/INST]
+[INST]user
+{user}: {text}[/INST]
+[INST]model
+{char1}: {text}[/INST]<|im_end|> # For Final Turn only. Alternatively, set <|im_end|> as a stopping string.
+```
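The template above can be turned into a small prompt builder. This is a sketch under stated assumptions rather than a reference implementation: `format_turn` and `build_prompt` are hypothetical names, turns are joined with a newline as laid out in the template (adjust `sep` if your backend expects none), and opening an unfinished `[INST]model` block to cue the next speaker is an assumption, not something the card specifies.

```python
from typing import Optional, Sequence, Tuple

Turn = Tuple[str, str, str]  # (role, speaker name, text); role is "user" or "model"


def format_turn(role: str, body: str) -> str:
    """Wrap one turn in an [INST] block as shown in the template."""
    return f"[INST]{role}\n{body}[/INST]"


def build_prompt(
    system_prompt: str,
    turns: Sequence[Turn],
    next_speaker: Optional[str] = None,
    add_bos: bool = True,
    sep: str = "\n",
) -> str:
    """Render the system prompt and chat history (hypothetical helper)."""
    blocks = [format_turn("system", system_prompt)]
    for role, speaker, text in turns:
        # Consecutive turns with the same role are allowed in this format.
        blocks.append(format_turn(role, f"{speaker}: {text}"))
    prompt = sep.join(blocks)
    if next_speaker is not None:
        # Assumption: leave a model turn open so generation continues as that character.
        prompt += f"{sep}[INST]model\n{next_speaker}:"
    # Set add_bos=False if the backend's tokenizer already prepends <s>.
    return ("<s>" if add_bos else "") + prompt
```

Pair this with the stopping strings recommended above so generation ends after a single turn.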
+---
+# Current Issues:
+```
+- Impersonation - This is a common side-effect of pure human roleplaying data, unfortunately.
+    Users do like writing the actions of others, though this is more limited to end of reply.
+- Varied Output Quality - A swipe should be enough?
+    I only removed obviously bad entries. Output quality varies thanks to the variety of human users involved.
+- Random OOC / Story Break moments may still exist despite me filtering the data.
+- Limited Dataset Size -> 4K Varied Samples ranging from 2-7 characters per entry. I'm looking to expand.
+- Limited System Prompt? -> I'm trying to improve on this.
+- Fantasy-bias? -> Most of the entries are fantasy-based after all.
+```
+---
+# Training Metrics:
+```
+n_sample: 4000
+n_gpu: 2
+global batch size: 12
+lora: bnb_8bit
+no. epochs: 3
+lr: 0.000004
+lr_scheduler: cosine
+deepspeed: zero2
+```
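As a rough sanity check on these numbers, the per-GPU batch and the approximate optimizer step count can be derived from the values listed. The sketch assumes one sample per sequence (no packing) and that the last partial batch of each epoch is kept. Neither is stated in the card, so treat the step counts as estimates.

```python
import math


def training_steps(n_sample: int, global_batch: int, n_gpu: int, epochs: int):
    """Return (per-GPU batch, steps per epoch, total steps) under the stated assumptions."""
    per_gpu_batch = global_batch // n_gpu  # micro-batch x gradient accumulation, per GPU
    steps_per_epoch = math.ceil(n_sample / global_batch)  # keeps the last partial batch
    return per_gpu_batch, steps_per_epoch, steps_per_epoch * epochs


per_gpu, per_epoch, total = training_steps(n_sample=4000, global_batch=12, n_gpu=2, epochs=3)
print(per_gpu, per_epoch, total)
```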