Sao10K committed on
Commit
205bdc1
1 Parent(s): 5f21692

Create README.md

  1. README.md +213 -0
README.md ADDED
@@ -0,0 +1,213 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: cc-by-nc-4.0
5
+ ---
6
+
7
+ Trained with compute from [Backyard.ai](https://backyard.ai/) | Thanks to them and **@dynafire** for helping me out.
8
+
9
+ Trained on 2x A100 SXM 40GB as an 8-bit LoRA.
10
+
11
+ ---
12
+
13
+ This is a group-chat roleplaying model based on [12B-Lyra-v4a2](https://huggingface.co/Sao10K/MN-12B-Lyra-v4a2), a variant of Lyra-v4 that is currently private.
14
+
15
+ It is trained on an entirely human-written dataset drawn from forum / internet group roleplaying. The only augmentation is to the system prompt, which is rewritten to fit the various character sheets within context.
16
+
17
+ ---
18
+
19
+ # Formatting:
20
+
21
+ The multi-character roleplaying format is trained with a variant of ChatML in which the ChatML markers are replaced with `[INST]` / `[/INST]` blocks, formatted as shown below. Use this format to draw on more of the group-chat training.
22
+ ```
23
+ [INST]system
24
+ System Prompt Here[/INST]
25
+ [INST]user
26
+ User's Yapping[/INST]
27
+ [INST]model
28
+ Model Reply[/INST]
29
+ ```
30
+
31
+ **Relevant!**
32
+ <br> \- Turns do not need to alternate `user -> model -> user`. Training is done with disjointed turns, including consecutive turns from the same role, to simulate real group roleplay / chat scenarios with multiple users.
33
+ <br> \- Additional work may be required to fit your front-end.
34
+ <br> \- Ideally, the character cards of everyone involved are all included in the turns. Training is done with this in mind. The Dataset Information section further down the page has the relevant details.
35
+ <br> \- This is a Nemo model, so a lower temperature and a sprinkling of `min_p` help.
36
+ <br> \- This does require a lot of tinkering to fit within SillyTavern / other frontends.
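
As a rough illustration of the sampler note above, here is a minimal pure-Python sketch of temperature scaling followed by `min_p` filtering (keep only tokens whose probability is at least `min_p` times the top token's probability). The function name and the default values `0.7` and `0.05` are placeholders chosen for illustration, not settings recommended by this card, and real backends may apply these samplers in a different order.

```python
import math

def min_p_filter(logits, temperature=0.7, min_p=0.05):
    """Return renormalised probabilities after temperature scaling and min_p filtering.

    temperature=0.7 and min_p=0.05 are illustrative placeholders, not values from the card.
    """
    # Lower temperature sharpens the distribution before filtering.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # min_p: drop every token whose probability is below min_p * (top probability).
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]
```

Because the threshold scales with the top token's probability, the filter is strict when the model is confident and permissive when it is not.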
37
+
38
+ For regular 1-on-1 roleplay or chat scenarios, use ChatML instead to draw on more of Lyra's performance.
39
+ ```
40
+ <|im_start|>system
41
+ System Prompt Here.<|im_end|>
42
+ <|im_start|>user
43
+ User's Instructions<|im_end|>
44
+ <|im_start|>assistant
45
+ Model Response<|im_end|>
46
+ ```
47
+
48
+ **For best results, set both `<|im_end|>` and `[INST]` as stopping strings.**
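
If your backend does not support stopping strings natively, the same effect can be approximated after generation. The sketch below is an assumption about how a front-end might do it, not part of the model's tooling; `truncate_at_stop` is a hypothetical helper name.

```python
# The two stopping strings named above.
STOP_STRINGS = ("<|im_end|>", "[INST]")

def truncate_at_stop(text, stop_strings=STOP_STRINGS):
    """Cut generated text at the earliest occurrence of any stopping string."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

Note that `[INST]` does not match inside a closing `[/INST]`, so a trailing `[/INST]` may remain in the output and might need trimming separately.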
49
+
50
+ ---
51
+
52
+ # Dataset Information:
53
+
54
+ This dataset is made from a human RP forum source, trimmed down, augmented, and reformatted to fit.
55
+ <br> \- Each entry has a minimum of 6 turns.
56
+ <br> \- The number of unique/main characters ranges from 2 to 7 per entry.
57
+ <br> \- Each conversation is kept as is to preserve quality and uniqueness of the human data.
58
+ <br> \- Only the added system prompt makes use of the current character sheets given.
59
+
60
+ Below is the current Character Card / Sheet format, which is augmented from the messy and non-uniform character sheets available. For best results, reformat your character data to match the one shown below, or as close to it as you can.
61
+ ```
62
+ - **Character Name**:
63
+ - **Age**:
64
+ - **Race**:
65
+ - **Mageblood Type**: (if applicable)
66
+ - **Favored Magic Class**: (if applicable)
67
+ - **Previous Magic Training**: (if applicable)
68
+ - **Occupation/Profession**: (if applicable)
69
+ - **Appearance**: (if applicable)
70
+ - **Biography**: (if applicable)
71
+ - **Good Attributes**: (if applicable)
72
+ - **Bad Attributes**: (if applicable)
73
+ - **Equipment**: (if applicable)
74
+ - **Other Information**: (if applicable)
75
+ ```
76
+
77
+ Here is an example based on the above format:
78
+
79
+ ```
80
+ **Character Name**: Keri Wolf
81
+ **Age**: 21
82
+ **Race**: Vampire
83
+ **Mageblood Type**: Hydromancy
84
+ **Favored Magic Class**: Aqua
85
+ **Previous Magic Training**: Novice
86
+ **Occupation/Profession**: None specified
87
+
88
+ **Appearance**:
89
+ - Height: 5'9"
90
+ - A wooden wolf necklace around her neck, contrasting with her pale skin
91
+ - Three swords strapped to her waist
92
+ - A tattoo of a thorn vine, her family crest, on her right arm
93
+ - Normal eye color is red but changes based on her mood or the topic of conversation
94
+ - Carries a hunk of wood and a carving knife for personal activities
95
+
96
+ **Biography**:
97
+ Keri Wolf grew up in a family of adopted siblings in Djarkel. She had a normal childhood, with her best friend Satori, and was taught basic self-defense by her father. Her brothers were considered troublemakers but remained close to her. On her 21st birthday, her family was slaughtered by a vampire nest, and she was bitten. This led to her developing vampiric traits and seeking answers at the college.
98
+
99
+ **Good Attributes**:
100
+ - Easy-going
101
+ - Observant
102
+ - Helps those in trouble
103
+ - Soft-hearted
104
+ - Kind
105
+ - Cool-headed
106
+ - Good at getting out of difficult situations
107
+ - Avoids violence
108
+ - Gets along well with different people
109
+ - Loves animals
110
+
111
+ **Bad Attributes**:
112
+ - Sunlight sensitivity
113
+ - Hatred towards vampires outside the college
114
+ - Keeps feelings in check, leading to dangerous outbursts
115
+ - Cruel manner of speaking
116
+ - Thirst for revenge
117
+
118
+ **Equipment**:
119
+ - Wooden wolf necklace
120
+ - Three swords (one engraved with a rose, one engraved with her father's name, and one for decoration)
121
+ - Carving knife
122
+ - Hunk of wood
123
+ - Stealth Ring
124
+ - Knight's Shield
125
+
126
+ **Other Information**:
127
+ - Secret word: rebirth
128
+ ```
129
+
130
+ The following system prompt is augmented from available character sheets, or details from the original dataset. Placeholder names are given as shown.
131
+
132
+ ```
133
+ You are involved in a multi-character internet-style roleplaying session with a human user, who is playing as Ballbuster Steve. Do not generate dialogue for the user's character, Ballbuster Steve. Focus on the other characters.
134
+
135
+ [Human User]
136
+ Ballbuster Steve # {user}
137
+ Character Bio: [Steve's bio]
138
+
139
+ [Involved Characters]
140
+ Altair "Arty" Enzo # {char1}
141
+ Character Bio: [Arty's bio]
142
+ ---
143
+ Sukuna Gojo # {char2}
144
+ Character Bio: [Sukuna's bio]
145
+ ---
146
+
147
+ The roleplay begins now.
148
+ ```
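
A minimal sketch of filling in the system prompt template above from code. It assumes the `# {user}` / `# {char1}` markers in the template are annotations rather than literal prompt text, so they are omitted; `build_system_prompt` and its parameters are hypothetical names introduced for illustration.

```python
def build_system_prompt(user_name, user_bio, characters):
    """Fill in the system prompt template.

    characters: list of (name, bio) tuples for the characters the model plays.
    Assumption: the '# {user}' style markers in the template are annotations only.
    """
    lines = [
        "You are involved in a multi-character internet-style roleplaying session "
        f"with a human user, who is playing as {user_name}. "
        f"Do not generate dialogue for the user's character, {user_name}. "
        "Focus on the other characters.",
        "",
        "[Human User]",
        user_name,
        f"Character Bio: {user_bio}",
        "",
        "[Involved Characters]",
    ]
    # Each involved character is followed by a '---' separator, as in the template.
    for name, bio in characters:
        lines += [name, f"Character Bio: {bio}", "---"]
    lines += ["", "The roleplay begins now."]
    return "\n".join(lines)
```

Each bio would ideally be a full character sheet in the format described earlier.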
149
+
150
+ This is what some example turns look like; the newlines between turns are only for readability.
151
+
152
+ ```
153
+ [INST]user
154
+ Ballbuster Steve: Being the doorman at a nightclub, especially one as popular as LUSH... [/INST]
155
+
156
+ [INST]model
157
+ Altair "Arty" Enzo: While he was waiting for Jake to answer, Arty noticed from the corner of his eye... [/INST]
158
+
159
+ [INST]model
160
+ Sukuna Gojo: Nick was now out of his element; he just came off his portable radio app... [/INST]
161
+
162
+ [INST]user
163
+ Ballbuster Steve: Steve grabbed his black clutch from where it was stashed under the mixing desk... [/INST]
164
+ ```
165
+
166
+ To make it easier, this is how I'd format responses for the backend:
167
+
168
+ ```
169
+ <s>[INST]system
170
+ {system_prompt}[/INST]
171
+ [INST]user
172
+ {user}: {text}[/INST]
173
+ [INST]model
174
+ {char1}: {text}[/INST]
175
+ [INST]model
176
+ {char2}: {text}[/INST]
177
+ [INST]user
178
+ {user}: {text}[/INST]
179
+ [INST]model
180
+ {char1}: {text}[/INST]<|im_end|> # For Final Turn only. Alternatively, set <|im_end|> as a stopping string.
181
+ ```
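
The backend layout above could be assembled along these lines. This is a sketch under stated assumptions: `build_prompt` is a hypothetical helper, turns are joined with a single newline as in the layout, and the optional `next_speaker` cue (an open `[INST]model` block ending in the character's name) is one guess at how to steer which character replies; it is not documented by this card. Many tokenizers add the `<s>` BOS token themselves, in which case the literal `<s>` here may need to be dropped.

```python
def build_prompt(system_prompt, turns, next_speaker=None):
    """Assemble a multi-character prompt in the [INST] layout.

    turns: list of (role, speaker, text), with role either "user" or "model".
    next_speaker: optional character name to cue the next reply (assumption).
    """
    parts = [f"<s>[INST]system\n{system_prompt}[/INST]"]
    for role, speaker, text in turns:
        if role not in ("user", "model"):
            raise ValueError(f"unknown role: {role!r}")
        # Roles may repeat back to back; no user/model alternation is enforced.
        parts.append(f"[INST]{role}\n{speaker}: {text}[/INST]")
    prompt = "\n".join(parts)
    if next_speaker is not None:
        # Leave the block open so the model continues as this character.
        prompt += f"\n[INST]model\n{next_speaker}:"
    return prompt
```

For example, `build_prompt(system_prompt, [("user", "Ballbuster Steve", "..."), ("model", "Sukuna Gojo", "...")], next_speaker='Altair "Arty" Enzo')` asks for Arty's reply after two earlier turns.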
182
+
183
+ ---
184
+
185
+ # Current Issues:
186
+ ```
187
+ - Impersonation - This is a common side-effect of pure human roleplaying data, unfortunately.
188
+ Users do like writing the actions of others, though this is more limited to end of reply.
189
+ - Varied Output Quality - A swipe should be enough?
190
+ I only removed obviously bad entries. Output quality varies thanks to the variety of human users involved.
191
+ - Random OOC / Story Break moments may still exist despite me filtering the data.
192
+ - Limited Dataset Size -> 4K Varied Samples ranging from 2-7 characters per entry. I'm looking to expand.
193
+ - Limited System Prompt? -> I'm trying to improve on this.
194
+ - Fantasy-bias? -> Most of the entries are fantasy-based after all.
195
+ ```
196
+
197
+
198
+
199
+
200
+
201
+ ---
202
+
203
+ # Training Metrics:
204
+ ```
205
+ n_sample: 4000
206
+ n_gpu: 2
207
+ global batch size: 12
208
+ lora: bnb_8bit
209
+ no. epochs: 3
210
+ lr: 0.000004
211
+ lr_scheduler: cosine
212
+ deepspeed: zero2
213
+ ```
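
For a sense of scale, the figures above imply roughly a thousand optimizer steps. The sketch below works this out and evaluates a plain cosine schedule. It assumes no sample packing, that the last partial batch is kept, and a cosine decay to zero with no warmup; none of these details are stated in the card.

```python
import math

n_sample = 4000
n_gpu = 2
global_batch_size = 12
epochs = 3
peak_lr = 0.000004

# Samples per GPU per optimizer step (includes any gradient accumulation).
per_gpu_batch = global_batch_size // n_gpu

# Assumption: the final partial batch is kept, hence the ceiling.
steps_per_epoch = math.ceil(n_sample / global_batch_size)
total_steps = steps_per_epoch * epochs

def cosine_lr(step):
    """Cosine decay from peak_lr to 0 over total_steps (assumes no warmup)."""
    progress = min(step, total_steps) / total_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(per_gpu_batch, steps_per_epoch, total_steps)
```

Under these assumptions each GPU sees 6 samples per step, and the learning rate is half its peak at the midpoint of training.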