---
license: apache-2.0
library_name: peft
tags:
- not-for-all-audiences
---

# LimaRP-ShareGPT-13b-loras

This is a repository of my Llama-2-13b QLoRA checkpoints trained on the LimaRP dataset converted to ShareGPT format.
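
For reference, here is a minimal sketch of loading one of these adapters on top of the base model with `transformers` and `peft`. The base model id and the adapter folder path are assumptions/placeholders, not fixed names from this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed Llama-2-13b base checkpoint; substitute the base model you actually use.
base = "meta-llama/Llama-2-13b-hf"

model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base)

# Placeholder path: point this at whichever lora folder from this repo you downloaded.
model = PeftModel.from_pretrained(model, "path/to/downloaded-lora-folder")
```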

## Disclaimer

This is a **highly experimental** QLoRA test. If you want to use the LimaRP lora, please [look here instead](https://huggingface.co/lemonilia/limarp-llama2-v2). Lemonilia's lora uses the Alpaca format.

## Why?

LimaRP is a high-quality lora with a dataset of human RP examples. However, it can come on too strong and nuke the personality of a character at a weight of 1.0, so lower weights are required when merging models. We wanted to see what would happen when formatting the dataset with ShareGPT, a format that natively supports turn-based conversations, instead of Alpaca, which requires newline hackery.

In addition, we wanted to see how various system prompts affect the end result of a lora finetune, along with the use of character names as roles rather than the standard `USER` and `ASSISTANT`.
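
For context, a single ShareGPT-style entry looks roughly like the sketch below, with character names used directly as roles. The keys follow the common ShareGPT convention; the exact output of the conversion script may differ, and the names and values shown are placeholders.

```python
# Illustrative ShareGPT-style entry with character names as roles.
# Keys follow the common ShareGPT convention; the conversion script's exact
# output may differ, and the names/values here are placeholders.
example_entry = {
    "conversations": [
        {"from": "system", "value": "Enter roleplay mode. ..."},
        {"from": "Alice", "value": "Hello! *waves*"},
        {"from": "Bob", "value": "*smiles* Hi, Alice."},
    ]
}
```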

## Roles

- kingbri: Rewrote dataset creation script, trained all loras, reformatted dataset.
- Alicat: Provided insight on system prompts and dataset formatting guidelines.
- Argo: Provided insight on system prompts.

## Variance

For system prompts, please see the appropriate folder READMEs.

- One char = the character's persona is in the system prompt.
- Two char = both the character's and the user's personas are in the system prompt.
- The scenario is always included.

In addition, the dataprepare script shuffles the dataset before exporting it, so different portions were used for eval during training. Each lora therefore used a randomized version of LimaRP's 4k dataset. Which entries end up in the eval split should not affect the results.
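
A minimal sketch of that shuffle-and-split step is below; the function and file handling are hypothetical, and the dataprepare script linked under Training Details is the source of truth.

```python
import json
import random

# Hypothetical sketch of the dataprepare shuffle + eval split.
def shuffle_and_split(path, eval_fraction=0.05, seed=None):
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)  # list of ShareGPT-style conversation entries

    random.Random(seed).shuffle(entries)  # randomize which entries land in eval

    cut = int(len(entries) * (1 - eval_fraction))
    return entries[:cut], entries[cut:]  # (train, eval)
```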

## Notes

These QLoras were produced as an experiment to see how varying versions of LimaRP affect a model. Please take this data with a grain of salt. You can test them yourself and decide from there.

### Architecture

- **Model Architecture**: Llama-2-13b
- **Training Algorithm**: QLoRA

### Training Details

- **Dataset**: LimaRP formatted with this [script](https://gist.github.com/bdashore3/4c9f3a812c1a68013fdb23e1179c7765)
- **Dataset type**: ShareGPT
- **Training Parameters**: [See Here](https://gist.github.com/bdashore3/ab6cd21777a30fb9b131bc7b2f6b8949)
- **Training Environment**: Axolotl
- **sequence_len**: 4096

## Instruct Format

ShareGPT gets converted to the Vicuna format. The roles were character names during training, so there is no fixed `USER` or `ASSISTANT` role. You can probably use the following and it should work:

```
SYSTEM: Enter roleplay mode...
User: {prompt}
Character:
```

Not using instruct mode (preferred) is also an option.
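
For illustration, assembling such a prompt programmatically could look like the sketch below; the names and the system line are placeholders, not fixed values from training.

```python
# Illustrative prompt assembly with character names as roles.
# All names and the system line are placeholders.
def build_prompt(system, turns, reply_as):
    lines = [f"SYSTEM: {system}"]
    lines += [f"{name}: {text}" for name, text in turns]
    lines.append(f"{reply_as}:")  # leave the last role open for the model to complete
    return "\n".join(lines)

prompt = build_prompt(
    "Enter roleplay mode...",
    [("User", "Hello there!")],
    reply_as="Character",
)
```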

## Acknowledgments

Thanks to:

- Lemonilia: Original creator of the LimaRP dataset and Lora
- Axolotl: Finetuning suite

## Donate?

All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: [https://ko-fi.com/kingbri](https://ko-fi.com/kingbri)

You should not feel obligated to donate, but if you do, I'd appreciate it.

## Axolotl stuff

All Axolotl-related files are located within each lora folder.