kingbri committed

Commit 3918a9c (1 parent: ca997b4)

Update README.md

Files changed (1): README.md (+77, -1)
README.md CHANGED

The previous front matter declared only `license: agpl-3.0`; it is replaced with the new README below:
---
license: apache-2.0
library_name: peft
tags:
- not-for-all-audiences
---

# LimaRP-ShareGPT-13b-loras

This is a repository of my Llama-2-13b QLoRA checkpoints based on the LimaRP dataset converted to the ShareGPT format.
## Disclaimer

This is a **highly experimental** QLoRA test. If you want to use the LimaRP LoRA, please [look here instead](https://huggingface.co/lemonilia/limarp-llama2-v2). Lemonilia's LoRA uses the Alpaca format.
## Why?

LimaRP is a high-quality LoRA with a dataset of human RP examples. However, it can come on too strong and nuke a character's personality at a weight of 1.0, so lower weights are required when merging models. We wanted to see what would happen when formatting the dataset as ShareGPT, a format that supports turn-based conversations, instead of Alpaca, which requires newline hackery.

In addition, we wanted to see how various system prompts affect the end result of a LoRA finetune, along with the use of character names as roles rather than the standard `USER` and `ASSISTANT`.
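For context, this is roughly what a single converted entry looks like; the field names follow the common ShareGPT convention, and the character names and dialogue are invented for illustration, not drawn from the dataset:

```python
# Illustrative ShareGPT-style entry (invented example, not from the actual dataset).
# Roles carry character names instead of the usual "human"/"gpt" or USER/ASSISTANT labels.
example_entry = {
    "conversations": [
        {"from": "system", "value": "Enter roleplay mode. Scenario and personas go here."},
        {"from": "Rin", "value": "\"Hey, over here!\" Rin waves from across the street."},
        {"from": "Sam", "value": "Sam jogs over, slightly out of breath. \"Sorry I'm late.\""},
    ]
}
```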
## Roles

- kingbri: Rewrote the dataset creation script, trained all LoRAs, reformatted the dataset.
- Alicat: Provided insight on system prompts and dataset formatting guidelines.
- Argo: Provided insight on system prompts.
## Variance

For system prompts, please see the appropriate folder READMEs.

- One char = the character's persona in the system prompt.
- Two char = the character's and user's personas in the system prompt.
- The scenario is always included.

In addition, the dataprepare script randomizes the dataset before exporting it, so each LoRA was trained on a differently shuffled version of LimaRP's 4k dataset and different portions were held out for eval. The randomization of which entries go to eval should not affect the results.
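As a rough sketch of what that randomization amounts to (this is not the actual dataprepare script, just an illustration of shuffling entries before carving off an eval slice; the split fraction is made up):

```python
# Illustration only: shuffle the entries, then hold out a small slice for eval.
import random

def shuffle_and_split(entries, eval_fraction=0.05, seed=None):
    rng = random.Random(seed)
    shuffled = list(entries)      # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    eval_count = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[eval_count:], shuffled[:eval_count]   # (train, eval)

train_set, eval_set = shuffle_and_split(range(100))
```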
## Notes

These QLoRAs were produced as an experiment to see how varying versions of LimaRP can affect a model. Please take this data with a grain of salt. You can test these yourself and decide from there.
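As a starting point for testing, here is a minimal loading sketch using `transformers` and `peft`; the base model ID and the adapter folder name are assumptions, so point it at whichever variant folder you actually want to try:

```python
# Minimal sketch: apply one of the QLoRA adapters on top of the base model.
# "meta-llama/Llama-2-13b-hf" and "./one-char" are placeholders, not taken from this repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "meta-llama/Llama-2-13b-hf"   # assumed base model
adapter_path = "./one-char"                   # hypothetical folder for one lora variant

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_path)

prompt = "SYSTEM: Enter roleplay mode...\nUser: Hello there!\nCharacter:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```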
### Architecture

- **Model Architecture**: Llama-2-13b
- **Training Algorithm**: QLoRA
### Training Details

- **Dataset**: LimaRP, formatted with this [script](https://gist.github.com/bdashore3/4c9f3a812c1a68013fdb23e1179c7765)
- **Dataset type**: ShareGPT
- **Training Parameters**: [See here](https://gist.github.com/bdashore3/ab6cd21777a30fb9b131bc7b2f6b8949)
- **Training Environment**: Axolotl
- **sequence_len**: 4096
## Instruct Format

ShareGPT gets converted to the vicuna format. The roles were character names during training, so there is no fixed `USER` or `ASSISTANT` role. You can probably use the following and it should work:

```
SYSTEM: Enter roleplay mode...
User: {prompt}
Character:
```
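As a rough illustration of assembling that prompt in code, a hypothetical helper (the function and the example names are mine, not part of this repo) that swaps real persona names into the role slots:

```python
# Hypothetical helper for building a vicuna-style prompt with character names as roles.
def build_prompt(system: str, user_name: str, char_name: str, user_message: str) -> str:
    return (
        f"SYSTEM: {system}\n"
        f"{user_name}: {user_message}\n"
        f"{char_name}:"
    )

print(build_prompt("Enter roleplay mode...", "Sam", "Rin", "Hello there!"))
```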
Not using instruct mode (preferred) is also an option.
## Acknowledgments

Thanks to:

- Lemonilia: Original creator of the LimaRP dataset and LoRA
- Axolotl: Finetuning suite
## Donate?

All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: [https://ko-fi.com/kingbri](https://ko-fi.com/kingbri)

You should not feel obligated to donate, but if you do, I'd appreciate it.

## Axolotl stuff

All Axolotl files are located within each LoRA's folder.