---
license: apache-2.0
library_name: peft
tags:
- not-for-all-audiences
---

# LimaRP-ShareGPT-13b-loras

This is a repository of my Llama-2-13b QLoRA checkpoints trained on the LimaRP dataset converted to ShareGPT format.

## Disclaimer

This is a **highly experimental** QLoRA test. If you want to use the LimaRP lora, please [look here instead](https://huggingface.co/lemonilia/limarp-llama2-v2). Lemonilia's lora uses the Alpaca format.

## Why?

LimaRP is a high-quality lora trained on a dataset of human RP examples. However, at a weight of 1.0 it can come on too strong and nuke the personality of a character, so lower weights are required when merging models. We wanted to see what would happen when formatting the dataset using ShareGPT, a format that natively supports turn-based conversations, instead of Alpaca, which requires newline hackery.

In addition, we wanted to see how various system prompts affect the end result of a lora finetune, along with the use of character names as roles rather than the standard `USER` and `ASSISTANT`.

## Roles
- kingbri: Rewrote the dataset creation script, trained all loras, reformatted the dataset.
- Alicat: Provided insight on system prompts and dataset formatting guidelines.
- Argo: Provided insight on system prompts.

## Variance

For system prompts, please see the appropriate folder READMEs.

- One char = Character's persona in the system prompt.
- Two char = Character's and User's personas in the system prompt.

The scenario is always included.

In addition, the dataprepare script shuffles the dataset before exporting it, so each lora was trained on a differently shuffled version of LimaRP's 4k dataset, and a different portion of it was held out for eval during training. Which entries end up in the eval split should not affect the results.
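The shuffle-then-split step described above can be sketched roughly as follows. The function name, `eval_fraction` parameter, and split size are illustrative assumptions; the actual dataprepare script may differ.

```python
import random

def prepare_dataset(entries, eval_fraction=0.1, seed=None):
    """Shuffle the dataset, then carve off a portion for eval.

    Hypothetical sketch: names and defaults are not taken from the
    actual dataprepare script.
    """
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    # Everything past the eval slice becomes training data
    return shuffled[n_eval:], shuffled[:n_eval]

# Toy usage with placeholder entries
data = [{"id": i} for i in range(100)]
train, eval_set = prepare_dataset(data, eval_fraction=0.1, seed=42)
```

Because a fresh shuffle is taken each run, two loras trained from separate exports will generally see different train/eval partitions.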

## Notes

These QLoRAs were produced as an experiment to see how varying versions of LimaRP affect a model. Please take this data with a grain of salt. You can test these yourself and decide from there.

### Architecture

- **Model Architecture**: Llama-2-13b
- **Training Algorithm**: QLoRA

### Training Details

- **Dataset**: LimaRP formatted with this [script](https://gist.github.com/bdashore3/4c9f3a812c1a68013fdb23e1179c7765)
- **Dataset type**: ShareGPT
- **Training Parameters**: [See Here](https://gist.github.com/bdashore3/ab6cd21777a30fb9b131bc7b2f6b8949)
- **Training Environment**: Axolotl
- **sequence_len**: 4096
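For reference, a ShareGPT-style entry with character names as roles might look like the dictionary below. The `"conversations"`/`"from"`/`"value"` keys follow the common ShareGPT convention; the names and message text here are made up, not copied from the actual dataset.

```python
# Illustrative ShareGPT-style entry (hypothetical content).
# Roles are character names rather than USER/ASSISTANT.
example_entry = {
    "conversations": [
        {"from": "system", "value": "Enter roleplay mode..."},
        {"from": "Anon", "value": "Hi there!"},
        {"from": "Charlotte", "value": "*waves* Hello!"},
    ]
}
```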

## Instruct Format

ShareGPT gets converted to the vicuna format. The roles were character names during training, so there is no fixed `USER` or `ASSISTANT` role. The following should probably work:

```
SYSTEM: Enter roleplay mode...
User: {prompt}
Character:
```

Not using instruct mode (preferred) is also an option.
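The ShareGPT-to-vicuna conversion described above can be sketched as a simple string flattening. The key names and role handling here are assumptions based on common ShareGPT conventions, not the exact training pipeline.

```python
def sharegpt_to_vicuna(entry, next_speaker):
    """Flatten a ShareGPT-style conversation into a vicuna-style prompt.

    Hypothetical sketch: assumes "conversations"/"from"/"value" keys
    and a single system turn, per common ShareGPT convention.
    """
    lines = []
    for turn in entry["conversations"]:
        if turn["from"] == "system":
            lines.append(f"SYSTEM: {turn['value']}")
        else:
            lines.append(f"{turn['from']}: {turn['value']}")
    # Leave the next speaker's turn open for the model to complete
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)

entry = {
    "conversations": [
        {"from": "system", "value": "Enter roleplay mode..."},
        {"from": "User", "value": "Hello!"},
    ]
}
prompt = sharegpt_to_vicuna(entry, "Character")
```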

## Acknowledgments

Thanks to:
- Lemonilia: Original creator of the LimaRP dataset and Lora
- Axolotl: Finetuning suite

## Donate?
All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: [https://ko-fi.com/kingbri](https://ko-fi.com/kingbri)

You should not feel obligated to donate, but if you do, I'd appreciate it.

## Axolotl stuff

All Axolotl configuration files are located within each lora's folder.