|
--- |
|
license: apache-2.0 |
|
library_name: peft |
|
tags: |
|
- not-for-all-audiences |
|
--- |
|
|
|
# LimaRP-ShareGPT-13b-loras |
|
|
|
This is a repository of my Llama-2-13b QLora checkpoints based on the LimaRP dataset converted to the ShareGPT format.
|
|
|
## Disclaimer |
|
|
|
This is a **highly experimental** QLora test. If you want to use the LimaRP lora, please [look here instead](https://huggingface.co/lemonilia/limarp-llama2-v2). Lemonilia's Lora uses the Alpaca format.
|
|
|
## Why? |
|
|
|
LimaRP is a high-quality lora with a dataset of human RP examples. However, at a weight of 1.0 it can come on too strong and overwrite a character's personality, so lower weights are required when merging models. We wanted to see what would happen when formatting the dataset using ShareGPT, a format that natively supports turn-based conversations, unlike Alpaca, which requires newline hackery.
|
|
|
In addition, we wanted to see how various system prompts affect the end result of a lora finetune along with the use of character names as roles rather than the standard `USER` and `ASSISTANT`. |
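For illustration, a ShareGPT-style conversation entry with character names as roles might look like the sketch below. The field names (`conversations`, `from`, `value`) follow the common ShareGPT convention, and the sample text is hypothetical; the actual entries produced by the conversion script may differ.

```python
# Hypothetical ShareGPT-style entry using character names as roles
# instead of the standard USER/ASSISTANT labels.
example = {
    "conversations": [
        {"from": "system", "value": "Enter roleplay mode..."},
        {"from": "User", "value": "Hello there!"},
        {"from": "Character", "value": "*waves* What brings you here?"},
    ]
}

# The role of each turn comes from the "from" field, so nothing
# forces a fixed USER/ASSISTANT pair.
roles = [turn["from"] for turn in example["conversations"]]
print(roles)  # ['system', 'User', 'Character']
```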
|
|
|
## Roles |
|
- kingbri: Rewrote dataset creation script, trained all loras, reformatted dataset. |
|
- Alicat: Provided insight on system prompts and dataset formatting guidelines. |
|
- Argo: Provided insight on system prompts. |
|
|
|
## Variance |
|
|
|
For system prompts, please see the appropriate folder READMEs. |
|
|
|
One char = Character's persona in system prompt. |
|
Two char = Character and User's persona in system prompt. |
|
The scenario is always included. |
|
|
|
In addition, the data-preparation script shuffles the dataset before exporting it, which means different portions were used for eval during training. Therefore, each lora used a differently shuffled version of LimaRP's 4k dataset. The randomization of which entries go to eval should not affect the results.
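The shuffle-then-split step described above can be sketched as follows. This is a minimal illustration, not the actual script; the function name, eval fraction, and seed handling are all assumptions.

```python
import random

def split_dataset(entries, eval_fraction=0.1, seed=None):
    """Shuffle a copy of the dataset, then carve off an eval slice.

    A different seed sends different entries to eval, which is the
    randomization the text refers to.
    """
    shuffled = entries[:]  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    eval_count = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[eval_count:], shuffled[:eval_count]

train, eval_set = split_dataset(list(range(100)), eval_fraction=0.1, seed=42)
print(len(train), len(eval_set))  # 90 10
```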
|
|
|
## Notes |
|
|
|
These QLoras were produced as an experiment to see how varying versions of LimaRP can affect a model. Please take this data with a grain of salt. You can test these yourself and draw your own conclusions.
|
|
|
### Architecture |
|
|
|
- **Model Architecture**: Llama-2-13b |
|
- **Training Algorithm**: QLora |
|
|
|
### Training Details |
|
|
|
- **Dataset**: LimaRP formatted with this [script](https://gist.github.com/bdashore3/4c9f3a812c1a68013fdb23e1179c7765) |
|
- **Dataset type**: ShareGPT
|
- **Training Parameters**: [See Here](https://gist.github.com/bdashore3/ab6cd21777a30fb9b131bc7b2f6b8949) |
|
- **Training Environment**: Axolotl |
|
- **sequence_len**: 4096 |
|
|
|
## Instruct Format |
|
|
|
ShareGPT gets converted to the Vicuna format. The roles were character names during training, so there's no fixed `USER` or `ASSISTANT` role. You can probably use the following and it should work:
|
|
|
``` |
|
SYSTEM: Enter roleplay mode... |
|
User: {prompt} |
|
Character: |
|
``` |
|
|
|
Not using instruct mode is also an option, and is preferred.
|
|
|
## Acknowledgments |
|
|
|
Thanks to: |
|
- Lemonilia: Original creator of the LimaRP dataset and Lora |
|
- Axolotl: Finetuning suite |
|
|
|
## Donate? |
|
All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: [https://ko-fi.com/kingbri](https://ko-fi.com/kingbri) |
|
|
|
You should not feel obligated to donate, but if you do, I'd appreciate it. |
|
|
|
## Axolotl stuff |
|
|
|
All Axolotl configuration files are located within each lora's folder.
|
|