---
license: apache-2.0
library_name: peft
tags:
- not-for-all-audiences
---
# LimaRP-ShareGPT-13b-loras

This is a repository of my Llama-2-13b Qlora checkpoints based on the LimaRP dataset converted to ShareGPT.
## Disclaimer

This is a highly experimental QLora test. If you want to use the LimaRP lora, please look here instead. Lemonilia's lora uses the Alpaca format.
## Why?

LimaRP is a high-quality lora trained on a dataset of human RP examples. However, it can come on too strong and nuke the personality of a character at a weight of 1.0, so lower weights are required when merging models. We wanted to see what would happen when formatting the dataset with ShareGPT, a format that supports turn-based conversations, instead of Alpaca, which requires newline hackery.
In addition, we wanted to see how various system prompts affect the end result of a lora finetune, along with the use of character names as roles rather than the standard `USER` and `ASSISTANT`. An illustrative entry is sketched below.
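For illustration, here is a rough sketch of what one converted conversation might look like in the common ShareGPT layout with character names as roles. The keys follow the usual `conversations`/`from`/`value` convention, and the names, personas, and dialogue are placeholders, not actual dataset entries.

```python
# Illustrative only: one conversation in the common ShareGPT layout, with the
# persona and scenario in the system turn and character names as the roles.
# The names, personas, and dialogue are placeholders, not actual dataset entries.
example_entry = {
    "conversations": [
        {
            "from": "system",
            "value": "Enter roleplay mode. Aria's persona: ... Scenario: ...",
        },
        {"from": "Traveler", "value": "The tavern door creaks open as I step inside."},
        {"from": "Aria", "value": "Aria glances up from the bar and waves you over."},
    ]
}
```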
## Roles
- kingbri: Rewrote dataset creation script, trained all loras, reformatted dataset.
- Alicat: Provided insight on system prompts and dataset formatting guidelines.
- Argo: Provided insight on system prompts.
## Variance
For system prompts, please see the appropriate folder READMEs.
One char = the character's persona in the system prompt. Two char = both the character's and the user's personas in the system prompt. The scenario is always included.

In addition, the dataprepare script shuffles the dataset before exporting it, so different portions were used for eval during training. Therefore, each lora used a randomized version of LimaRP's 4k dataset. The randomization of which entries go to eval should not affect the results; a sketch of the idea follows.
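As a rough illustration of that step (this is not the actual dataprepare script), shuffling the entries before carving off an eval slice looks roughly like this; the file names and the eval fraction are placeholders.

```python
# Rough sketch of the randomization idea, not the actual dataprepare script.
# File names and the eval fraction are placeholders.
import json
import random

with open("limarp_sharegpt.json") as f:
    entries = json.load(f)

random.shuffle(entries)  # each export sends a different slice of entries to eval
eval_count = int(len(entries) * 0.05)
eval_set, train_set = entries[:eval_count], entries[eval_count:]

with open("train.json", "w") as f:
    json.dump(train_set, f)
with open("eval.json", "w") as f:
    json.dump(eval_set, f)
```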
## Notes

These Qloras were produced as an experiment to see how varying versions of LimaRP can affect a model. Please take this data with a grain of salt. You can test these yourself and decide from there.
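If you want to try one of the checkpoints, a minimal loading sketch with transformers and peft might look like the following. The base model ID and adapter path are assumptions, so point them at the lora folder you actually want to test.

```python
# Minimal sketch for trying one of these adapters with peft; the base model ID and
# adapter path are assumptions -- substitute the lora folder you want to evaluate.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "meta-llama/Llama-2-13b-hf"       # assumed base model identifier
adapter_path = "./one-of-the-lora-folders"  # placeholder path within this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_path)
```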
## Architecture
- Model Architecture: Llama-2-13b
- Training Algorithm: QLora
## Training Details
- Dataset: LimaRP formatted with this script
- Dataset type: ShareGPT
- Training Parameters: See Here
- Training Environment: Axolotl
- sequence_len: 4096
## Instruct Format
ShareGPT gets converted to the Vicuna format. The roles were character names when training, so there's no set role of `USER` or `ASSISTANT`. You can probably use the following and it should work:
```
SYSTEM: Enter roleplay mode...
User: {prompt}
Character:
```
Not using instruct mode (preferred) is also an option.
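If you do want to assemble the prompt yourself, a hedged sketch of building that Vicuna-style string from a running chat history is shown below; the system line, names, and history are placeholders, not anything fixed by the lora.

```python
# Sketch of assembling a Vicuna-style prompt with character names as the roles.
# The system line, names, and history below are placeholders.
def build_prompt(system: str, history: list[tuple[str, str]], char_name: str) -> str:
    lines = [f"SYSTEM: {system}"]
    lines += [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"{char_name}:")  # leave the character's turn open for the model
    return "\n".join(lines)

prompt = build_prompt(
    "Enter roleplay mode...",
    [("Traveler", "The tavern door creaks open as I step inside.")],
    "Aria",
)
print(prompt)
```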
## Acknowledgments
Thanks to:
- Lemonilia: Original creator of the LimaRP dataset and Lora
- Axolotl: Finetuning suite
## Donate?
All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: https://ko-fi.com/kingbri
You should not feel obligated to donate, but if you do, I'd appreciate it.
## Axolotl stuff

All Axolotl files are located within each lora folder.