---
license: apache-2.0
library_name: peft
tags:
  - not-for-all-audiences
---

LimaRP-ShareGPT-13b-loras

This is a repository of my Llama-2-13b QLoRA checkpoints, based on the LimaRP dataset converted to the ShareGPT format.

Disclaimer

This is a highly experimental QLoRA test. If you want to use the original LimaRP lora, please look here instead. Lemonilia's lora uses the Alpaca format.

Why?

LimaRP is a high-quality lora built on a dataset of human RP examples. However, at a weight of 1.0 it can come on too strong and nuke the personality of a character, so lower weights are required when merging models. We wanted to see what would happen when formatting the dataset as ShareGPT, a format that natively supports turn-based conversations, instead of Alpaca, which requires newline hackery.

In addition, we wanted to see how various system prompts affect the end result of a lora finetune, along with the effect of using character names as roles rather than the standard USER and ASSISTANT.
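To make the formatting difference concrete, here is a rough sketch of what a single ShareGPT-style conversation entry with character names as roles could look like. The key names follow the common ShareGPT convention, and the persona, scenario, and messages are invented for illustration rather than taken from the actual dataset.

```python
# Minimal sketch of a ShareGPT-style entry where character names are used as
# roles. The "conversations"/"from"/"value" keys follow the usual ShareGPT
# convention; the content below is invented for illustration.
example_entry = {
    "conversations": [
        {
            "from": "system",
            "value": (
                "Enter roleplay mode. You must act as Aria, a wandering bard. "
                "Scenario: Aria meets a traveler at a roadside inn."
            ),
        },
        {"from": "Traveler", "value": "Mind if I sit here?"},
        {"from": "Aria", "value": "*gestures at the empty bench* Please, go ahead."},
    ]
}
```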

Roles

  • kingbri: Rewrote dataset creation script, trained all loras, reformatted dataset.
  • Alicat: Provided insight on system prompts and dataset formatting guidelines.
  • Argo: Provided insight on system prompts.

Variance

For system prompts, please see the appropriate folder READMEs.

One char = only the character's persona in the system prompt. Two char = both the character's and the user's personas in the system prompt. The scenario is always included.
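As an illustration only (the real prompt wording for each lora is documented in the folder READMEs), here is a sketch of how a one-char versus two-char system prompt might be assembled:

```python
from typing import Optional

def build_system_prompt(char_persona: str, scenario: str,
                        user_persona: Optional[str] = None) -> str:
    """Illustrative sketch: 'one char' includes only the character's persona,
    'two char' also includes the user's persona. The scenario is always added."""
    parts = [char_persona]
    if user_persona is not None:  # the "two char" variant
        parts.append(user_persona)
    parts.append(f"Scenario: {scenario}")
    return "\n".join(parts)
```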

In addition, the dataprepare script shuffles the dataset before exporting it, so each lora was trained on a differently shuffled version of LimaRP's 4k dataset and the eval split contained different entries each run. The randomization of which entries go to eval should not affect the results.
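The actual dataprepare script is the one linked under Training Details; the snippet below only sketches the shuffle-then-split behaviour described above, with a made-up eval fraction.

```python
import random

def shuffle_and_split(entries, eval_fraction=0.05, seed=None):
    """Shuffle the dataset, then carve the tail off as the eval split.
    Because the shuffle differs between runs, each lora ends up with a
    different eval set, which should not meaningfully affect results."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[:-n_eval], shuffled[-n_eval:]  # (train, eval)
```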

Notes

These QLoRAs were produced as an experiment to see how different versions of LimaRP can affect a model. Please take this data with a grain of salt; you can test the loras yourself and draw your own conclusions.

Architecture

  • Model Architecture: Llama-2-13b
  • Training Algorithm: QLoRA

Training Details

  • Dataset: LimaRP formatted with this script
  • Dataset type: ShareGPT
  • Training Parameters: See Here
  • Training Environment: Axolotl (see the config sketch after this list)
  • sequence_len: 4096
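For orientation, here is a minimal sketch of what an Axolotl QLoRA config along these lines could look like, written as a Python dict dumped to YAML. Only the QLoRA adapter, the ShareGPT dataset type, and sequence_len 4096 come from this card; the base model name, dataset path, and LoRA hyperparameters below are assumptions, so refer to the linked training parameters for the real values.

```python
import yaml  # PyYAML

# Sketch of an Axolotl-style config; every value marked "assumed" is not
# taken from this card and only stands in for the real training parameters.
config = {
    "base_model": "meta-llama/Llama-2-13b-hf",  # assumed checkpoint name
    "adapter": "qlora",
    "load_in_4bit": True,
    "sequence_len": 4096,
    "datasets": [
        {"path": "limarp_sharegpt.json", "type": "sharegpt"},  # illustrative path
    ],
    "lora_r": 32,          # assumed
    "lora_alpha": 16,      # assumed
    "lora_dropout": 0.05,  # assumed
}

print(yaml.safe_dump(config, sort_keys=False))
```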

Instruct Format

ShareGPT gets converted to the Vicuna format. The roles were character names during training, so there's no fixed USER or ASSISTANT role. You can probably use the following and it should work:

SYSTEM: Enter roleplay mode...
User: {prompt}
Character:

Not using instruct mode (preferred) is also an option.
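If you do use instruct mode, a prompt in that layout can be assembled as sketched below; the system line and names are just the placeholders from the example above.

```python
def format_prompt(system: str, user_name: str, char_name: str, message: str) -> str:
    """Assemble a Vicuna-style prompt where the roles are names rather than
    the fixed USER/ASSISTANT labels."""
    return f"SYSTEM: {system}\n{user_name}: {message}\n{char_name}:"

print(format_prompt("Enter roleplay mode...", "User", "Character", "Hello there."))
```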

Acknowledgments

Thanks to:

  • Lemonilia: Original creator of the LimaRP dataset and Lora
  • Axolotl: Finetuning suite

Donate?

All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: https://ko-fi.com/kingbri

You should not feel obligated to donate, but if you do, I'd appreciate it.

Axolotl stuff

All Axolotl stuff is located within each lora folder.