|
---
license: apache-2.0
datasets:
- openerotica/Lamia
tags:
- NSFW
- Porn
- Ecommerce
- Roleplay
- Summarization
---
|
|
|
This is a combination of the pruned erotica-analysis data, freedom-rp, and a subset of Airoboros.
|
|
|
The following categories were taken from the Airoboros dataset and added to my own Lamia dataset:
|
"roleplay", "unalignment", "editor", "writing", "detailed_writing", "stylized_response", "unalign", "cot", "song" |
|
|
|
I'm hoping this can improve the model's narrative/storywriting ability, logic, and intelligence, while reducing any inherent ethical "alignment" that may be present in the base Mistral model from pretraining on ChatGPT-generated data.
|
|
|
The format is ChatML, and the base model is Yarn Mistral, which increases the context size to a true 16k+ tokens rather than relying on sliding window attention.
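Since the model expects ChatML-formatted prompts, a minimal sketch of how to assemble one is shown below. The `format_chatml` helper is hypothetical (not shipped with the model); it simply wraps a system prompt and user turn in the standard ChatML delimiters and leaves the assistant turn open for generation.

```python
def format_chatml(system: str, user: str) -> str:
    """Wrap a system prompt and a user message in ChatML delimiters,
    leaving an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = format_chatml(
    "You are a creative storyteller.",
    "Write a short opening scene for a mystery novel.",
)
print(prompt)
```

Generation should be stopped on the `<|im_end|>` token so the model's reply ends cleanly at the close of the assistant turn.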