New model version trained with 4096 context length

Files changed (2)

Psyfighter2-13B-vore.Q4_K_M.gguf +2 -2
README.md +33 -26

Psyfighter2-13B-vore.Q4_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a1d9536c7245271758e7e846ba4e0d6b3061d295e0ca2777c04318b719ab8de
-size 7865956352

 version https://git-lfs.github.com/spec/v1
+oid sha256:917bc2644119397263ae0c3ead67a549c5fff31729c82928c99f75ba42370c4e
+size 7865956512
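
The updated pointer records the SHA-256 digest and byte size of the new GGUF file. A minimal sketch of checking a downloaded copy against those two fields could look like the following. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative assumptions, not part of Git LFS or any Hugging Face tooling.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 named in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    expected_size = int(fields["size"])
    algo, _, expected_digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    # Cheap check first: a size mismatch means the download is incomplete or stale.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in chunks so a multi-gigabyte model file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


# Pointer contents copied from the new side of this diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:917bc2644119397263ae0c3ead67a549c5fff31729c82928c99f75ba42370c4e
size 7865956512
"""
```

Calling `matches_pointer("Psyfighter2-13B-vore.Q4_K_M.gguf", NEW_POINTER)` on a local download would then tell an up-to-date file apart from the previous upload, which differs in both digest and size.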

README.md CHANGED

@@ -7,12 +7,8 @@ inference: false
 tags:
   - storywriting
   - finetuned
-  - roleplay
-  - vore
   - not-for-all-audiences
   - gguf
-  - nsfw
-  - uncensored
 base_model: SnakyMcSnekFace/Psyfighter2-13B-vore
 model_type: llama
 prompt_template: >
@@ -32,7 +28,7 @@ prompt_template: >
 This is a quantized version of [SnakyMcSnekFace/Psyfighter2-13B-vore](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore) model.
-This model is a version of [KoboldAI/LLaMA2-13B-Psyfighter2](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2) finetuned to better understand vore context. The primary purpose of this model is to be a storywriting assistant, as well as a conversational model in a chat.
 The Adventure Mode is still work in progress, and will be added later.
@@ -60,17 +56,24 @@ The easiest way to try out the model is [Koboldcpp Colab Notebook](https://colab
 - Paste the model URL into the field: `https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q4_K_M.gguf`
 - Start the notebook, wait for the URL to CloudFlare tunnel to appear at the bottom and click it
 - Use the model as a writing assistant
-- You can try an adventure from [https://aetherroom.club/](https://aetherroom.club/), but keep in mind that the model will not let you take turn unless you stop it. Adventure mode is work-in-progress.
-### Faraday
-Another convenient way to use the model is [Faraday.dev](https://faraday.dev/) application, which allows you to run the model locally on your computer. You'll need a graphics card with at least 8GB VRAM to use `Q4_K_M` version comfortably, and 16GB VRAM for `Q8_0`. (`Q4_K_M` version is smaller and faster, `Q8_0` is slower but more coherent.)
-Download the [Psyfighter2-13B-vore.Q4_K_M.gguf](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q4_K_M.gguf) or [Psyfighter2-13B-vore.Q8_0.gguf](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q8_0.gguf) file into `%appdata%\faraday\models` folder on your computer. The model should appear in `Manage Models` menu under `Downloaded Models`. You can then select it in your character card or set it as a default model.
-### Others
-TBD
 ## Bias, Risks, and Limitations
@@ -78,27 +81,31 @@ By design, this model has a strong vorny bias. It's not intended for use by anyo
 ## Training Details
-This model was fine-tuned on free-form text comprised of stories focused around the vore theme using the [QLoRA method](https://arxiv.org/abs/2305.14314). The resulting adapter was merged into the base model. The quantized version of the model was prepared using [llama.cpp](https://github.com/ggerganov/llama.cpp).
 ### Training Procedure
-The model was fine-tuned using the [QLoRA method](https://arxiv.org/abs/2305.14314) on NVIDIA GeForce RTX 4060 Ti over the span of ~7 days. Training was performed using [text-generation-webui by oobabooga](https://github.com/oobabooga/text-generation-webui) with [Training PRO plug-in by FartyPants](https://github.com/FartyPants/Training_PRO).
-LoRa adapter configuration:
-- Rank: 512
-- Alpha: 1024
-- Dropout rate: 0.05
-- Target weights: v_prog, q_proj
-Training parameters:
-- Sample size: 768 tokens
-- Samples per epoch: 47420
 - Number of epochs: 2
-- First epoch: Learning rate = 3e-4, 1000 steps warmup, cosine schedule
-- Second epoch: Learning rate = 1e-4, 256 steps warmup, inverse sqrt schedule
 #### Preprocessing
@@ -106,7 +113,7 @@ The stories in dataset were pre-processed as follows:
 - titles, foreword, tags, and anything not comprising the text of the story was removed
 - non-ascii characters and character sequences serving as chapter separators were removed
-- any story mentioning underage personas was taken out of the dataset
 - names of private characters were replaced with randomized names across the dataset
 ## Environmental Impact
@@ -114,7 +121,7 @@ The stories in dataset were pre-processed as follows:
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 - **Hardware Type:** NVIDIA GeForce RTX 4060 Ti
-- **Hours used:** 168
 - **Cloud Provider:** N/A
 - **Compute Region:** US-East
-- **Carbon Emitted:** 5.8 kg CO2 eq.

 tags:
   - storywriting
   - finetuned
   - not-for-all-audiences
   - gguf
 base_model: SnakyMcSnekFace/Psyfighter2-13B-vore
 model_type: llama
 prompt_template: >
 This is a quantized version of [SnakyMcSnekFace/Psyfighter2-13B-vore](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore) model.
+This model is a version of [KoboldAI/LLaMA2-13B-Psyfighter2](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2) finetuned to better understand vore context. The primary purpose of this model is to be a storywriting assistant, a conversational model in a chat, and an interactive choose-your-own-adventure text game.
 The Adventure Mode is still work in progress, and will be added later.
 - Paste the model URL into the field: `https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q4_K_M.gguf`
 - Start the notebook, wait for the URL to CloudFlare tunnel to appear at the bottom and click it
 - Use the model as a writing assistant
+- You can try an adventure from [https://aetherroom.club/](https://aetherroom.club/), but keep in mind that the model will not let you take a turn unless you stop it. Adventure mode is still a work in progress, but it's getting there.
+### Backyard AI
+Another convenient way to use the model is the [Backyard AI](https://backyard.ai/) application, which allows you to run the model locally on your computer. You'll need a graphics card with at least 8GB VRAM to use the model comfortably.
+#### Download directly from HuggingFace (beta)
+In the left panel, click `Manage Models`, then select `Hugging face models`. Paste `https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF` into the text field and press `Fetch Models`. Click the `Download` button next to the model format. Once the model is downloaded, you can select it in your character card or set it as a default model.
+#### Download manually
+Download the [Psyfighter2-13B-vore.Q4_K_M.gguf](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q4_K_M.gguf) file into the `%appdata%\faraday\models` folder on your computer. The model should appear in the `Manage Models` menu under `Downloaded Models`. You can then select it in your character card or set it as a default model.
+### Model updates
+- 04/13/2024 - uploaded the first version of the model
+- 05/25/2024 - updated training process, making the model more coherent and improving the writing quality
 ## Bias, Risks, and Limitations
 ## Training Details
+This model was fine-tuned on free-form text composed of stories focused around the vore theme using the [rank-stabilized](https://arxiv.org/abs/2312.03732) [QLoRA method](https://arxiv.org/abs/2305.14314). The resulting adapter was merged into the FP16 precision base model. The quantized version of the model was prepared using [llama.cpp](https://github.com/ggerganov/llama.cpp).
 ### Training Procedure
+The model was fine-tuned with a [rank-stabilized](https://arxiv.org/abs/2312.03732) [QLoRA adapter](https://arxiv.org/abs/2305.14314) on an NVIDIA GeForce RTX 4060 Ti over the span of ~24 hours. Training was performed using the [Unsloth AI](https://github.com/unslothai/unsloth) library on `Ubuntu 22.04.4 LTS` with `CUDA 12.1` and `PyTorch 2.3.0`.
+#### LoRA adapter configuration
+- Rank: 128
+- Alpha: 16
+- Dropout rate: 0.1
+- Target weights: `["q_proj", "k_proj", "o_proj", "gate_proj", "up_proj"]`
+- `use_rslora=True`
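
The `use_rslora=True` setting changes how the rank and alpha values above combine into the adapter's scaling factor. As described in the linked rank-stabilized LoRA paper, the factor is `alpha / sqrt(rank)` instead of the usual `alpha / rank`. The sketch below only illustrates that arithmetic for the listed values. The function name `lora_scale` is hypothetical and this is not the training code used for the model.

```python
import math


def lora_scale(alpha, rank, use_rslora):
    """Scaling factor applied to the low-rank update B @ A."""
    if use_rslora:
        # Rank-stabilized LoRA: divide by sqrt(rank) so the update does not
        # shrink as quickly when the rank grows.
        return alpha / math.sqrt(rank)
    # Standard LoRA scaling.
    return alpha / rank


RANK = 128   # from the adapter configuration above
ALPHA = 16   # from the adapter configuration above

print("standard LoRA scale:", lora_scale(ALPHA, RANK, use_rslora=False))
print("rsLoRA scale:       ", lora_scale(ALPHA, RANK, use_rslora=True))
```

With rank 128 and alpha 16, the rank-stabilized factor is about 1.41, compared with 0.125 under standard scaling.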
+#### Training parameters
+- Max. sequence length: 4096 tokens
+- Samples per epoch: 3783
 - Number of epochs: 2
+- Learning rate: 1e-4
+- Warmup: 64 steps
+- LR Schedule: linear
+- Batch size: 1
+- Gradient accumulation steps: 1
 #### Preprocessing
 - titles, foreword, tags, and anything not comprising the text of the story was removed
 - non-ascii characters and character sequences serving as chapter separators were removed
+- any story mentioning underage personas in any context was removed from the dataset
 - names of private characters were replaced with randomized names across the dataset
 ## Environmental Impact
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 - **Hardware Type:** NVIDIA GeForce RTX 4060 Ti
+- **Hours used:** 24
 - **Cloud Provider:** N/A
 - **Compute Region:** US-East
+- **Carbon Emitted:** 0.83 kg CO2 eq.
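
As a quick consistency check on the figures in this diff, the old and new emission estimates can be compared per hour of training. The sketch below uses only the numbers stated in the two versions of the card and does not reproduce the Machine Learning Impact calculator itself.

```python
# (hours used, kg CO2 eq.) as stated in the old and new README versions.
OLD_RUN = (168, 5.8)
NEW_RUN = (24, 0.83)


def kg_per_hour(run):
    """Average emissions per training hour for a (hours, kg) pair."""
    hours, kg = run
    return kg / hours


old_rate = kg_per_hour(OLD_RUN)
new_rate = kg_per_hour(NEW_RUN)

print(f"old: {old_rate:.4f} kg CO2 eq. per hour")
print(f"new: {new_rate:.4f} kg CO2 eq. per hour")
print(f"hours ratio: {OLD_RUN[0] / NEW_RUN[0]:.1f}x shorter run")
```

Both versions work out to roughly 0.035 kg CO2 eq. per hour, so the drop in the estimate follows from the sevenfold shorter training run on the same hardware and region.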