SnakyMcSnekFace committed on
Commit
f7e3c7d
1 Parent(s): 2b9780a

Update the model weights

README.md CHANGED
@@ -28,52 +28,20 @@ prompt_template: >
 
 This model is a version of [KoboldAI/LLaMA2-13B-Psyfighter2](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2) finetuned to better understand vore context. The primary purpose of this model is to be a storywriting assistant, a conversational model in a chat, and an interactive choose-your-own-adventure text game.
 
- The Adventure Mode is still work in progress, and will be the focus of the future updates.
 
- This is the FP16-precision version of the model for merging and fine-tuning. For using the model, download the quantized version here instead: [SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF)
 
 ## Model Details
 
- ### Model Description
-
 The model behaves similarly to `KoboldAI/LLaMA2-13B-Psyfighter2`, which it was derived from. Please [see the README.md here](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2/blob/main/README.md) to learn more.
 
- This model was fine-tuned on ~55 MiB of free-form text, containing stories focused around the vore theme. As a result, it has a strong vorny bias.
-
- ## How to Get Started with the Model
-
- The model can be used with any AI chatbots and front-ends designed to work with `.gguf` models. The model fits fully into 8GB VRAM, but can also run with degraded performance on smaller graphics cards.
-
- Similarly to the base model, the less prompt the model receives, the more creative is the output. For example, the writing assistant will generate an entire story when prompted with only 2-3 words.
-
- In the chat mode, if the conversation is not going where you would like it to go, edit the model's output and let it continue generation. The model will also match the style of the conversation.
-
- ### Koboldcpp Colab Notebook
-
- The easiest way to try out the model is [Koboldcpp Colab Notebook](https://colab.research.google.com/github/lostruins/koboldcpp/blob/concedo/colab.ipynb). This method doesn't require you to have a powerful graphics card.
-
- - Open the notebook
- - Paste the model URL into the field: `https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q4_K_M.gguf`
- - Start the notebook, wait for the URL to CloudFlare tunnel to appear at the bottom and click it
- - Use the model as a writing assistant
- - Try an adventure from [https://aetherroom.club/](https://aetherroom.club/). Adventure mode is still work-in-progress, but it's getting there.
-
- ### Backyard AI
-
- Another convenient way to use the model is [Backyard AI](https://backyard.ai/) application, which allows you to run the model locally on your computer. You'll need a graphics card with at least 8GB VRAM to use the model comfortably.
-
- #### Download directly from HuggingFace (beta)
-
- In the left panel, click `Manage Models`, then select `Hugging face models`. Paste `https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF` into the text field and press `Fetch Models`. Click `Download` button to the next to the model format. Once the model is downloaded, you can select it in your character card or set it as a default model.
-
- #### Download manually
 
- Download the [Psyfighter2-13B-vore.Q4_K_M.gguf](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF/resolve/main/Psyfighter2-13B-vore.Q4_K_M.gguf) file into `%appdata%\faraday\models` folder on your computer. The model should appear in `Manage Models` menu under `Downloaded Models`. You can then select it in your character card or set it as a default model.
-
- ### Model updates
-
- - 04/13/2024 - uploaded the first version of the model
 - 05/25/2024 - updated training process, making the model more coherent and improving the writing quality
 
 ## Bias, Risks, and Limitations
 
@@ -81,14 +49,15 @@ By design, this model has a strong vorny bias. It's not intended for use by anyo
 
 ## Training Details
 
- This model was fine-tuned on free-form text comprised of stories focused around the vore theme using a [rank-stabilized](https://arxiv.org/abs/2312.03732) [QLoRA adapter](https://arxiv.org/abs/2305.14314). The resulting adapter was merged into the FP16 precision base model. The quantized version of the model was prepared using [llama.cpp](https://github.com/ggerganov/llama.cpp).
 
- ### Training Procedure
 
- The model was fine-tuned with a [rank-stabilized](https://arxiv.org/abs/2312.03732) [QLoRA adapter](https://arxiv.org/abs/2305.14314) on NVIDIA GeForce RTX 4060 Ti over the span of ~24 hours. Training was performed using [Unsloth AI](https://github.com/unslothai/unsloth) library on `Ubuntu 22.04.4 LTS` with `CUDA 12.1` and `Pytorch 2.3.0`.
 
 
- #### LoRa adapter configuration
 
 - Rank: 128
 - Alpha: 16
@@ -96,10 +65,24 @@ The model was fine-tuned with a [rank-stabilized](https://arxiv.org/abs/2312.037
 - Target weights: `["q_proj", "k_proj", "o_proj", "gate_proj", "up_proj"]`
 - `use_rslora=True`
 
 #### Training parameters
 
 - Max. sequence length: 4096 tokens
- - Samples per epoch: 3783
 - Number of epochs: 2
 - Learning rate: 1e-4
 - Warmup: 64 steps
@@ -107,21 +90,11 @@ The model was fine-tuned with a [rank-stabilized](https://arxiv.org/abs/2312.037
 - Batch size: 1
 - Gradient accumulation steps: 1
 
- #### Preprocessing
-
- The stories in dataset were pre-processed as follows:
 
- - titles, foreword, tags, and anything not comprising the text of the story was removed
- - non-ascii characters and character sequences serving as chapter separators were removed
- - any story mentioning underage personas in any context was removed from the dataset
- - names of private characters were replaced with randomized names across the dataset
 
- ## Environmental Impact
 
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 
- - **Hardware Type:** NVIDIA GeForce RTX 4060 Ti
- - **Hours used:** 24
- - **Cloud Provider:** N/A
- - **Compute Region:** US-East
- - **Carbon Emitted:** 0.83 kg CO2 eq.
 
 
 This model is a version of [KoboldAI/LLaMA2-13B-Psyfighter2](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2) finetuned to better understand vore context. The primary purpose of this model is to be a storywriting assistant, a conversational model in a chat, and an interactive choose-your-own-adventure text game.
 
+ The Adventure Mode is still a work in progress and will be added later.
 
+ This is the FP16-precision version of the model for merging and fine-tuning. **To use the model, please see the quantized version and the instructions here: [SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF](https://huggingface.co/SnakyMcSnekFace/Psyfighter2-13B-vore-GGUF)**
 
 ## Model Details
 
 The model behaves similarly to `KoboldAI/LLaMA2-13B-Psyfighter2`, which it was derived from. Please [see the README.md here](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2/blob/main/README.md) to learn more.
 
+ ### Updates
 
+ - 06/02/2024 - fixed errors in training and merging, significantly improving the overall prose quality
 - 05/25/2024 - updated training process, making the model more coherent and improving the writing quality
+ - 04/13/2024 - uploaded the first version of the model
+
 
 ## Bias, Risks, and Limitations
 
 
 ## Training Details
 
+ The model was fine-tuned using a [rank-stabilized](https://arxiv.org/abs/2312.03732) [QLoRA adapter](https://arxiv.org/abs/2305.14314). Training was performed using the [Unsloth AI](https://github.com/unslothai/unsloth) library on `Ubuntu 22.04.4 LTS` with `CUDA 12.1` and `Pytorch 2.3.0`.
 
+ The total training time on an NVIDIA GeForce RTX 4060 Ti is about 24 hours.
 
+ After training, the adapter weights were merged into the dequantized model as described in [ChrisHayduk's GitHub gist](https://gist.github.com/ChrisHayduk/1a53463331f52dca205e55982baf9930).
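The merge itself amounts to folding the low-rank update into the base weights. A toy illustration of the arithmetic on small random matrices (not the gist's actual code, which operates on the dequantized 13B checkpoint):

```python
import numpy as np

# Toy illustration of merging a rank-stabilized LoRA adapter into a base
# weight matrix: W_merged = W + scaling * (B @ A), with scaling = alpha / sqrt(r).
rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 8, 6, 2, 16

W = rng.standard_normal((d_out, d_in))  # frozen base weight (dequantized to FP16 in practice)
A = rng.standard_normal((rank, d_in))   # LoRA down-projection
B = rng.standard_normal((d_out, rank))  # LoRA up-projection

scaling = alpha / np.sqrt(rank)         # rsLoRA scaling (plain LoRA uses alpha / rank)
W_merged = W + scaling * (B @ A)

# After merging, applying W_merged equals applying the base layer plus the adapter.
x = rng.standard_normal(d_in)
assert np.allclose(W_merged @ x, W @ x + scaling * (B @ (A @ x)))
```

Because the update is folded in, the merged checkpoint needs no adapter weights at inference time.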
 
+ The quantized version of the model was prepared using [llama.cpp](https://github.com/ggerganov/llama.cpp).
 
+ ### LoRa adapter configuration
 
 - Rank: 128
 - Alpha: 16
 
 - Target weights: `["q_proj", "k_proj", "o_proj", "gate_proj", "up_proj"]`
 - `use_rslora=True`
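For reference, the listed adapter settings map onto keyword arguments in the style of the Hugging Face `peft` `LoraConfig` (an illustrative sketch; the author's actual training script is not published):

```python
import math

# Adapter settings from the list above, written out as LoraConfig-style kwargs.
adapter_config = dict(
    r=128,            # Rank: 128
    lora_alpha=16,    # Alpha: 16
    target_modules=["q_proj", "k_proj", "o_proj", "gate_proj", "up_proj"],
    use_rslora=True,  # rank-stabilized LoRA
)

# With use_rslora=True the adapter output is scaled by alpha / sqrt(r) rather
# than the standard alpha / r, which keeps the update magnitude usable at high ranks.
scaling = adapter_config["lora_alpha"] / math.sqrt(adapter_config["r"])
print(round(scaling, 3))  # 1.414, vs. 0.125 with plain LoRA scaling
```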
 
+
+ ### Domain adaptation
+
+ The initial training phase consists of fine-tuning the adapter on ~55 MiB of free-form text containing stories focused around the vore theme. The text is broken into paragraphs, which are aggregated into training samples of 4096 tokens or less, without crossing the document boundary. Each sample starts with a BOS token (with its `attention_mask` set to 0) and ends with an EOS token. Paragraph breaks are normalized to always consist of two line breaks.
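The packing scheme above can be sketched as follows (an illustrative reconstruction, since the actual training script is not published; `tokenize` stands in for the model's tokenizer):

```python
def pack_document(paragraphs, tokenize, max_len=4096, bos_id=1, eos_id=2):
    """Greedily pack one document's paragraphs into samples of <= max_len tokens."""
    samples = []
    current = [bos_id]  # every sample starts with BOS (its attention_mask is set to 0)
    for paragraph in paragraphs:
        # paragraph breaks are normalized to exactly two line breaks
        ids = tokenize(paragraph.strip() + "\n\n")
        # if this paragraph would overflow the sample, close it and start a new one;
        # packing never crosses a document boundary because each call handles one document
        if len(current) + len(ids) + 1 > max_len and len(current) > 1:
            samples.append(current + [eos_id])  # every sample ends with EOS
            current = [bos_id]
        current.extend(ids)
    if len(current) > 1:
        samples.append(current + [eos_id])
    return samples
```

A single paragraph longer than `max_len` would still need to be split; this sketch omits that case.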
+
+ #### Dataset pre-processing
+
+ The raw-text stories in the dataset were edited as follows:
+
+ - titles, forewords, tags, and anything not comprising the text of the story are removed
+ - non-ASCII characters and chapter separators are removed
+ - stories mentioning underage personas in any context are deleted
+ - names of private characters are randomized
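A minimal sketch of these clean-up steps (illustrative only; the separator pattern and the name table are hypothetical placeholders, not the author's actual filters):

```python
import re

def clean_story(text, name_map):
    """Apply the listed pre-processing steps to one raw-text story."""
    # drop non-ASCII characters
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # remove lines of repeated punctuation serving as chapter separators, e.g. "----"
    text = re.sub(r"^\s*[-*=~_]{3,}\s*$", "", text, flags=re.MULTILINE)
    # replace names of private characters with randomized stand-ins
    for original, replacement in name_map.items():
        text = re.sub(rf"\b{re.escape(original)}\b", replacement, text)
    return text
```

The underage-content filter is a document-level deletion, so it belongs outside this per-story function.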
+
 #### Training parameters
 
 - Max. sequence length: 4096 tokens
+ - Samples per epoch: 5085
 - Number of epochs: 2
 - Learning rate: 1e-4
 - Warmup: 64 steps
 
 - Batch size: 1
 - Gradient accumulation steps: 1
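With these values, the optimizer step count follows directly (simple arithmetic on the listed hyperparameters):

```python
samples_per_epoch = 5085
num_epochs = 2
batch_size = 1
gradient_accumulation_steps = 1

# With batch size 1 and no gradient accumulation, one optimizer step = one sample.
steps_per_epoch = samples_per_epoch // (batch_size * gradient_accumulation_steps)
total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 10170

# The 64 warmup steps cover well under 1% of the schedule.
warmup_fraction = 64 / total_steps
```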
 
+ ### Adventure mode SFT
 
+ TBD
 
+ ### Adventure mode KTO
 
+ TBD
 
config.json CHANGED
@@ -1,5 +1,5 @@
 {
- "_name_or_path": "../../../models/KoboldAI_LLaMA2-13B-Psyfighter2/",
 "architectures": [
 "LlamaForCausalLM"
 ],
 
 {
+ "_name_or_path": "/tmp/dequantized_Psyfighter2-13B/",
 "architectures": [
 "LlamaForCausalLM"
 ],
model-00001-of-00006.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:e405b05644ace92bb9be26ecd41824a2a773a70fa19ee534c024c9e53dc3e8e6
 size 4978265728
 
 version https://git-lfs.github.com/spec/v1
+ oid sha256:433625fd1da3ae8163cbb623be665f1b7f40252f8b7a6bbc0d77f74a88459c60
 size 4978265728
model-00002-of-00006.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:5b9bec5b5e9b28a6f68ed964381ab07017a6dcd06c1507bf802d98b23af55a23
 size 4970422160
 
 version https://git-lfs.github.com/spec/v1
+ oid sha256:2468fd858ecf4ec3c290c93f02b20c32729022772cacad75350e339266d829aa
 size 4970422160
model-00003-of-00006.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:3a77527d52ba11b1dbd01c627ca81c28e2dbab2914cf6d6bf4de265a855f1a3e
 size 4970422184
 
 version https://git-lfs.github.com/spec/v1
+ oid sha256:ac2e69f709b2f8f55774095f524f6d2cca4d0842aaf8d76e202cfc263c428f38
 size 4970422184
model-00004-of-00006.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:971e8f443e04c074ae416fbccceb49749608247e3893c2805bdc072b16bed550
 size 4933701432
 
 version https://git-lfs.github.com/spec/v1
+ oid sha256:f12b3071b6006608aa1dac69ff465dbbb9d2ed1fb9989d02d77aece0e4402d14
 size 4933701432
model-00005-of-00006.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:d05c74b09dc9e6f54fd13bec21f19704dfbb126484cab9115a3ca95ce3cd38e3
 size 4933722144
 
 version https://git-lfs.github.com/spec/v1
+ oid sha256:50780887ba3c2602c8fc3ffb58c43c6144d42442bef75e71d33842daa0b117a9
 size 4933722144
model-00006-of-00006.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:85653cec6b1526afd2de089f710528bbd3e141ed91a9f3526816b05700a4d121
 size 1245236904
 
 version https://git-lfs.github.com/spec/v1
+ oid sha256:ca60917d349df6204c8e6e4f890beacd837d1ad786190b8cef9f5d3266556c27
 size 1245236904