---
license: other
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- unsloth
- axolotl
- exllamav2
- exl2
- 4bit
library_name: transformers
---

## Quantized with the default exl2 dataset at a sequence length of 8192 and 400 calibration lines instead of 100. Possibly microwaved, presumably better.
## The tokenizer works. `tokenizer.model` is not required and is, in fact, missing the working tokens.

# DreamGen Opus V1

<div style="display: flex; flex-direction: row; align-items: center;">
<img src="/dreamgen/opus-v1-34b/resolve/main/images/logo-1024.png" alt="model logo" style="
border-radius: 12px;
margin-right: 12px;
margin-top: 0px;
margin-bottom: 0px;
max-width: 100px;
height: auto;
"/>

Models for **(steerable) story-writing and role-playing**.
<br/>[All Opus V1 models, including quants](https://huggingface.co/collections/dreamgen/opus-v1-65d092a6f8ab7fc669111b31).

</div>

## Resources

- [**Opus V1 prompting guide**](https://dreamgen.com/docs/models/opus/v1) with many (interactive) examples and prompts that you can copy.
- [**Google Colab**](https://colab.research.google.com/drive/1J178fH6IdQOXNi-Njgdacf5QgAxsdT20?usp=sharing) for interactive role-play using `opus-v1.2-7b`.
- [Python code](example/prompt/format.py) to format the prompt correctly.

<img src="/dreamgen/opus-v1-34b/resolve/main/images/story_writing.webp" alt="story writing on dreamgen.com" style="
padding: 12px;
border-radius: 12px;
border: 2px solid #f9a8d4;
background: rgb(9, 9, 11);
"/>

## Prompting

<details>
<summary>The models use an extended version of ChatML.</summary>

```
<|im_start|>system
(Story description in the right format here)
(Typically consists of plot description, style description and characters)<|im_end|>
<|im_start|>user
(Your instruction on how the story should continue)<|im_end|>
<|im_start|>text names= Alice
(Continuation of the story from the Alice character)<|im_end|>
<|im_start|>text
(Continuation of the story from no character in particular (pure narration))<|im_end|>
<|im_start|>user
(Your instruction on how the story should continue)<|im_end|>
<|im_start|>text names= Bob
(Continuation of the story from the Bob character)<|im_end|>
```

The Opus V1 extension to ChatML is the addition of the `text` role and the addition / modification of role names.

Pay attention to the following:

- The `text` messages can (but don't have to) have `names`; names are used to indicate the "active" character during role-play.
- There can be multiple subsequent messages with a `text` role, especially if names are involved.
- There can be multiple names attached to a message.
- The format for names is `names= {{name[0]}}; {{name[1]}}`; beware of the spaces after `names=` and after the `;`. This spacing leads to the most natural tokenization for the names.
</details>
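
For reference, here is a minimal sketch of how the extended ChatML prompt described above could be assembled in Python. This is not the official `example/prompt/format.py`; the message structure and the `build_prompt` helper are illustrative assumptions.

```python
# Minimal sketch (not the official example/prompt/format.py): assembles the
# extended ChatML prompt described above. The message dicts and the
# build_prompt helper are illustrative assumptions, not DreamGen's API.

def build_prompt(messages: list[dict]) -> str:
    """Render messages into the Opus V1 extended-ChatML format."""
    parts = []
    for msg in messages:
        header = msg["role"]
        # The `text` role may carry one or more character names:
        # "<|im_start|>text names= Alice; Bob" (note the spaces).
        if msg["role"] == "text" and msg.get("names"):
            header += " names= " + "; ".join(msg["names"])
        parts.append(f"<|im_start|>{header}\n{msg['content']}<|im_end|>\n")
    # End with an opened `text` turn so the model continues the story.
    parts.append("<|im_start|>text names= Alice\n")
    return "".join(parts)


prompt = build_prompt([
    {"role": "system", "content": "Plot, style and character descriptions go here."},
    {"role": "user", "content": "Alice greets Bob at the harbor."},
])
print(prompt)
```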

While the main goal for the models is great story-writing and role-playing performance, the models are also capable of several writing-related tasks as well as general assistance.

Here's how you can prompt the model for the following tasks:

- Steerable [Story-writing](https://dreamgen.com/docs/models/opus/v1#task-story-writing) and [Role-playing](https://dreamgen.com/docs/models/opus/v1#task-role-playing):
  - Input:
    - System prompt: You provide the story / role-play description, which consists of:
      - Plot description
      - Style description
      - Characters and their descriptions
    - Conversation turns:
      - Text / message turn: This represents part of the story or role-play
      - Instruction: This tells the model what should happen next
  - Output: Continuation of the story / role-play.
- [Story plot summarization](https://dreamgen.com/docs/models/opus/v1#task-plot-description)
  - Input: A story, or a few chapters of a story.
  - Output: A description of the story or chapters.
- [Story character description](https://dreamgen.com/docs/models/opus/v1#task-char-description)
  - Input: A story, or a few chapters of a story, and a set of characters.
  - Output: A description of the characters.
- [Story style description](https://dreamgen.com/docs/models/opus/v1#task-style-description)
  - Input: A story, or a few chapters of a story.
  - Output: A description of the style of the story.
- [Story description to chapters](https://dreamgen.com/docs/models/opus/v1#task-story-description-to-chapter-descriptions)
  - Input: A brief plot description and the desired number of chapters.
  - Output: A description for each chapter.
- And more...

### Sampling params

For story-writing and role-play, I recommend "Min P" based sampling with `min_p` in the range `[0.01, 0.1]` and with `temperature` in the range `[0.5, 1.5]`, depending on your preferences. A good starting point would be `min_p=0.1; temperature=0.8`.

You may also benefit from setting presence, frequency and repetition penalties, especially at lower temperatures.
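
As a concrete illustration only (a sketch, not an official recipe), the recommended starting values could be expressed as vLLM settings roughly like this; the model id, `max_tokens`, repetition penalty and stop string are assumptions based on this card:

```python
# Sketch: the recommended starting sampler values expressed as vLLM settings.
# Model id, max_tokens, repetition_penalty and the stop string are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(model="dreamgen/opus-v1.2-7b")

params = SamplingParams(
    temperature=0.8,          # recommended starting point
    min_p=0.1,                # "Min P" sampling, range [0.01, 0.1]
    repetition_penalty=1.05,  # optional, may help at lower temperatures
    max_tokens=400,
    stop=["<|im_end|>"],      # end of a ChatML turn
)

# A tiny prompt in the extended ChatML format, ending with an open `text` turn.
prompt = (
    "<|im_start|>system\n"
    "Plot, style and character descriptions go here.<|im_end|>\n"
    "<|im_start|>user\n"
    "Alice greets Bob at the harbor.<|im_end|>\n"
    "<|im_start|>text names= Alice\n"
)

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```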

## Dataset

The fine-tuning dataset consisted of ~100M tokens of steerable story-writing, role-playing, writing-assistant and general-assistant examples. Each example was up to 31000 tokens long.

All story-writing and role-playing examples were based on human-written text.

![token count distribution](images/token_count_cum__token_bucket.png)

## Running the model

The model should be compatible with any software that supports the base model, but beware of prompting and tokenization.

I recommend using these model versions:

- 7B: [no quant (opus-v1.2-7b)](https://huggingface.co/dreamgen/opus-v1.2-7b)
- 34B: [no quant (opus-v1-34b)](https://huggingface.co/dreamgen/opus-v1-34b) or [awq (opus-v1-34b-awq)](https://huggingface.co/dreamgen/opus-v1-34b-awq)

### Running on DreamGen.com (free)

You can try the model for free on [dreamgen.com](https://dreamgen.com) — note that an account is required.

### Running Locally

- **Make sure your prompt is as close as possible to the Opus V1 format**
  - Regardless of which backend you use, it's important that you format your prompt well and that the tokenization works correctly.
  - [Read the prompt guide](https://dreamgen.com/docs/models/opus/v1)
  - [Read the prompt formatting code](example/prompt/format.py)
  - Make sure `<|im_start|>` and `<|im_end|>` are tokenized correctly (see the check sketched after this list)
- **vLLM**
  - [**Google Colab**](https://colab.research.google.com/drive/1J178fH6IdQOXNi-Njgdacf5QgAxsdT20?usp=sharing): This is a simple interactive Google Colab for role-play with the 7B model; it should fit on the T4 GPU.
  - [Code](example/prompt/interactive.py): This is a simple script for interactive chat for one hard-coded scenario.
- **SillyTavern**
  - [Settings](https://huggingface.co/{{REPO_ID}}/tree/main/configs/silly_tavern), v2 kindly provided by @MarinaraSpaghetti
  - [Settings screenshot](configs/silly_tavern/settings_screenshot.webp)
  - This is just an attempt at approximating the Opus V1 prompt; it won't be perfect
- **LM Studio**
  - [Config](configs/lmstudio/preset.json)
  - Just like ChatML, but with the "assistant" role changed to "text".
- **HuggingFace**
  - [Chat template](tokenizer_config.json#L51)
  - Just like ChatML, but with the "assistant" role changed to "text".
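
A quick way to sanity-check the tokenization point above is to load the tokenizer with `transformers` and confirm that `<|im_start|>` and `<|im_end|>` each map to a single token id. A rough sketch, where the repo id is a placeholder assumption:

```python
# Sketch: verify that the ChatML delimiters are single tokens rather than
# being split into several pieces. The repo id is a placeholder assumption.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dreamgen/opus-v1-34b")

for marker in ("<|im_start|>", "<|im_end|>"):
    ids = tokenizer.encode(marker, add_special_tokens=False)
    status = "OK (single token)" if len(ids) == 1 else "BROKEN (split into multiple tokens)"
    print(f"{marker!r} -> {ids} {status}")
```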
156
+
157
+ ## Known Issues
158
+
159
+ - **34B tokenization**:
160
+ - There seems to be a mismatch between the tokenizer of the base and fine-tuned model. It's unclear whether this also affected training, or whether it's just incorrectly saved tokenizer (you can see `tokenizer.json` was not saved ([bug report](https://github.com/OpenAccess-AI-Collective/axolotl/issues/1322))).
161
+ - This affects BOS and EOS (which aren't really used by Yi) and the tokenization of the first input token.
162
+ - Overall impact should be minor.
163
+ - **34B repetition**:
164
+ - The 34B sometimes gets stuck repeating the same word, or synonyms. This seems to be a common problem across various Yi 34B fine-tunes.
165
+ - **GGUF**:
166
+ - The conversion might be messed up and in my tests even `Q_8` of the `opus-v1.2-7b` is much worse than the `fp16` version.
167
+ - **Ooba**:
168
+ - The tokenization might be messed up. Some users reported that `<|im_start|>` and `<|im_end|>` are tokenized as multiple tokens.
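
If you want to see the base vs. fine-tune tokenizer mismatch yourself, a hedged sketch along these lines could help; the repo ids and the choice of sample string are assumptions, not a diagnostic published by the model author:

```python
# Sketch: compare how the base Yi tokenizer and this fine-tune's tokenizer
# handle BOS/EOS and the ChatML markers. Repo ids are assumptions.
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("01-ai/Yi-34B")
tuned = AutoTokenizer.from_pretrained("dreamgen/opus-v1-34b")

print("BOS/EOS base :", base.bos_token, base.eos_token)
print("BOS/EOS tuned:", tuned.bos_token, tuned.eos_token)

sample = "<|im_start|>user\nHello<|im_end|>"
print("base :", base.encode(sample, add_special_tokens=False))
print("tuned:", tuned.encode(sample, add_special_tokens=False))
```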

## Community

Join the DreamGen community on [**Discord**](https://dreamgen.com/discord) to get early access to new models.

## License

- This model is intended for personal use only; other use is not permitted.