|
--- |
|
license: apache-2.0 |
|
tags: |
|
- merge |
|
- mergekit |
|
- lazymergekit |
|
- Hermes3 |
|
- SuperNovaLite |
|
- Purosani |
|
- Llama3.1 |
|
- kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B |
|
- djuna/L3.1-Purosani-2-8B |
|
- instruction-following |
|
- long-form-generation |
|
- roleplay |
|
- storytelling |
|
base_model:

- djuna/L3.1-Purosani-2-8B

- kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B
|
--- |
|
|
|
# ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B |
|
|
|
**ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B** is a merged model that blends the best features of two highly optimized Llama 3.1 architectures into an **advanced**, **adaptive**, and **powerful** 8B model. Whether for scientific research, complex instruction following, or immersive roleplay scenarios, it excels at every task it's thrown at.
|
|
|
## Family Tree
|
|
|
This model is a merger of the following: |
|
|
|
- [**kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B**](https://huggingface.co/kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B) |
|
- [**djuna/L3.1-Purosani-2-8B**](https://huggingface.co/djuna/L3.1-Purosani-2-8B) |
|
|
|
These parent models are themselves the result of **complex merges** of various high-performance models, making ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B a **super hybrid** capable of handling diverse tasks with efficiency and finesse. |
|
|
|
## Model Family Genealogy
|
|
|
[View the ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B Model Family Genealogy](https://imgur.com/a/oXMwVAj) |
|
|
|
This image represents the complex lineage of our model, showcasing its rich heritage and the diverse range of capabilities it inherits from its ancestors. |
|
|
|
## Detailed Model Lineage
|
|
|
### **A: kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B** |
|
|
|
Merged using the **TIES merge method**, this model utilizes **unsloth/Meta-Llama-3.1-8B** as its base, combining: |
|
|
|
- **arcee-ai/Llama-3.1-SuperNova-Lite**: A distilled 8B-parameter version of the **Llama-3.1-405B-Instruct** model, designed to maintain high performance while minimizing resource consumption. Trained with the **EvolKit** pipeline, it offers precise instruction following and domain-specific adaptability.
|
- **NousResearch/Hermes-3-Llama-3.1-8B**: Known for its robustness, this model enhances long-range contextual understanding, making it ideal for complex, multi-layered tasks. |
|
|
|
### **B: djuna/L3.1-Purosani-2-8B** |
|
|
|
This merge incorporates: |
|
|
|
- **hf-100/Llama-3-Spellbound-Instruct-8B-0.3** |
|
- **arcee-ai/Llama-3.1-SuperNova-Lite** |
|
- **grimjim/Llama-3-Instruct-abliteration-LoRA-8B** |
|
- **THUDM/LongWriter-llama3.1-8B**, capable of generating over **10,000 words** in one pass, making it perfect for long-form content generation. |
|
|
|
Further contributors include **ResplendentAI/Smarts_Llama3** and **djuna/L3.1-Suze-Vume-2-calc**, making this model highly adaptable to a broad range of applications. |
|
|
|
## Merge Details
|
|
|
Deeper in this model's lineage, the **della merge method** was applied with **kromeurus/L3.1-Aglow-Vulca-v0.1-8B** as the base, combining the following models to contribute both **precision** and **adaptability** (the final merge of this model itself uses **slerp**; see the configuration below):
|
|
|
- **djuna/L3.1-Noraian** |
|
- **Casual-Autopsy/L3-Super-Nova-RP-8B** |
|
- **TheDrummer/Llama-3SOME-8B-v2** |
|
- **djuna/L3.1-ForStHS** |
|
- **Blackroot/Llama-3-8B-Abomination-LORA** |
|
|
|
## Technical Configuration
|
|
|
The final merge uses the **slerp** method across all 32 layers, with per-layer interpolation weights (`t`) that favor different parents for the attention and MLP blocks:
|
|
|
```yaml
slices:
  - sources:
      - model: kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B
        layer_range: [0, 32]
      - model: djuna/L3.1-Purosani-2-8B
        layer_range: [0, 32]
merge_method: slerp
base_model: djuna/L3.1-Purosani-2-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
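The slerp method interpolates each layer's weights along the arc between the two parents rather than along a straight line, which better preserves weight geometry; the `t` schedule above varies the blend per layer (`t = 0` keeps the first model, `t = 1` the second). A minimal sketch of the underlying interpolation, for illustration only (not mergekit's exact implementation):

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    # Angle between the two weight vectors, computed on normalized copies.
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    theta = np.arccos(dot)
    if abs(theta) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# Toy example: halfway between two orthogonal "weight" vectors.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(slerp(0.5, a, b))  # → [0.70710678 0.70710678]
```

Note how `t = 0.5` lands on the arc between the parents rather than at the linear midpoint `[0.5, 0.5]`, preserving the vectors' magnitude.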
|
|
|
## Extended Support for Roleplay & Immersive Storytelling
|
|
|
**ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B** has been optimized for **extended roleplay support**, making it an exceptional choice for **interactive storytelling** and **deep character development**. With its ability to understand long-form context and generate cohesive responses over extensive interactions, this model excels in: |
|
|
|
- **Character-driven interactions**: Develop rich, nuanced personalities that respond in believable and engaging ways. |
|
- **World-building & Lore creation**: Create vast, interconnected universes with intricate lore, all generated in real-time. |
|
- **Dynamic NPC dialogues**: Use the model to generate complex, reactive conversations for game NPCs, offering a fluid, immersive experience for players. |
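For roleplay prompts, a persona is typically placed in the system turn. The sketch below builds a single-turn prompt in the standard Llama 3.1 instruct format; this is an assumption about the merged model's template (the Hermes 3 parent was trained with ChatML, so checking the tokenizer's `chat_template` is recommended), and the character "Kael" is a made-up example:

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama 3.1 instruct format."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Trailing assistant header cues the model to respond in character.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt(
    "You are Kael, a wry tavern-keeper in a low-fantasy port city. Stay in character.",
    "What's the news around the docks tonight?",
)
print(prompt)
```

In practice, `tokenizer.apply_chat_template` builds this string from a list of messages, so the format never has to be assembled by hand.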
|
|
|
## Key Features & Capabilities
|
|
|
### **Advanced Roleplay and Long-Form Content Generation** |
|
|
|
With models like **THUDM/LongWriter-llama3.1-8B** contributing their expertise, this model is perfect for generating **long-form narratives** while maintaining coherence and creativity. |
|
|
|
### **Instruction Following & Task Adaptability** |
|
|
|
Combining the capabilities of **Hermes** and **SuperNovaLite**, this model can efficiently follow detailed instructions, making it ideal for: |
|
|
|
- **Task automation** |
|
- **Virtual assistants** |
|
- **Research generation** |
|
|
|
### **Efficiency Without Compromise** |
|
|
|
Distilled models like **SuperNovaLite** ensure that this model delivers high performance without the extensive resource requirements of larger models. |
|
|
|
## Use Cases & Applications
|
|
|
- **Roleplay & Interactive Storytelling**: The perfect companion for storytellers, RPG enthusiasts, and game developers. Whether crafting dynamic NPC interactions or generating deep, immersive worlds, this model can handle it all. |
|
- **Instruction-based AI**: With enhanced instruction-following abilities, this model is ideal for developing intelligent assistants or chatbots that require high accuracy and quick adaptability. |
|
- **Long-Form Writing**: From novels to research papers, this model can generate lengthy, well-structured content with ease, thanks to its extensive training on long-form data. |
|
|
|
## License
|
|
|
This model is open-sourced under the **Apache-2.0 License**, allowing others to use and modify it freely, as long as they give proper attribution. |
|
|
|
## Tags
|
|
|
- `merge` |
|
- `mergekit` |
|
- `Hermes3` |
|
- `SuperNovaLite` |
|
- `Purosani` |
|
- `Llama3.1` |
|
- `instruction-following` |
|
- `long-form-generation` |
|
- `roleplay` |
|
- `storytelling` |