ZeroXClem's picture
Update README.md
6e0a40d verified
metadata
license: apache-2.0
tags:
  - merge
  - mergekit
  - lazymergekit
  - Hermes3
  - SuperNovaLite
  - Purosani
  - Llama3.1
  - kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B
  - djuna/L3.1-Purosani-2-8B
  - instruction-following
  - long-form-generation
  - roleplay
  - storytelling
base_model:
  - djuna/L3.1-Purosani-2-8B

ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B

ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B is a cutting-edge merged model that blends the best features of two highly optimized architectures to create an advanced, adaptive, and powerful model. Whether for scientific research, complex instruction-following, or immersive roleplay scenarios, this model excels at every task it’s thrown into.

🌟 Family Tree

This model is a merger of the following:

These parent models are themselves the result of complex merges of various high-performance models, making ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B a super hybrid capable of handling diverse tasks with efficiency and finesse.

🌳 Model Family Genealogy

View the ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B Model Family Genealogy

This image represents the complex lineage of our model, showcasing its rich heritage and the diverse range of capabilities it inherits from its ancestors.

🧬 Detailed Model Lineage

A: kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B

Merged using the TIES merge method, this model utilizes unsloth/Meta-Llama-3.1-8B as its base, combining:

  • arcee-ai/Llama-3.1-SuperNova-Lite: A distilled 8B parameter version of the Llama-3.1-405B-Instruct model, designed to maintain high performance while minimizing resource consumption. Its training, via EvolKit, offers instruction-following precision and domain-specific adaptability.
  • NousResearch/Hermes-3-Llama-3.1-8B: Known for its robustness, this model enhances long-range contextual understanding, making it ideal for complex, multi-layered tasks.

B: djuna/L3.1-Purosani-2-8B

This merge incorporates:

  • hf-100/Llama-3-Spellbound-Instruct-8B-0.3
  • arcee-ai/Llama-3.1-SuperNova-Lite
  • grimjim/Llama-3-Instruct-abliteration-LoRA-8B
  • THUDM/LongWriter-llama3.1-8B, capable of generating over 10,000 words in one pass, making it perfect for long-form content generation.

Further contributors include ResplendentAI/Smarts_Llama3 and djuna/L3.1-Suze-Vume-2-calc, making this model highly adaptable to a broad range of applications.

πŸ› οΈ Merge Details

The model was merged using the della merge method with kromeurus/L3.1-Aglow-Vulca-v0.1-8B as the base. This method, combined with the following models, ensures both precision and adaptability:

  • djuna/L3.1-Noraian
  • Casual-Autopsy/L3-Super-Nova-RP-8B
  • TheDrummer/Llama-3SOME-8B-v2
  • djuna/L3.1-ForStHS
  • Blackroot/Llama-3-8B-Abomination-LORA

πŸ”§ Technical Configuration

The merging process used advanced methods to ensure smooth integration and consistent performance across various tasks:

slices:
  - sources:
      - model: kotyKD/Llama3.1-Hermes3-SuperNovaLite-merged-with-base-8B
        layer_range: [0, 32]
      - model: djuna/L3.1-Purosani-2-8B
        layer_range: [0, 32]
merge_method: slerp
base_model: djuna/L3.1-Purosani-2-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

🎯 Extended Support for Roleplay & Immersive Storytelling

ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B has been optimized for extended roleplay support, making it an exceptional choice for interactive storytelling and deep character development. With its ability to understand long-form context and generate cohesive responses over extensive interactions, this model excels in:

  • Character-driven interactions: Develop rich, nuanced personalities that respond in believable and engaging ways.
  • World-building & Lore creation: Create vast, interconnected universes with intricate lore, all generated in real-time.
  • Dynamic NPC dialogues: Use the model to generate complex, reactive conversations for game NPCs, offering a fluid, immersive experience for players.

πŸš€ Key Features & Capabilities

Advanced Roleplay and Long-Form Content Generation

With models like THUDM/LongWriter-llama3.1-8B contributing their expertise, this model is perfect for generating long-form narratives while maintaining coherence and creativity.

Instruction Following & Task Adaptability

Combining the capabilities of Hermes and SuperNovaLite, this model can efficiently follow detailed instructions, making it ideal for:

  • Task automation
  • Virtual assistants
  • Research generation

Efficiency Without Compromise

Distilled models like SuperNovaLite ensure that this model delivers high performance without the extensive resource requirements of larger models.

🎯 Use Case & Applications

  • Roleplay & Interactive Storytelling: The perfect companion for storytellers, RPG enthusiasts, and game developers. Whether crafting dynamic NPC interactions or generating deep, immersive worlds, this model can handle it all.
  • Instruction-based AI: With enhanced instruction-following abilities, this model is ideal for developing intelligent assistants or chatbots that require high accuracy and quick adaptability.
  • Long-Form Writing: From novels to research papers, this model can generate lengthy, well-structured content with ease, thanks to its extensive training on long-form data.

πŸ“œ License

This model is open-sourced under the Apache-2.0 License, allowing others to use and modify it freely, as long as they give proper attribution.

πŸ’‘ Tags

  • merge
  • mergekit
  • Hermes3
  • SuperNovaLite
  • Purosani
  • Llama3.1
  • instruction-following
  • long-form-generation
  • roleplay
  • storytelling