---
license: apache-2.0
tags:
- moe
- merge
- mergekit
- vicgalle/CarbonBeagle-11B
- Sao10K/Fimbulvetr-10.7B-v1
- bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED
- Yhyu13/LMCocktail-10.7B-v1
---
# Umbra-v2.1-MoE-4x10.7
Umbra is an offshoot of the [Lumosia Series] with a focus on general knowledge and RP/ERP.

Umbra v2.1 has updated models and a revamped set of positive and negative prompts.

This model was built around the idea of a general assistant that can also tell stories and handle RP/ERP when asked.

This is a very experimental model: a MoE combination of SOLAR-based models, each chosen from personal favorites.

Base context is 4k, but it stays coherent up to 16k.
Please let me know how the model works for you.
An Umbra personality Tavern card has been added to the files.

Update: Umbra-v2 had a token error that has been fixed in Umbra-v2.1.
Prompt template:
```
### System:
### USER:{prompt}
### Assistant:
```
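For illustration, here is a minimal sketch (not from the original card) of assembling a prompt in this format; the `build_prompt` helper and the example system/user strings are hypothetical, and the exact whitespace around each header is an assumption:

```python
# Hypothetical helper: formats a prompt using the template above.
# The system and user strings below are placeholders, not from the card.
def build_prompt(system: str, user: str) -> str:
    return f"### System:\n{system}\n### USER:{user}\n### Assistant:\n"

prompt = build_prompt(
    "You are Umbra, a helpful assistant and storyteller.",
    "Summarize what a Mixture of Experts model is.",
)
print(prompt)
```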
Settings:
- Temp: 1.0
- min-p: 0.02-0.1
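As a rough sketch, the same settings could be expressed as a transformers `GenerationConfig`; note that `min_p` is only available in recent transformers releases (otherwise set min-p in your front end), and the 0.05 used here is just one point in the suggested range:

```python
# Sketch only: the suggested sampling settings as a transformers GenerationConfig.
# min_p requires a recent transformers release; adjust or drop it if unsupported.
from transformers import GenerationConfig

gen_config = GenerationConfig(
    do_sample=True,
    temperature=1.0,   # Temp: 1.0
    min_p=0.05,        # within the suggested 0.02-0.1 range
    max_new_tokens=256,
)
```

Pass it to generation as `model.generate(**inputs, generation_config=gen_config)`.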
Evals (posted soon):
- Avg:
- ARC:
- HellaSwag:
- MMLU:
- T-QA:
- Winogrande:
- GSM8K:
Examples:
posted soon
## 🧩 Configuration
```yaml
base_model: vicgalle/CarbonBeagle-11B
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: vicgalle/CarbonBeagle-11B
    positive_prompts: [Revamped]
  - source_model: Sao10K/Fimbulvetr-10.7B-v1
    positive_prompts: [Revamped]
  - source_model: bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED
    positive_prompts: [Revamped]
  - source_model: Yhyu13/LMCocktail-10.7B-v1
    positive_prompts: [Revamped]
```
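The block above is a mergekit-moe config. As a rough sketch, a similar merge could be reproduced with the mergekit CLI, assuming mergekit is installed and the YAML is saved as `config.yaml` (the filename and output path here are illustrative):

```bash
# Illustrative invocation only; paths are placeholders.
pip install mergekit
mergekit-moe config.yaml ./Umbra-MoE-output
```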
Umbra-v2-MoE-4x10.7 is a Mixture of Experts (MoE) model made with the following models:
* [vicgalle/CarbonBeagle-11B](https://huggingface.co/vicgalle/CarbonBeagle-11B)
* [Sao10K/Fimbulvetr-10.7B-v1](https://huggingface.co/Sao10K/Fimbulvetr-10.7B-v1)
* [bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED](https://huggingface.co/bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED)
* [Yhyu13/LMCocktail-10.7B-v1](https://huggingface.co/Yhyu13/LMCocktail-10.7B-v1)
## 💻 Usage
```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Steelskull/Umbra-v2-MoE-4x10.7"
tokenizer = AutoTokenizer.from_pretrained(model)

# Text-generation pipeline; 4-bit loading reduces the memory needed for the 4x10.7B MoE
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Build a prompt from the chat template, then sample a response
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```