SteelStorage
/

Umbra-v2.1-MoE-4x10.7


	---
	license: apache-2.0
	tags:
	- moe
	- merge
	- mergekit
	- vicgalle/CarbonBeagle-11B
	- Sao10K/Fimbulvetr-10.7B-v1
	- bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED
	- Yhyu13/LMCocktail-10.7B-v1
	---

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/hen3fNHRD7BCPvd2KkfjZ.png)

	# Umbra-v2.1-MoE-4x10.7

	Umbra is an offshoot of the [Lumosia Series] with a focus on general knowledge and RP/ERP.

	Umbra v2.1 has updated models and a set of revamped positive and negative prompts.

	This model was built around the idea that someone wanted a general assistant that could also tell stories/RP/ERP when wanted.

	This is a very experimental model. It's a combination MoE of Solar models; the models selected are personal favorites.

	Base context is 4k, but it stays coherent up to 16k.

	Please let me know how the model works for you.

	An Umbra Personality tavern card has been added to the files.

	Update:
	Umbra-v2 had a token error, which is fixed in Umbra-v2.1.


	```
	### System:

	### USER:{prompt}

	### Assistant:
	```
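
The template above can be filled in programmatically. The following is a minimal sketch, assuming the section headers are separated by blank lines exactly as shown and that any system text follows `### System:` directly, the same way `{prompt}` follows `### USER:`. `build_prompt` and its parameters are hypothetical names, not part of the model's files.

```python
def build_prompt(prompt: str, system: str = "") -> str:
    """Fill the card's prompt template with a user prompt and optional system text."""
    # Assumption: system text goes right after "### System:", mirroring how
    # {prompt} sits right after "### USER:" in the template.
    return f"### System:{system}\n\n### USER:{prompt}\n\n### Assistant:"


if __name__ == "__main__":
    print(build_prompt("Tell me a short story about a lighthouse."))
```

The returned string ends at `### Assistant:` so the model continues from there.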

	Settings:
	```
	Temp: 1.0
	min-p: 0.02-0.1
	```
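
For readers unfamiliar with min-p: as commonly implemented, it keeps only the tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes before sampling. Below is a minimal pure-Python sketch of that filter. `min_p_filter` is a hypothetical helper for illustration, the toy distribution is made up, and samplers in actual inference backends may differ in detail.

```python
def min_p_filter(probs: dict[str, float], min_p: float) -> dict[str, float]:
    """Keep tokens with probability >= min_p * (max probability), then renormalize."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


if __name__ == "__main__":
    # Toy next-token distribution (made-up numbers for illustration).
    probs = {"the": 0.60, "a": 0.30, "an": 0.09, "zebra": 0.01}
    # With min_p=0.1 the cutoff is 0.06, so only "zebra" is dropped.
    print(min_p_filter(probs, min_p=0.1))
```

A higher `min_p` within the suggested range prunes more of the low-probability tail, while a temperature of 1.0 leaves the model's distribution unscaled.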

	## Evals:

	posted soon:

	* Avg:
	* ARC:
	* HellaSwag:
	* MMLU:
	* T-QA:
	* Winogrande:
	* GSM8K:

	## Examples:
	```
	posted soon
	```
	```
	posted soon
	```

	## 🧩 Configuration

	```
	base_model: vicgalle/CarbonBeagle-11B
	gate_mode: hidden
	dtype: bfloat16
	experts:
	  - source_model: vicgalle/CarbonBeagle-11B
	    positive_prompts: [Revamped]

	  - source_model: Sao10K/Fimbulvetr-10.7B-v1
	    positive_prompts: [Revamped]

	  - source_model: bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED
	    positive_prompts: [Revamped]

	  - source_model: Yhyu13/LMCocktail-10.7B-v1
	    positive_prompts: [Revamped]
	```
	Umbra-v2.1-MoE-4x10.7 is a Mixture of Experts (MoE) made with the following models:
	* [vicgalle/CarbonBeagle-11B](https://huggingface.co/vicgalle/CarbonBeagle-11B)
	* [Sao10K/Fimbulvetr-10.7B-v1](https://huggingface.co/Sao10K/Fimbulvetr-10.7B-v1)
	* [bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED](https://huggingface.co/bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED)
	* [Yhyu13/LMCocktail-10.7B-v1](https://huggingface.co/Yhyu13/LMCocktail-10.7B-v1)
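
To illustrate how a Mixture of Experts combines its members at inference time, here is a minimal sketch of top-k gating in plain Python. It assumes a Mixtral-style router that picks the top 2 of 4 experts per token and weights their outputs by a softmax over the selected gate logits. The gate logits and expert outputs are made-up numbers, and `route_top_k` is a hypothetical helper, not mergekit's or this model's actual code.

```python
import math


def route_top_k(gate_logits: list[float], k: int = 2) -> dict[int, float]:
    """Return {expert_index: weight} for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    m = max(gate_logits[i] for i in top)
    exps = {i: math.exp(gate_logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


if __name__ == "__main__":
    # One gate logit per expert (4 experts, as in this merge); values are illustrative.
    weights = route_top_k([1.2, 0.3, 2.0, -0.5])
    # Scalar stand-ins for each expert's output on the current token.
    expert_outputs = [10.0, 20.0, 30.0, 40.0]
    mixed = sum(w * expert_outputs[i] for i, w in weights.items())
    print(weights, mixed)
```

Only the selected experts run for a given token, which is why a 4-expert merge costs less per token than running all four models.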

	## 💻 Usage

	```python
	# Install dependencies first (prefix the command with "!" in a notebook):
	#   pip install -qU transformers bitsandbytes accelerate

	from transformers import AutoTokenizer
	import transformers
	import torch

	# Repository ID of this model card
	model = "SteelStorage/Umbra-v2.1-MoE-4x10.7"

	tokenizer = AutoTokenizer.from_pretrained(model)
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	    # Load the weights in 4-bit via bitsandbytes to reduce memory use
	    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
	)

	# Build the prompt from a chat-style message list
	messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
	prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
	print(outputs[0]["generated_text"])
	```