---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- Locutusque/Hercules-4.0-Mistral-v0.2-7B
- Open-Orca/Mistral-7B-OpenOrca
base_model:
- Locutusque/Hercules-4.0-Mistral-v0.2-7B
- Open-Orca/Mistral-7B-OpenOrca
---

# seldonium-2x7b-MoE-v0.1

seldonium-2x7b-MoE-v0.1 is a Mixture of Experts (MoE) model that combines the capabilities of two specialized language models:

- **Locutusque/Hercules-4.0-Mistral-v0.2-7B**: a 7B-parameter model focused on programming tasks such as writing functions, implementing algorithms, and working with data structures.
- **Open-Orca/Mistral-7B-OpenOrca**: a 7B-parameter model focused on logical reasoning and analysis, including solving logic problems, evaluating arguments, and assessing the validity of statements.

This MoE model was created with the LazyMergekit Colab notebook, which makes it easy to combine specialized models into a single, more capable model.
seldonium-2x7b-MoE-v0.1 can be used for a variety of natural language processing tasks that benefit from the complementary strengths of its expert components.
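
At inference time, a learned gate scores the experts for each token and routes it to the top `experts_per_token` of them (two here), blending their outputs. The snippet below is a minimal, illustrative sketch of such top-2 routing in PyTorch; the dimensions, module names, and stand-in expert layers are assumptions for illustration, not this model's actual internals.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of top-2 MoE routing (not this model's actual code).
# hidden: (num_tokens, hidden_dim) token representations
hidden = torch.randn(4, 4096)

num_experts, top_k = 2, 2  # two experts, experts_per_token: 2
gate = torch.nn.Linear(4096, num_experts, bias=False)              # router (assumed shape)
experts = [torch.nn.Linear(4096, 4096) for _ in range(num_experts)]  # stand-in expert FFNs

logits = gate(hidden)                         # (num_tokens, num_experts)
weights, chosen = logits.topk(top_k, dim=-1)  # pick the top-k experts per token
weights = F.softmax(weights, dim=-1)          # normalize the routing weights

# Each token's output is the weighted sum of its chosen experts' outputs
out = torch.zeros_like(hidden)
for slot in range(top_k):
    for e in range(num_experts):
        mask = chosen[:, slot] == e
        if mask.any():
            out[mask] += weights[mask, slot, None] * experts[e](hidden[mask])
```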

## 🧩 Configuration

```yaml
base_model: NousResearch/Hermes-2-Pro-Mistral-7B
gate_mode: cheap_embed  # Route with raw token embeddings (cheaper than hidden-state gating)
dtype: bfloat16  # Output data type
experts_per_token: 2  # Number of experts per token
experts:
  - source_model: Locutusque/Hercules-4.0-Mistral-v0.2-7B
    positive_prompts:
      - "Write a Python function to calculate the factorial of a number."
      - "Implement a quicksort algorithm to sort a list of integers."
      - "Design a Python class to represent a binary search tree."
  
  - source_model: Open-Orca/Mistral-7B-OpenOrca
    positive_prompts:
      - "Solve the logic puzzle: 'If Ann is older than Belinda, and Belinda is younger than Cathy, who is the oldest?'"
      - "Analyze the argument: 'All cats are animals. Some animals are pets. Therefore, all cats are pets.' Determine if the conclusion follows logically from the premises."
      - "Evaluate the validity of the statements: 'A is true. A is false.'"
```
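
To reproduce a merge like this outside of the Colab, the configuration above can be passed to mergekit's MoE entry point. A minimal sketch, assuming the config is saved as `config.yaml` (the output directory name is arbitrary):

```python
# Sketch: running the merge from a notebook cell; paths are illustrative
!pip install -qU mergekit

# Build the MoE described by the config and write it to a local directory
!mergekit-moe config.yaml ./seldonium-2x7b-MoE-v0.1
```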

## 💻 Usage

```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "jomangbp/seldonium-2x7b-MoE-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    # Load in 4-bit so the ~13B total parameters fit on a single consumer GPU
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
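
Note: on recent versions of transformers, passing `load_in_4bit` through `model_kwargs` is deprecated in favor of an explicit `BitsAndBytesConfig`. An equivalent sketch, assuming a bitsandbytes-compatible GPU:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Explicit 4-bit quantization config (equivalent to load_in_4bit=True above)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model_4bit = AutoModelForCausalLM.from_pretrained(
    "jomangbp/seldonium-2x7b-MoE-v0.1",
    quantization_config=bnb_config,
    device_map="auto",  # place layers across available GPUs/CPU automatically
)
```

The loaded model can then be used with the same pipeline call as above.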