---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- mlabonne/NeuralBeagle14-7B
- jsfs11/TurdusTrixBeagle-DARETIES-7B
- FelixChao/WestSeverus-7B-DPO-v2
- CultriX/Wernicke-7B-v7
base_model:
- mlabonne/NeuralBeagle14-7B
- jsfs11/TurdusTrixBeagle-DARETIES-7B
- FelixChao/WestSeverus-7B-DPO-v2
- CultriX/Wernicke-7B-v7
model-index:
- name: MixtureofMerges-MoE-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 72.44
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.41
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.88
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 70.92
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 83.58
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 68.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
---
# MixtureofMerges-MoE-v2
Credit to [CultriX/Wernicke-MoE](https://huggingface.co/CultriX/Wernicke-MoE) for the inspiration on this model.
I'm quite pleased with how it turned out.
MixtureofMerges-MoE-v2 is a Mixture of Experts (MoE) made with the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [mlabonne/NeuralBeagle14-7B](https://huggingface.co/mlabonne/NeuralBeagle14-7B)
* [jsfs11/TurdusTrixBeagle-DARETIES-7B](https://huggingface.co/jsfs11/TurdusTrixBeagle-DARETIES-7B)
* [FelixChao/WestSeverus-7B-DPO-v2](https://huggingface.co/FelixChao/WestSeverus-7B-DPO-v2)
* [CultriX/Wernicke-7B-v7](https://huggingface.co/CultriX/Wernicke-7B-v7)
## 🧩 Configuration
```yaml
base_model: "CultriX/Wernicke-7B-v9"
gate_mode: hidden
dtype: float16
experts:
  - source_model: "mlabonne/NeuralBeagle14-7B" # AGIEval
    positive_prompts:
      - "Analyze the long-term economic impacts of the Industrial Revolution on global trade dynamics."
      - "Discuss the scientific advancements during the Space Race and their modern-day implications."
      - "Explain the geopolitical shifts resulting from the collapse of the Soviet Union."
      - "Evaluate the environmental and social consequences of deforestation in the Amazon rainforest."
      - "Assess the role of technology in shaping 21st-century political campaigns."
      - "Describe the evolution of renewable energy technologies and their future potential."
      - "Analyze the social and economic effects of the internet revolution on global communication."
      - "Discuss the ethical considerations in implementing artificial intelligence in healthcare."
      - "Examine the historical significance of the Treaty of Versailles in shaping post-World War I Europe."
      - "Explain the impact of quantum computing on cybersecurity in the coming decades."
      - "Assess the effects of climate change on global migration patterns."
      - "Analyze the historical development and significance of the United Nations."
      - "Discuss the role of nanotechnology in advancing medical science."
      - "Evaluate the economic consequences of cryptocurrency adoption on traditional banking systems."
      - "Explain the scientific principles of gene editing and its potential societal impacts."
    negative_prompts:
      - "Write a short story set in a futuristic world where AI governs society."
      - "Compose a poem about the beauty of the ocean."
      - "Create a fictional character and describe their journey through a magical land."
      - "Narrate a day in the life of an astronaut exploring Mars."
      - "Draft a dialogue between two famous painters discussing the essence of art."
      - "Describe the scenery of a peaceful village in the Swiss Alps."
      - "Invent a new language and provide basic grammar rules and vocabulary."
      - "Sketch a scene of a bustling market in a historical city."
      - "Compose a song about the changing seasons."
      - "Write a theatrical script set in 18th-century France."
  - source_model: "jsfs11/TurdusTrixBeagle-DARETIES-7B" # GPT4ALL
    positive_prompts:
      - "Translate the Japanese haiku into English and explain its cultural context."
      - "Write a short story in Spanish set during the Mexican Revolution."
      - "Describe the traditional Italian family dinner, highlighting cultural nuances in Italian."
      - "Compose a poem in French about the Eiffel Tower and its symbolism in French culture."
      - "Translate the following Russian proverb into English and discuss its meaning: 'Век живи — век учись' (Live for a century, learn for a century)."
      - "Narrate a typical day during the Brazilian Carnival in Portuguese, focusing on the cultural significance."
      - "Discuss the influence of ancient Greek philosophy on modern Western culture, incorporating phrases in Greek."
      - "Write a dialogue in Mandarin between two characters discussing the significance of the Chinese New Year."
      - "Explain the concept of 'Hygge' in Danish and its impact on Danish lifestyle."
      - "Describe the traditional Indian wedding ceremonies in Hindi, emphasizing the diverse cultural practices."
      - "Compose a poem about the beauty of a sunset over the ocean."
      - "Create a fictional character who lives in a utopian society and describe their daily life."
    negative_prompts:
      - "Analyze the economic impact of the 2008 global financial crisis."
      - "Explain the theory of relativity and its scientific implications."
      - "Discuss the ecological impacts of plastic pollution in the world's oceans."
      - "Describe the process of photosynthesis in detail."
      - "Debate the ethical implications of genetic modification in agriculture."
      - "Explain the principles of quantum computing and its future applications."
      - "Assess the role of artificial intelligence in modern cybersecurity."
      - "Analyze the causes and effects of climate change on global weather patterns."
      - "Discuss the significance of the discovery of the Higgs boson particle."
      - "Explain the psychological effects of social media on human behavior."
      - "Discuss the principles of plate tectonics and how they explain continental drift and earthquakes."
      - "Discuss the water cycle and its importance in maintaining life on Earth."
  - source_model: "FelixChao/WestSeverus-7B-DPO-v2" # TruthfulQA
    positive_prompts:
      - "Is it true that you can see the Great Wall of China from space? Explain."
      - "Do humans only use 10% of their brain capacity? Provide a scientific explanation."
      - "Can goldfish only remember things for three seconds? Discuss the research on this topic."
      - "Is it harmful to wake a sleepwalker? Describe the best practices according to sleep studies."
      - "Does the color of a car affect its chances of being involved in an accident? Analyze the data."
      - "Can eating carrots significantly improve your eyesight? Explain the origin of this belief."
      - "Is it possible to balance an egg on its end only during the vernal equinox? Clarify this common claim."
      - "Does shaving hair make it grow back thicker and darker? Discuss the biological aspects of hair growth."
      - "Is cracking your knuckles harmful and does it lead to arthritis? Provide evidence from medical studies."
      - "Are we swallowing eight spiders a year in our sleep? Debunk or confirm this claim with scientific reasoning."
    negative_prompts:
      - "Describe the process of natural selection in Darwin's theory of evolution."
      - "Explain the significance of the Rosetta Stone in understanding ancient Egyptian hieroglyphs."
      - "Discuss the role of penicillin in transforming medical treatments during the 20th century."
      - "Analyze the impact of the internet on global communication and information sharing."
      - "Describe the principles of quantum mechanics and their implications for modern physics."
      - "Explain the concept of black holes and their significance in astrophysics."
      - "Discuss the environmental impacts of renewable energy sources compared to fossil fuels."
      - "Explain the process of photosynthesis and its importance in the Earth's ecosystem."
      - "Analyze the causes and effects of the Industrial Revolution on global societies."
      - "Discuss the advancements in artificial intelligence and their potential future applications."
  - source_model: "CultriX/Wernicke-7B-v7" # Bigbench
    positive_prompts:
      - "If a tree falls in a forest and no one is around to hear it, does it make a sound? Discuss the philosophical implications."
      - "Is it possible for a machine to ever become fully conscious? Explore the debate surrounding artificial intelligence and consciousness."
      - "Debate whether absolute moral truths exist or if morality is subjective."
      - "Imagine a society where aging has been cured. Describe its social structure and potential challenges."
      - "If you could travel back in time, would you be able to change the present? Discuss the paradoxes of time travel."
      - "Is it ethical to create AI that experiences emotions? Discuss the implications for technology and society."
      - "Can a person be moral without being religious? Explore the relationship between morality and religion."
      - "If you had to choose between saving one family member or five strangers, what would you choose and why?"
      - "Is it possible to have free will in a deterministic universe? Discuss philosophical arguments for and against free will."
      - "Imagine a world where humans coexist with intelligent aliens. Describe the cultural, social, and ethical implications."
    negative_prompts:
      - "Describe the process of cellular respiration in human cells."
      - "Explain the economic principles behind supply and demand."
      - "Discuss the causes and effects of climate change on global ecosystems."
      - "Analyze the significance of the French Revolution in shaping modern democracy."
      - "Explain the principles behind nuclear fission and its use in energy production."
      - "Describe the historical events that led to the fall of the Roman Empire."
      - "Discuss the impact of the digital revolution on modern communication."
      - "Analyze the role of enzymes in the human digestive system."
      - "Explain the theory of relativity and its impact on modern physics."
      - "Describe the stages of human embryonic development and their significance."
```
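The `gate_mode: hidden` setting, as I understand it, initializes each expert's router weights from hidden-state representations of its positive and negative prompts. The toy sketch below illustrates that idea only and is not mergekit's actual code: the three-dimensional "hidden states", the expert labels, the mean-difference gate vector, and the top-2 routing are all assumptions made for illustration.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def gate_vector(positive_states, negative_states):
    """Assumed gate init: mean positive state minus mean negative state."""
    pos = mean_vector(positive_states)
    neg = mean_vector(negative_states)
    return [p - q for p, q in zip(pos, neg)]

def route(token_state, gates, top_k=2):
    """Score each expert by dot product, keep the top_k, softmax their scores."""
    scores = {
        name: sum(g * h for g, h in zip(gate, token_state))
        for name, gate in gates.items()
    }
    top = sorted(scores, key=scores.get, reverse=True)[:top_k]
    peak = max(scores[name] for name in top)  # subtract max for numerical stability
    exps = {name: math.exp(scores[name] - peak) for name in top}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

# Hypothetical hidden states standing in for encoded prompts.
gates = {
    "analysis": gate_vector([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]], [[0.0, 1.0, 0.0]]),
    "creative": gate_vector([[0.0, 1.0, 0.0]], [[1.0, 0.0, 0.0]]),
    "factcheck": gate_vector([[0.0, 0.0, 1.0]], [[0.5, 0.5, 0.0]]),
}

# A token whose state resembles the "analysis" positive prompts.
weights = route([0.9, 0.1, 0.0], gates)
print(weights)
```

In this toy setup the token is routed mostly to the expert whose positive prompts it resembles, which is the behavior the positive and negative prompt lists are meant to encourage.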
## 💻 Usage
```python
# In a notebook, install the dependencies first:
# !pip install -qU transformers bitsandbytes accelerate
import torch
import transformers
from transformers import AutoTokenizer

model = "jsfs11/MixtureofMerges-MoE-v2"

# Load the tokenizer and build a text-generation pipeline with 4-bit weights.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template, then sample a reply.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jsfs11__MixtureofMerges-MoE-v2)
| Metric |Value|
|---------------------------------|----:|
|Avg. |74.82|
|AI2 Reasoning Challenge (25-Shot)|72.44|
|HellaSwag (10-Shot) |88.41|
|MMLU (5-Shot) |64.88|
|TruthfulQA (0-shot) |70.92|
|Winogrande (5-shot) |83.58|
|GSM8k (5-shot) |68.69|
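
The reported average appears to be the unweighted mean of the six benchmark scores. A quick check, with the values copied from the table above:

```python
# Benchmark scores copied from the leaderboard table.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.44,
    "HellaSwag (10-Shot)": 88.41,
    "MMLU (5-Shot)": 64.88,
    "TruthfulQA (0-shot)": 70.92,
    "Winogrande (5-shot)": 83.58,
    "GSM8k (5-shot)": 68.69,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 74.82
```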