---
base_model:
- LeroyDyer/Mixtral_AI_Cyber_Orca
- LeroyDyer/Mixtral_AI_Cyber_4.0
- LeroyDyer/Mixtral_AI_Cyber_4.0_m1
- LeroyDyer/Mixtral_AI_Cyber_Dolphin
- LeroyDyer/Mixtral_AI_Cyber_4_m1_SFT
- LeroyDyer/Mixtral_AI_Cyber_3.m2
library_name: transformers
license: apache-2.0
language:
- en
datasets:
- cognitivecomputations/dolphin
- Open-Orca/OpenOrca
metrics:
- accuracy
- code_eval
- bertscore
- bleu
- bleurt
- brier_score
tags:
- legal
- medical
---

## LeroyDyer/Mixtral_AI_Cyber 5_7b

<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz

GOOD ONE!

Merging these models consolidates the internal predictive behaviour of the network. Each model has undergone different fine-tuning and weight adjustments, so keeping a consistent size across models is essential. Although all of them use the Mistral transformer network as the base, the merged models (Commercial Orca, Dolphin, Nous, Starling, etc.) may exhibit contamination, meaning that some questions may already be present in their training data, and they may carry biases towards their creators' personal psychometric understanding of the world. Fine-tuning aims to adapt the LLM to new types of questions or tasks, but misalignment during this process can result in erroneous text outputs.
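
To illustrate why consistent size across models matters, here is a minimal sketch of a linear (weighted-average) merge. It is not necessarily the method used to build this model; the merge recipe is not stated in this card. The `linear_merge` function and the `StateDict` type are hypothetical names, and plain Python lists stand in for real weight tensors.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def linear_merge(models: Sequence[StateDict], weights: Sequence[float]) -> StateDict:
    """Return the weighted average of the parameters of several models.

    Every model must expose the same parameter names with the same sizes,
    otherwise element-wise averaging is undefined and a ValueError is raised.
    """
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]  # normalise so the weights sum to 1

    reference = models[0]
    for m in models[1:]:
        if set(m) != set(reference):
            raise ValueError("models do not share the same parameter names")

    merged: StateDict = {}
    for name, ref_values in reference.items():
        for m in models[1:]:
            if len(m[name]) != len(ref_values):
                raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise weighted sum across all models.
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(norm, models))
            for i in range(len(ref_values))
        ]
    return merged


if __name__ == "__main__":
    model_a = {"layer.weight": [1.0, 2.0]}
    model_b = {"layer.weight": [3.0, 4.0]}
    print(linear_merge([model_a, model_b], [1.0, 1.0]))
```

If one fine-tune had, for example, a resized embedding table, the size check would fail, which is the practical reason the merged models need matching shapes.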

Future tuning will be tailored to specific tasks, using the merged common models as a base. Observations on the stability and performance of other models are welcome for further refinement.

This expert is a companion to the MEGA_MIND 24b CyberSeries, which integrates a diverse array of expert models into a unified framework. At its core lies Mistral-7B-Instruct-v0.2, an instruction-tuned model designed for versatility and efficiency.

With its expanded context window, Mistral-7B-Instruct-v0.2 serves as the foundation on which the CyberSeries applies Mixture of Experts routing to integrate specialized sub-models. This architecture is intended to let the CyberSeries handle a wide range of tasks with speed and accuracy.

Among its sub-models, OpenOrca - Mistral-7B-8k stands out for its fine-tuning, with reported top-ranking performance in its class. Hermes 2 Pro adds capabilities such as Function Calling and JSON Mode, catering to diverse application needs.

Starling-LM-7B-beta, trained with Reinforcement Learning from AI Feedback, contributes adaptability and optimization, while the Phi-1.5 Transformer model covers domains ranging from common sense reasoning to medical inference.

With models like BioMistral tailored specifically for medical applications and Nous-Yarn-Mistral-7b-128k handling long-context data, the MEGA_MIND 24b CyberSeries aims to be a transformative force in language understanding and artificial intelligence.

Experience the future of language models with the MEGA_MIND 24b CyberSeries, where innovation meets performance.