LeroyDyer committed
Commit efc54f2
1 Parent(s): fe69431

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -27,11 +27,12 @@ tags:
---


+ ## LeroyDyer/Mixtral_AI_Cyber 5_7b


GOOD ONE!

- This summary describes the latest language model (LLM), which is a merge of pre-trained language models using MergeKit.
+

Merging these models is crucial for consolidating the internal predictive nature of the network. Each model undergoes different fine-tuning and adjustments to its weights, so maintaining a consistent size across models is essential. Despite using the Mistral transformer network as the base,
it's worth noting that the merged models (Commercial Orca, Dolphin, Nous, Starling, etc.) may exhibit contamination,
@@ -41,7 +42,6 @@ Fine-tuning aims to adapt the LLM to new types of questions or tasks, but misali
Future tuning will be tailored to specific tasks, leveraging the merged common models as a base. Observations on the stability and performance of other models are welcome for further refinement.


- ## LeroyDyer/Mixtral_AI_Cyber 5_7b
This Expert is a companion to the MEGA_MIND 24b CyberSeries, which represents a groundbreaking leap in the realm of language models, integrating a diverse array of expert models into a unified framework. At its core lies the Mistral-7B-Instruct-v0.2, a refined instructional model designed for versatility and efficiency.

Enhanced with an expanded context window and advanced routing mechanisms, the Mistral-7B-Instruct-v0.2 exemplifies the power of Mixture of Experts, allowing seamless integration of specialized sub-models. This architecture facilitates unparalleled performance and scalability, enabling the CyberSeries to tackle a myriad of tasks with remarkable speed and accuracy.
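The card text in the diff above describes the model as a merge of pre-trained checkpoints produced with MergeKit. As a rough, non-authoritative sketch of what such a merge can look like (the donor model ID, layer ranges, and interpolation factor below are placeholders, not the actual recipe behind this repository):

```python
# Illustrative only: a minimal SLERP merge of two Mistral-7B-family checkpoints
# with MergeKit. Model IDs, layer ranges and the interpolation factor are
# placeholders, not the recipe used for this repository.
import subprocess
from pathlib import Path

merge_config = """\
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2      # assumed base checkpoint
        layer_range: [0, 32]
      - model: some-org/another-mistral-7b-finetune    # placeholder donor model
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t: 0.5            # 0.0 keeps the base weights, 1.0 keeps the donor weights
dtype: bfloat16
"""

Path("merge_config.yml").write_text(merge_config)

# MergeKit's command-line entry point takes the config and an output directory;
# extra flags vary between versions, so only the positional arguments are used here.
subprocess.run(["mergekit-yaml", "merge_config.yml", "./merged-model"], check=True)
```

The resulting directory can then be fine-tuned further or published, as the card describes for the merged common models.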
 
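Since the card presents the model as a Mistral-7B-Instruct-v0.2 derivative, loading it should follow the usual Hugging Face `transformers` pattern. A minimal sketch, assuming the repository ID matches the heading above (with an underscore in place of the space):

```python
# Minimal usage sketch with Hugging Face transformers. The repository ID is
# assumed from the heading above and may need adjusting.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "LeroyDyer/Mixtral_AI_Cyber_5_7b"  # assumed spelling of the repo name

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Mistral-Instruct-style prompting via the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain model merging in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```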