---
base_model:
- LeroyDyer/Mixtral_AI_Cyber_Orca
- LeroyDyer/Mixtral_AI_Cyber_4.0
- LeroyDyer/Mixtral_AI_Cyber_4.0_m1
- LeroyDyer/Mixtral_AI_Cyber_Dolphin
- LeroyDyer/Mixtral_AI_Cyber_4_m1_SFT
- LeroyDyer/Mixtral_AI_Cyber_3.m2
library_name: transformers
license: apache-2.0
language:
- en
datasets:
- cognitivecomputations/dolphin
- Open-Orca/OpenOrca
metrics:
- accuracy
- code_eval
- bertscore
- bleu
- bleurt
- brier_score
tags:
- legal
- medical
- not-for-all-audiences
---

GOOD ONE!

This summary describes the latest large language model (LLM), a merge of pre-trained language models created using MergeKit.

Merging these models is crucial for consolidating the internal predictive nature of the network. Each model has undergone different fine-tuning and weight adjustments, so keeping the size consistent across models is essential. Although all of them use the Mistral transformer network as the base, the merged models (Commercial Orca, Dolphin, Nous, Starling, etc.) may exhibit contamination: some questions may already be present in the training data, and the models may be biased towards the creator's personal psychometric understanding of the world. Fine-tuning aims to adapt the LLM to new types of questions or tasks, but misalignment during this process can result in erroneous text outputs.

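To illustrate why consistent size matters when merging, here is a minimal sketch of an element-wise weighted average over toy parameter tables. It assumes a plain linear merge, which is only one of several possible strategies and is not confirmed as the method used for this model. The function name `linear_merge` and the toy `StateDict` type are hypothetical and are not part of MergeKit.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of floats.
# (Hypothetical type for illustration; real checkpoints hold tensors.)
StateDict = Dict[str, List[float]]


def linear_merge(models: List[StateDict], weights: List[float]) -> StateDict:
    """Weighted element-wise average of parameters.

    Assumption: a simple linear merge, shown only to illustrate the idea.
    """
    if not models:
        raise ValueError("need at least one model")
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    reference = models[0]
    for model in models[1:]:
        # All models must expose the same set of parameters.
        if set(model) != set(reference):
            raise ValueError("models do not share the same parameter names")

    merged: StateDict = {}
    for name, ref_values in reference.items():
        acc = [0.0] * len(ref_values)
        for model, weight in zip(models, weights):
            values = model[name]
            # Element-wise averaging is undefined if the sizes differ.
            if len(values) != len(ref_values):
                raise ValueError(f"parameter {name!r} has a different size")
            for i, value in enumerate(values):
                acc[i] += (weight / total) * value
        merged[name] = acc
    return merged


# Usage: two toy "models" with one parameter each, merged with equal weight.
model_a = {"layer.weight": [1.0, 3.0]}
model_b = {"layer.weight": [3.0, 5.0]}
merged = linear_merge([model_a, model_b], [1.0, 1.0])
```

If one of the models had a parameter of a different length, the function would raise a `ValueError` rather than produce a merged result, which is the practical reason the merged models need a matching architecture and size.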

Future tuning will be tailored to specific tasks, leveraging the merged common models as a base. Observations on stability and performance of other models are welcomed for further refinement.