Update README.md
README.md
CHANGED
@@ -17,4 +17,11 @@ language:
 This summary describes the latest large language model (LLM), a merge of pre-trained language models created with MergeKit.
-Merging these models is crucial for consolidating the internal predictive nature of the network. Each model undergoes different fine-tuning and weight adjustments, so maintaining a consistent size across models is essential. Despite using the Mistral transformer network as the base,
+Merging these models is crucial for consolidating the internal predictive nature of the network. Each model undergoes different fine-tuning and weight adjustments, so maintaining a consistent size across models is essential. Despite using the Mistral transformer network as the base,
+it's worth noting that the merged models (Commercial Orca, Dolphin, Nous, Starling, etc.) may exhibit contamination,
+which can leave some questions already present in the dataset and introduce potential biases toward the creator's personal psychometric understanding of the world.
+Fine-tuning aims to adapt the LLM to new types of questions or tasks, but misalignment during this process can result in erroneous text outputs.
+
+Future tuning will be tailored to specific tasks, using the merged common models as a base. Observations on the stability and performance of other models are welcome for further refinement.
+
+
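The README's point that the merged models must be the same size can be illustrated with a minimal sketch. It assumes a simple weighted linear average over parameters, which is only one possible merge strategy. It is not MergeKit's API, and the source does not say which method was used for this model. The names `StateDict` and `linear_merge` are hypothetical, and checkpoints are represented as plain dictionaries of flat weight lists to keep the example self-contained.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def linear_merge(models: List[StateDict], weights: List[float]) -> StateDict:
    """Weighted average of parameters (an assumed, simplified merge rule).

    Every model must expose the same parameter names with the same sizes,
    otherwise element-wise averaging is not defined.
    """
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    reference = models[0]
    for m in models[1:]:
        # Same architecture means the same set of parameter names.
        if set(m) != set(reference):
            raise ValueError("models do not share the same parameter names")

    merged: StateDict = {}
    for name, ref_values in reference.items():
        for m in models:
            # Same size per parameter, so values line up index by index.
            if len(m[name]) != len(ref_values):
                raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            sum(w * m[name][i] for m, w in zip(models, weights)) / total
            for i in range(len(ref_values))
        ]
    return merged


# Two toy "models" with one parameter each, merged with equal weights.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 4.0]}
merged = linear_merge([model_a, model_b], [0.5, 0.5])
```

If one of the toy models had a different number of values for `layer.weight`, `linear_merge` would raise a `ValueError` instead of producing a merged result. That is the practical reason consistent size matters when merging.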