Commit bbefc3d by InferenceIllusionist (parent: 813befd): Update README.md

Presenting a Model Stock experiment combining the unique strengths from the following models:
* Typhon-Mixtral-v1 / [Sao10K](https://huggingface.co/Sao10K) / Creative & Story Completion
* Open_Gpt4_8x7B_v0.2 / [rombodawg](https://huggingface.co/rombodawg) / Conversational
# Recommended Template
* Basic: Alpaca Format
* Advanced: See context/instruct/sampler settings in [our new Recommended Settings repo](https://huggingface.co/Quant-Cartel/Recommended-Settings/tree/main/Teto-MS-8x7b).
* Huge shout-out to [rAIfle](https://huggingface.co/rAIfle) for his original work on the Wizard 8x22b templates, which were modified for this model.
<H2>Methodology</H2>
> [I]nnovative layer-wise weight averaging technique surpasses state-of-the-art model methods such as Model Soup, utilizing only two fine-tuned models. This strategy can be aptly coined Model Stock, highlighting its reliance on selecting a minimal number of models to draw a more optimized-averaged model.
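The quoted technique can be sketched per layer: measure the angle between the two fine-tuning deltas relative to the pretrained anchor, then interpolate between the averaged fine-tuned weights and the anchor using the ratio t = 2·cosθ / (1 + cosθ) from the Model Stock paper. This is a minimal illustrative sketch, not the tooling used for this merge; the function name and NumPy formulation are assumptions.

```python
import numpy as np

def model_stock_layer(w0, w1, w2):
    """Merge one layer's weights from two fine-tuned models (w1, w2),
    anchored by the pretrained weights w0 (Model Stock rule).

    theta is the angle between the deltas (w1 - w0) and (w2 - w0);
    t = 2*cos(theta) / (1 + cos(theta)) sets how far the merged layer
    moves from the anchor toward the fine-tuned average.
    """
    d1, d2 = w1 - w0, w2 - w0
    # Cosine of the angle between the two fine-tuning deltas
    # (small epsilon guards against zero-norm deltas).
    cos = np.dot(d1.ravel(), d2.ravel()) / (
        np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-12
    )
    t = 2 * cos / (1 + cos)
    w_avg = (w1 + w2) / 2
    return t * w_avg + (1 - t) * w0
```

When the two fine-tunes agree (cosθ → 1), t → 1 and the merge is just their average; when their deltas are orthogonal (cosθ → 0), t → 0 and the layer stays at the pretrained anchor.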