InferenceIllusionist committed
Commit bbefc3d
1 Parent(s): 813befd

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -18,6 +18,11 @@ Presenting a Model Stock experiment combining the unique strengths from the foll
  * Typhon-Mixtral-v1 / [Sao10K](https://huggingface.co/Sao10K) / Creative & Story Completion
  * Open_Gpt4_8x7B_v0.2 / [rombodawg](https://huggingface.co/rombodawg) / Conversational
 
+ # Recommended Template
+ * Basic: Alpaca Format
+ * Advanced: See context/instruct/sampler settings in [our new Recommended Settings repo](https://huggingface.co/Quant-Cartel/Recommended-Settings/tree/main/Teto-MS-8x7b).
+ * Huge shout out to [rAIfle](https://huggingface.co/rAIfle) for his original work on the Wizard 8x22b templates, which were modified for this model.
+
  <H2>Methodology</H2>
 
  > [I]nnovative layer-wise weight averaging technique surpasses state-of-the-art model methods such as Model Soup, utilizing only two fine-tuned models. This strategy can be aptly coined Model Stock, highlighting its reliance on selecting a minimal number of models to draw a more optimized-averaged model
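
To make the quoted method concrete: for two fine-tuned checkpoints and their shared pre-trained base, Model Stock derives a per-layer interpolation ratio from the angle between the two fine-tuning deltas, then pulls their pairwise average back toward the base. The PyTorch sketch below is illustrative only; the function and tensor names are ours, not the merge script actually used for this model.

```python
import torch

def model_stock_merge_layer(w_base: torch.Tensor,
                            w_ft1: torch.Tensor,
                            w_ft2: torch.Tensor) -> torch.Tensor:
    """Merge one layer from two fine-tuned models using the Model Stock
    two-model closed form: t = 2*cos(theta) / (1 + cos(theta)), where
    theta is the angle between the two fine-tuning deltas."""
    # Fine-tuning deltas relative to the shared pre-trained anchor.
    d1 = (w_ft1 - w_base).flatten()
    d2 = (w_ft2 - w_base).flatten()
    # Angle between the deltas decides how much of the average to keep.
    cos_theta = torch.dot(d1, d2) / (d1.norm() * d2.norm() + 1e-12)
    t = 2.0 * cos_theta / (1.0 + cos_theta)
    # Interpolate between the pairwise average and the pre-trained anchor.
    return t * (w_ft1 + w_ft2) / 2.0 + (1.0 - t) * w_base
```

Repeating this for every weight tensor yields the merged checkpoint; merge toolkits such as mergekit expose this recipe as a `model_stock` merge method.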
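
For the "Basic: Alpaca Format" recommendation above, the commonly used Alpaca prompt layout (shown here without the optional `### Input:` block; `{prompt}` is a placeholder) is:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```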