Commit bbefc3d by InferenceIllusionist (parent: 813befd): Update README.md

Presenting a Model Stock experiment combining the unique strengths from the following models:
* Typhon-Mixtral-v1 / [Sao10K](https://huggingface.co/Sao10K) / Creative & Story Completion
* Open_Gpt4_8x7B_v0.2 / [rombodawg](https://huggingface.co/rombodawg) / Conversational
# Recommended Template
* Basic: Alpaca Format
* Advanced: See context/instruct/sampler settings in [our new Recommended Settings repo](https://huggingface.co/Quant-Cartel/Recommended-Settings/tree/main/Teto-MS-8x7b).
* Huge shout-out to [rAIfle](https://huggingface.co/rAIfle) for his original work on the Wizard 8x22b templates, which were modified for this model.
<H2>Methodology</H2>
> [I]nnovative layer-wise weight averaging technique surpasses state-of-the-art model methods such as Model Soup, utilizing only two fine-tuned models. This strategy can be aptly coined Model Stock, highlighting its reliance on selecting a minimal number of models to draw a more optimized-averaged model.
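The quoted technique can be sketched per layer: measure the angle between the two fine-tuning deltas relative to the pretrained anchor, then interpolate between the averaged fine-tuned weights and the anchor using the ratio t = 2·cosθ / (1 + cosθ) from the Model Stock paper. This is a minimal illustrative sketch, not the tooling used for this merge; the function name and NumPy formulation are assumptions.

```python
import numpy as np

def model_stock_layer(w0, w1, w2):
    """Merge one layer's weights from two fine-tuned models (w1, w2),
    anchored by the pretrained weights w0 (Model Stock rule).

    theta is the angle between the deltas (w1 - w0) and (w2 - w0);
    t = 2*cos(theta) / (1 + cos(theta)) sets how far the merged layer
    moves from the anchor toward the fine-tuned average.
    """
    d1, d2 = w1 - w0, w2 - w0
    # Cosine of the angle between the two fine-tuning deltas
    # (small epsilon guards against zero-norm deltas).
    cos = np.dot(d1.ravel(), d2.ravel()) / (
        np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-12
    )
    t = 2 * cos / (1 + cos)
    w_avg = (w1 + w2) / 2
    return t * w_avg + (1 - t) * w0
```

When the two fine-tunes agree (cosθ → 1), t → 1 and the merge is just their average; when their deltas are orthogonal (cosθ → 0), t → 0 and the layer stays at the pretrained anchor.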