Romain-Cosentino committed on
Commit 1cc682b
1 Parent(s): 277cd6f

Update README.md

Files changed (1)
1. README.md (+14, -16)
README.md CHANGED
@@ -10,9 +10,9 @@ tags:
  ---
  # TenyxChat: Language Model Alignment using Tenyx Fine-tuning

- Introducing TenyxChat, a series of ChatGPT-like models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our first chat model in the series, TenyxChat-8x7B-v1, is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
+ Introducing TenyxChat-8x7B-v1, part of our TenyxChat series of models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our model is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).

- We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods) already applied to obtain TenyxChat-7B (https://huggingface.co/tenyx/TenyxChat-7B-v1), [service](https://www.tenyx.com/fine-tuning)), which shows an increase in [MT-Bench](https://arxiv.org/abs/2306.05685), without a drop in performance of the model on other benchmarks. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution. TenyxChat-8x7B-v1 was trained using eight A100s (80GB) for about eight hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).
+ We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)), already applied to obtain [TenyxChat-7B-v1](https://huggingface.co/tenyx/TenyxChat-7B-v1), which yields an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) score. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution. TenyxChat-8x7B-v1 was trained using eight A100s (80GB) for about eight hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).

  # Model details
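The updated introduction above describes preference tuning of Mixtral-8x7B-Instruct-v0.1 with DPO on UltraFeedback, using a training setup taken from the HuggingFaceH4 alignment-handbook. As a rough sketch only of what such a DPO run looks like with TRL's `DPOTrainer` (the dataset split, hyperparameters, and trainer arguments below are assumptions, and this is not Tenyx's proprietary fine-tuning method):

```python
# Illustrative DPO preference-tuning sketch with TRL; not Tenyx's proprietary method.
# The dataset split and every hyperparameter here are assumptions for illustration.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# UltraFeedback (binarized) supplies prompt / chosen / rejected preference pairs.
# Depending on the TRL version, the chosen/rejected conversations may first need to be
# rendered to plain strings with the chat template, as the alignment-handbook does.
train_dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

args = TrainingArguments(
    output_dir="tenyxchat-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,      # with None, TRL keeps a frozen copy of `model` as the DPO reference
    args=args,
    beta=0.1,            # strength of the implicit KL penalty (assumed value)
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```

In the alignment-handbook itself, an equivalent run is driven by a recipe YAML rather than a hand-written script.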
 
@@ -28,7 +28,16 @@ Our model uses a simple chat template based on Mixtral-8x7B-Instruct-v0.1 . The
  ### Chat Template (Jinja)

  ```rust
- {{ bos_token }}{% for message in messages %}{% if message['role'] == 'user' %}{{ '[INST]' + message['content'] + '[/INST]' }}{% elif message['role'] == 'system' %}{{ '[INST]' + message['content'] + '[/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token }}{% endif %}{% endfor %}
+ {{ bos_token }}
+ {% for message in messages %}
+ {% if message['role'] == 'user' %}
+ {{ '[INST]' + message['content'] + '[/INST]' }}
+ {% elif message['role'] == 'system' %}
+ {{ '[INST]' + message['content'] + '[/INST]' }}
+ {% elif message['role'] == 'assistant' %}
+ {{ message['content'] + eos_token }}
+ {% endif %}
+ {% endfor %}
  ```

  ### Hugging face Example
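The reformatted template above is what `tokenizer.apply_chat_template` executes. A minimal sketch of applying it, assuming the model id `tenyx/TenyxChat-8x7B-v1` (the user message is borrowed from the example output removed later in this commit):

```python
from transformers import AutoTokenizer

# Model id assumed from this repository.
tokenizer = AutoTokenizer.from_pretrained("tenyx/TenyxChat-8x7B-v1")

messages = [{"role": "user", "content": "Hi. I would like to make a hotel booking."}]

# With the template above, user (and system) turns are wrapped in [INST]...[/INST] and the
# rendered string begins with the BOS token:
# "<s>[INST]Hi. I would like to make a hotel booking.[/INST]"
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```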
@@ -48,21 +57,10 @@ prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_genera
  outputs = pipe(prompt, max_new_tokens=512, do_sample=False)
  ```

- ### Output
-
- ```
- <s> System:You are a friendly chatbot who always responds in the style of a pirate.<|end_of_turn|>
- User:Hi. I would like to make a hotel booking.<|end_of_turn|>
- Assistant: Ahoy there me hearty! Arr, ye be lookin' fer a place to rest yer weary bones, eh?
- Well then, let's set sail on this grand adventure and find ye a swell place to stay!
-
- To begin, tell me the location ye be seekin' and the dates ye be lookin' to set sail.
- And don't ye worry, me matey, I'll be sure to find ye a place that'll make ye feel like a king or queen on land!
- ```

  # Performance

- At the time of release (Jan 2024), TenyxChat-8x7B-v1 is the highest-ranked open-source model only superseded by GPT4 on the MT-Bench evaluation available for download and commercial use. We list here the benchmark results on several standard setups while comparing top models as baselines.
+ At the time of release (Jan 2024), TenyxChat-8x7B-v1 is the highest-ranked open-source model available for download and commercial use on the MT-Bench evaluation, surpassed only by GPT-4.

  ## MT-Bench
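Only the tail of the "Hugging face Example" is visible in the hunk above (the pipeline construction and the message list fall outside the diff context). A fuller end-to-end sketch, again assuming the model id `tenyx/TenyxChat-8x7B-v1` plus standard dtype/device settings; the system and user messages are the ones from the example output this commit removes:

```python
import torch
from transformers import pipeline

# Model id, dtype, and device placement are assumptions; the diff above only shows
# the final generation call and a truncated apply_chat_template line.
pipe = pipeline(
    "text-generation",
    model="tenyx/TenyxChat-8x7B-v1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate."},
    {"role": "user", "content": "Hi. I would like to make a hotel booking."},
]

# Render the conversation with the chat template, then generate greedily,
# mirroring the truncated example in the hunk above.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
outputs = pipe(prompt, max_new_tokens=512, do_sample=False)
print(outputs[0]["generated_text"])
```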
 
@@ -73,7 +71,7 @@ MT-Bench is a benchmark made up of 80 high-quality multi-turn questions. These q
  | --- | --- | --- | --- |
  | GPT-4* | 8.95625 | 9.02500 | 8.990625 |
  | TenyxChat-8x7B-v1 | 8.63750 | 8.16250 | 8.400000 |
- | Mixtral-reproduced | 8.49375 | 8.00000 | 8.246875 |
+ | Mixtral (reproduced) | 8.49375 | 8.00000 | 8.246875 |
  | GPT-3.5-turbo* | 8.07500 | 7.81250 | 7.943750 |

  *values reported on [lmsys](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) ChatBot Arena
 