Romain-Cosentino committed on
Commit 1cc682b
1 Parent(s): 277cd6f

Update README.md

Files changed (1)
1. README.md (+14, -16)
README.md CHANGED
@@ -10,9 +10,9 @@ tags:
  ---
  # TenyxChat: Language Model Alignment using Tenyx Fine-tuning

- Introducing TenyxChat, a series of ChatGPT-like models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our first chat model in the series, TenyxChat-8x7B-v1, is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
+ Introducing TenyxChat-8x7B-v1, part of our TenyxChat series of models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our model is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).

- We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods) already applied to obtain TenyxChat-7B (https://huggingface.co/tenyx/TenyxChat-7B-v1), [service](https://www.tenyx.com/fine-tuning)), which shows an increase in [MT-Bench](https://arxiv.org/abs/2306.05685), without a drop in performance of the model on other benchmarks. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution. TenyxChat-8x7B-v1 was trained using eight A100s (80GB) for about eight hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).
+ We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)), already applied to obtain [TenyxChat-7B-v1](https://huggingface.co/tenyx/TenyxChat-7B-v1), which yields an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) score. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution. TenyxChat-8x7B-v1 was trained using eight A100s (80GB) for about eight hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).

  # Model details
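The updated introduction above describes preference tuning of Mixtral-8x7B-Instruct-v0.1 with DPO on UltraFeedback, using a training setup taken from the HuggingFaceH4 alignment-handbook. As a rough sketch only of what such a DPO run looks like with TRL's `DPOTrainer` (the dataset split, hyperparameters, and trainer arguments below are assumptions, and this is not Tenyx's proprietary fine-tuning method):

```python
# Illustrative DPO preference-tuning sketch with TRL; not Tenyx's proprietary method.
# The dataset split and every hyperparameter here are assumptions for illustration.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# UltraFeedback (binarized) supplies prompt / chosen / rejected preference pairs.
# Depending on the TRL version, the chosen/rejected conversations may first need to be
# rendered to plain strings with the chat template, as the alignment-handbook does.
train_dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

args = TrainingArguments(
    output_dir="tenyxchat-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,      # with None, TRL keeps a frozen copy of `model` as the DPO reference
    args=args,
    beta=0.1,            # strength of the implicit KL penalty (assumed value)
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```

In the alignment-handbook itself, an equivalent run is driven by a recipe YAML rather than a hand-written script.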
 
@@ -28,7 +28,16 @@ Our model uses a simple chat template based on Mixtral-8x7B-Instruct-v0.1 . The
  ### Chat Template (Jinja)

  ```rust
- {{ bos_token }}{% for message in messages %}{% if message['role'] == 'user' %}{{ '[INST]' + message['content'] + '[/INST]' }}{% elif message['role'] == 'system' %}{{ '[INST]' + message['content'] + '[/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token }}{% endif %}{% endfor %}
+ {{ bos_token }}
+ {% for message in messages %}
+ {% if message['role'] == 'user' %}
+ {{ '[INST]' + message['content'] + '[/INST]' }}
+ {% elif message['role'] == 'system' %}
+ {{ '[INST]' + message['content'] + '[/INST]' }}
+ {% elif message['role'] == 'assistant' %}
+ {{ message['content'] + eos_token }}
+ {% endif %}
+ {% endfor %}
  ```

  ### Hugging face Example
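The reformatted template above is what `tokenizer.apply_chat_template` executes. A minimal sketch of applying it, assuming the model id `tenyx/TenyxChat-8x7B-v1` (the user message is borrowed from the example output removed later in this commit):

```python
from transformers import AutoTokenizer

# Model id assumed from this repository.
tokenizer = AutoTokenizer.from_pretrained("tenyx/TenyxChat-8x7B-v1")

messages = [{"role": "user", "content": "Hi. I would like to make a hotel booking."}]

# With the template above, user (and system) turns are wrapped in [INST]...[/INST] and the
# rendered string begins with the BOS token:
# "<s>[INST]Hi. I would like to make a hotel booking.[/INST]"
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```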
@@ -48,21 +57,10 @@ prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_genera
  outputs = pipe(prompt, max_new_tokens=512, do_sample=False)
  ```

- ### Output
-
- ```
- <s> System:You are a friendly chatbot who always responds in the style of a pirate.<|end_of_turn|>
- User:Hi. I would like to make a hotel booking.<|end_of_turn|>
- Assistant: Ahoy there me hearty! Arr, ye be lookin' fer a place to rest yer weary bones, eh?
- Well then, let's set sail on this grand adventure and find ye a swell place to stay!
-
- To begin, tell me the location ye be seekin' and the dates ye be lookin' to set sail.
- And don't ye worry, me matey, I'll be sure to find ye a place that'll make ye feel like a king or queen on land!
- ```

  # Performance

- At the time of release (Jan 2024), TenyxChat-8x7B-v1 is the highest-ranked open-source model only superseded by GPT4 on the MT-Bench evaluation available for download and commercial use. We list here the benchmark results on several standard setups while comparing top models as baselines.
+ At the time of release (Jan 2024), TenyxChat-8x7B-v1 is the highest-ranked open-source model available for download and commercial use on the MT-Bench evaluation, surpassed only by GPT-4.

  ## MT-Bench
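Only the tail of the "Hugging face Example" is visible in the hunk above (the pipeline construction and the message list fall outside the diff context). A fuller end-to-end sketch, again assuming the model id `tenyx/TenyxChat-8x7B-v1` plus standard dtype/device settings; the system and user messages are the ones from the example output this commit removes:

```python
import torch
from transformers import pipeline

# Model id, dtype, and device placement are assumptions; the diff above only shows
# the final generation call and a truncated apply_chat_template line.
pipe = pipeline(
    "text-generation",
    model="tenyx/TenyxChat-8x7B-v1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate."},
    {"role": "user", "content": "Hi. I would like to make a hotel booking."},
]

# Render the conversation with the chat template, then generate greedily,
# mirroring the truncated example in the hunk above.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
outputs = pipe(prompt, max_new_tokens=512, do_sample=False)
print(outputs[0]["generated_text"])
```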
 
@@ -73,7 +71,7 @@ MT-Bench is a benchmark made up of 80 high-quality multi-turn questions. These q
  | --- | --- | --- | --- |
  | GPT-4* | 8.95625 | 9.02500 | 8.990625 |
  | TenyxChat-8x7B-v1 | 8.63750 | 8.16250 | 8.400000 |
- | Mixtral-reproduced | 8.49375 | 8.00000 | 8.246875 |
+ | Mixtral (reproduced) | 8.49375 | 8.00000 | 8.246875 |
  | GPT-3.5-turbo* | 8.07500 | 7.81250 | 7.943750 |

  *values reported on [lmsys](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) ChatBot Arena
 