sarath-shekkizhar commited on
Commit
ee09ed9
1 Parent(s): 6166b71

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -13,7 +13,7 @@ tags:
13
  Introducing TenyxChat-8x7B-v1, part of our TenyxChat series trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our model is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
14
 
15
  We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)),
16
- similar to that of our [7B model](already applied to obtain TenyxChat-7B-v1 (https://huggingface.co/tenyx/TenyxChat-7B-v1), and show an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) scores.
17
  Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution.
18
  TenyxChat-8x7B-v1 was trained using eight A100s (80GB) for about eight hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).
19
 
 
13
  Introducing TenyxChat-8x7B-v1, part of our TenyxChat series trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our model is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
14
 
15
  We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)),
16
+ similar to that of our [7B model](https://huggingface.co/tenyx/TenyxChat-7B-v1), and show an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) scores.
17
  Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution.
18
  TenyxChat-8x7B-v1 was trained using eight A100s (80GB) for about eight hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).
19