sarath-shekkizhar committed e75ac79 (1 parent: b63df6b): Update README.md

README.md CHANGED
@@ -22,7 +22,7 @@ Our approach aims to mitigate forgetting in LLMs in a computationally efficient
 thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution.
 Llama-3-TenyxChat-70B was trained using eight A100s (80GB) for fifteen hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).

-*The MT-Bench evaluation we perform follows the latest eval upgrade as PR'd [here](https://github.com/lm-sys/FastChat/pull/3158). This PR upgrades the evaluation from `GPT-4-0613` to `GPT-4-preview-0125` (latest version) as well as corrects and improves the quality of the reference answers for a subset of questions. These changes are required to correct the erroneous rating during
+*The MT-Bench evaluation we perform follows the latest eval upgrade as PR'd [here](https://github.com/lm-sys/FastChat/pull/3158). This PR upgrades the evaluation from `GPT-4-0613` to `GPT-4-preview-0125` (latest version) as well as corrects and improves the quality of the reference answers for a subset of questions. These changes are required to correct the erroneous rating during previous evaluation.


 **Model Developers** [Tenyx Research](https://www.tenyx.com/research)
|
@@ -30,7 +30,7 @@ Llama-3-TenyxChat-70B was trained using eight A100s (80GB) for fifteen hours, wi

 # Model details

-- Model type: Fine-tuned
+- Model type: Fine-tuned 70B Instruct model for chat.
 - License: Meta Llama 3 Community License
 - Base model: [Llama3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)
 - Demo: Coming Soon!
@@ -63,9 +63,9 @@ At the time of release (April 2024), Llama3-TenyxChat-70B is the highest-ranked

 ## MT-Bench

-MT-Bench is a benchmark made up of 80 high-quality multi-turn questions. These questions fall into eight categories: Writing, Roleplay, Reasoning, Math, Coding, Extraction, STEM, and Humanities. The chat models are rated using GPT-4 on a scale of 1 to 10, with higher values corresponding to better responses.
+MT-Bench is a benchmark made up of 80 high-quality multi-turn questions. These questions fall into eight categories: Writing, Roleplay, Reasoning, Math, Coding, Extraction, STEM, and Humanities. The chat models are rated using `GPT-4-preview-0125` on a scale of 1 to 10, with higher values corresponding to better responses.

-| Model-name | GPT4-0125
+| Model-name | GPT4-preview-0125 MT Bench | Chat Arena Elo |
 |--------------------------------|----------------------------|----------------|
 | GPT-4-1106 | 8.79 | 1251 |
 | Claude 3 Opus (20240229) | 8.57 | 1247 |
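As background on the MT-Bench scoring referenced in the updated lines: each of the 80 questions has two turns, the judge model rates every turn answer on a 1-to-10 scale, and the reported score is the mean over all rated turns (optionally broken down per category). Below is a minimal Python sketch of that aggregation, assuming an illustrative `ratings` mapping rather than FastChat's actual judgment output format.

```python
# Sketch of MT-Bench-style score aggregation: per-turn judge ratings on a
# 1-to-10 scale averaged into per-category and overall scores.
# The `ratings` structure is assumed for illustration; it is not FastChat's format.
from collections import defaultdict
from statistics import mean

# (category, question_id, turn) -> rating assigned by the judge model
ratings = {
    ("writing", 81, 1): 9.0,
    ("writing", 81, 2): 8.0,
    ("math", 111, 1): 6.0,
    ("math", 111, 2): 5.0,
}

by_category = defaultdict(list)
for (category, _qid, _turn), score in ratings.items():
    by_category[category].append(score)

category_scores = {cat: mean(scores) for cat, scores in by_category.items()}
overall = mean(ratings.values())  # overall score: mean over every rated turn

print(category_scores)       # e.g. {'writing': 8.5, 'math': 5.5}
print(round(overall, 2))     # e.g. 7.0
```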
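The updated model details describe a fine-tuned 70B Instruct model for chat. A minimal usage sketch with the Hugging Face `transformers` chat pipeline follows; the repository id `tenyx/Llama3-TenyxChat-70B`, the system prompt, and the generation settings are assumptions for illustration, so check the model page for the actual id and recommended usage.

```python
# Minimal usage sketch. The repo id below is an assumption; substitute the
# actual Hugging Face repository for Llama-3-TenyxChat-70B.
import torch
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="tenyx/Llama3-TenyxChat-70B",  # assumed repo id, verify on the Hub
    torch_dtype=torch.bfloat16,
    device_map="auto",  # a 70B model needs multiple GPUs or CPU offloading
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what MT-Bench measures in two sentences."},
]

# Chat-formatted messages are run through the model's chat template before generation.
out = chat(messages, max_new_tokens=256, do_sample=False)
print(out[0]["generated_text"][-1]["content"])
```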