Since the release of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), the model has performed strongly on a wide range of benchmarks. But most of these benchmarks are evaluated on short context, and not much has been investigated on its performance on long-context tasks.
We then evaluated `Mistral-7B-Instruct-v0.1` against benchmarks that are specifically designed to assess the capabilities of LLMs in handling longer context.
Although the model's performance was fairly competitive on long context of fewer than 4096 tokens, there were some limitations in its performance on longer context. Motivated by improving its performance on longer context, we fine-tuned the Mistral 7B model and produced `MistralLite`. The model significantly boosts long-context handling over Mistral-7B-Instruct-v0.1. The detailed long-context evaluation results are as below:

### [Topic Retrieval](https://lmsys.org/blog/2023-06-29-longchat/) ###

|Model Name|Input length|Input length|Input length|Input length|Input length|
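
As a rough illustration of what a topic-retrieval probe looks like, the sketch below builds a long prompt from many short topic snippets and then asks a retrieval question, measuring input length with a crude whitespace token count. All names here are illustrative assumptions; the actual benchmark uses the LongChat harness and its own data, and a real run would measure length with the model's tokenizer.

```python
# Sketch of a topic-retrieval style long-context probe (illustrative only;
# the real benchmark linked above uses its own evaluation harness).

def build_topic_retrieval_prompt(topics, question):
    """Concatenate numbered topic snippets, then append a retrieval question."""
    body = "\n\n".join(
        f"Topic {i + 1}: {text}" for i, text in enumerate(topics)
    )
    return f"{body}\n\n{question}"

def approx_token_count(text):
    # Crude whitespace proxy for input length; a real evaluation would use
    # the model's tokenizer to hit exact budgets such as 4096 tokens.
    return len(text.split())

topics = [f"notes about subject {i}" for i in range(100)]
prompt = build_topic_retrieval_prompt(topics, "What was the first topic?")
print(approx_token_count(prompt) > 100)  # True
```

The retrieval question forces the model to attend to material near the start of the prompt, which is exactly where models with weak long-context handling tend to fail as the input grows.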