yinsong1986 committed
Commit 1cd6a22 • 1 Parent(s): 2ae0e2b
Update README.md

README.md CHANGED
@@ -15,7 +15,11 @@ MistralLite evolves from [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mis
 
 ## Motivation of Developing MistralLite
 
-Since the release of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), the model became increasingly popular because its strong performance
+Since the release of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), the model has become increasingly popular because of its strong performance
+on a wide range of benchmarks. However, most of those benchmarks are evaluated on `short context`, and little has been investigated about its performance on long-context tasks.
+We then evaluated `Mistral-7B-Instruct-v0.1` against benchmarks specifically designed to assess the capabilities of LLMs in handling longer context.
+Although the model's performance on contexts shorter than 4096 tokens was fairly competitive,
+there were some limitations in its performance on longer contexts. Motivated by improving its performance on longer context, we fine-tuned the Mistral 7B model and obtained `MistralLite`. The model managed to `significantly boost the performance of long context handling` over Mistral-7B-Instruct-v0.1. The detailed `long context evaluation results` are as below:
 
 ### [Topic Retrieval](https://lmsys.org/blog/2023-06-29-longchat/) ###
 |Model Name|Input length| Input length | Input length| Input length| Input length|