yinsong1986 committed
Commit: 52529e2
Parent(s): d3c47d7
Update README.md

README.md CHANGED
```diff
@@ -5,7 +5,7 @@ inference: false
 
 # MistralLite Model
 
-MistralLite is a fine-tuned [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) language model, with enhanced capabilities of processing long context (up to 32K tokens). By utilizing an adapted Rotary Embedding and sliding window during fine-tuning, MistralLite is able to **perform
+MistralLite is a fine-tuned [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) language model, with enhanced capabilities of processing long context (up to 32K tokens). By utilizing an adapted Rotary Embedding and sliding window during fine-tuning, MistralLite is able to **perform significantly better on several long context retrieval and answering tasks**, while keeping the simple model structure of the original. MistralLite is useful for applications such as long context line and topic retrieval, summarization, and question answering. MistralLite can be deployed on a single AWS `g5.2x` instance with a SageMaker [Huggingface Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) endpoint, making it suitable for applications that require high performance in resource-constrained environments. You can also serve the MistralLite model directly using TGI docker containers. MistralLite also supports other serving options such as [vLLM](https://github.com/vllm-project/vllm), and you can use MistralLite in Python with the [HuggingFace transformers](https://huggingface.co/docs/transformers/index) and [FlashAttention-2](https://github.com/Dao-AILab/flash-attention) libraries.
 
 MistralLite is similar to [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), and their similarities and differences are summarized below:
 |Model|Fine-tuned on long contexts| Max context length| RotaryEmbedding adaptation| Sliding Window Size|
@@ -19,7 +19,7 @@ Since the release of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai
 on a wide range of benchmarks. But most of the benchmarks are evaluated on `short context`, and not much has been investigated about its performance on long context tasks.
 Then we evaluated `Mistral-7B-Instruct-v0.1` against benchmarks that are specifically designed to assess the capabilities of LLMs in handling longer context.
 Although the performance of the model was fairly competitive on contexts shorter than 4096 tokens,
-there were some limitations on its performance on longer context. Motivated by improving its performance on longer context, we finetuned the Mistral 7B model, and produced `Mistrallite`. The model managed to `
+there were some limitations on its performance on longer contexts. Motivated to improve its performance on longer contexts, we fine-tuned the Mistral 7B model and produced `MistralLite`. The model managed to `significantly boost the performance of long context handling` over Mistral-7B-Instruct-v0.1. The detailed `long context evaluation results` are below:
 
 1. [Topic Retrieval](https://lmsys.org/blog/2023-06-29-longchat/)
 |Model Name|Input length| Input length | Input length| Input length| Input length|
```
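The new paragraph names four serving routes: a SageMaker TGI endpoint, TGI docker containers, vLLM, and plain HuggingFace transformers with FlashAttention-2. To make the transformers route concrete, here is a minimal sketch; the `amazon/MistralLite` hub id and the `<|prompter|>...</s><|assistant|>` prompt template are assumptions chosen to match the model card, not part of this commit.

```python
# A hedged sketch, not the official model card example: load MistralLite
# with HuggingFace transformers and FlashAttention-2. Assumes the hub id
# "amazon/MistralLite", a CUDA GPU, the flash-attn package installed, and
# transformers >= 4.36 (older releases used use_flash_attention_2=True).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amazon/MistralLite"  # assumed hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)

# Assumed OpenAssistant-style prompt template.
prompt = "<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```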
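For the TGI routes (a local docker container, or TGI behind a SageMaker endpoint), the client side can look like the sketch below, which uses `huggingface_hub`'s `InferenceClient` against TGI's HTTP API; the endpoint URL is a placeholder, and the same prompt-template assumption applies.

```python
# A hedged sketch of the client side for a TGI container serving
# MistralLite. The URL is a placeholder for wherever the container is
# reachable; a SageMaker TGI endpoint would instead be invoked through
# the SageMaker runtime (e.g. boto3), which is out of scope here.
from huggingface_hub import InferenceClient

client = InferenceClient(model="http://127.0.0.1:8080")  # assumed local TGI

prompt = "<|prompter|>Summarize the main ideas of the report above.</s><|assistant|>"
answer = client.text_generation(
    prompt,
    max_new_tokens=400,
    do_sample=True,
    temperature=0.01,  # TGI requires temperature > 0; near-greedy here
)
print(answer)
```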
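Finally, a comparable hedged sketch for the vLLM option, with the same assumed hub id and prompt template; vLLM's `LLM`/`SamplingParams` API is used as documented upstream, not taken from this commit.

```python
# A hedged sketch of offline inference with vLLM, again assuming the
# "amazon/MistralLite" hub id and sufficient GPU memory for the
# requested context window.
from vllm import LLM, SamplingParams

llm = LLM(model="amazon/MistralLite")
sampling = SamplingParams(temperature=0.0, max_tokens=256)

prompts = ["<|prompter|>What is sliding window attention?</s><|assistant|>"]
for out in llm.generate(prompts, sampling):
    print(out.outputs[0].text)
```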