Update README.md
README.md
CHANGED
@@ -1,3 +1,36 @@
|
|
1 |
-
---
|
2 |
-
license: llama3
|
3 |
-
1 |
+
---
|
2 |
+
license: llama3
|
3 |
+
datasets:
|
4 |
+
- yuyijiong/Long-Instruction-with-Paraphrasing
|
5 |
+
language:
|
6 |
+
- zh
|
7 |
+
- en
|
8 |
+
library_name: peft
|
9 |
+
pipeline_tag: text-generation
|
10 |
+
---
|
11 |
+
|
12 |
+
# Llama3-8b-chinese-chat-32k
|
13 |
+
|
14 |
+
## Training method
|
15 |
+
|
16 |
+
* Extends the context length to 32k using the NTK-aware method
|
17 |
+
|
18 |
+
* Uses [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat) as the base model,
|
19 |
+
fine-tuned on the [Long-Instruction-with-Paraphrasing](https://huggingface.co/datasets/yuyijiong/Long-Instruction-with-Paraphrasing)
|
20 |
+
dataset with QLoRA for 1 epoch.
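
The NTK-aware step in the first bullet can be sketched as follows. This is a minimal illustration, not the author's training code: the README does not state the RoPE base, head dimension, or original context length, so the values below (`ORIG_BASE = 500000`, `HEAD_DIM = 128`, an 8192-token original window) are assumptions taken from the public Llama 3 8B configuration, and the helper names are hypothetical. The commonly used NTK-aware rule enlarges the RoPE base by `scale ** (dim / (dim - 2))` instead of compressing positions linearly.

```python
import math

def ntk_scaled_base(base, scale, head_dim):
    """Return the enlarged RoPE base for an NTK-aware extension by `scale`."""
    return base * scale ** (head_dim / (head_dim - 2))

def rope_inv_freq(base, head_dim):
    """Inverse frequencies used by RoPE, one per pair of head dimensions."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Assumed values (not stated in this README): Llama 3 8B defaults.
ORIG_BASE = 500000.0
HEAD_DIM = 128
ORIG_CONTEXT = 8192
TARGET_CONTEXT = 32768  # the 32k target named in the README

SCALE = TARGET_CONTEXT / ORIG_CONTEXT
new_base = ntk_scaled_base(ORIG_BASE, SCALE, HEAD_DIM)

orig_freqs = rope_inv_freq(ORIG_BASE, HEAD_DIM)
scaled_freqs = rope_inv_freq(new_base, HEAD_DIM)

# The highest frequency is untouched, while the lowest frequency is
# divided by SCALE, so the slowest-rotating dimension is stretched
# to cover the longer window.
print(f"scale factor: {SCALE}")
print(f"new base: {new_base:.1f}")
print(f"lowest frequency ratio: {scaled_freqs[-1] / orig_freqs[-1]:.4f}")
```

Leaving the high frequencies almost unchanged is what distinguishes this rule from plain linear position interpolation, which scales every frequency by the same factor.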
|
21 |
+
|
22 |
+
|
23 |
+
## Long-context performance
|
24 |
+
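Compared with the original version, this model has stronger long-context capability, scoring higher on most of the LongBench tasks below.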
|
25 |
+
|
26 |
+
### LongBench (en)
|
27 |
+
| model | hotpotqa | multifieldqa_en | passage_retrieval_en | qmsum | trec |
|
28 |
+
|---------------------------|-----------|--|--|--|--|
|
29 |
+
| llama3-chinese-8b | 45.88 |50.56|68.0|22.52|73.0|
|
30 |
+
| llama3-8b-chinese-chat-32k| **47.64** |49.98|**100.0**|**25.13**|**75.0**|
|
31 |
+
|
32 |
+
### LongBench (zh)
|
33 |
+
| model | dureader | multifieldqa_zh | passage_retrieval_zh | qmsum | trec |
|
34 |
+
|-----------------------------------|-----------|--|--|--|--|
|
35 |
+
| llama3-8b-chinese-chat | 29.08 |58.4|93.5|22.52|73.0|
|
36 |
+
| llama3-8b-chinese-chat-32k | **32.31** |**58.66**|82.5|**25.13**|**75.0**|
|