Qwen/Qwen-14B-Chat-Int4

@@ -16,7 +16,7 @@ inference: false
 <br>
 <p align="center">
-        🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/qwen">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/abs/2309.16609">Paper</a>&nbsp&nbsp ｜ &nbsp&nbsp🖥️ <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a>
 <br>
 <a href="https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp ｜ &nbsp&nbsp DingTalk (钉钉) &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp
 </p>
@@ -26,11 +26,11 @@ inference: false
 **通义千问-14B（Qwen-14B）**是阿里云研发的通义千问大模型系列的140亿参数规模的模型。Qwen-14B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样，覆盖广泛，包括大量网络文本、专业书籍、代码等。同时，在Qwen-14B的基础上，我们使用对齐机制打造了基于大语言模型的AI助手Qwen-14B-Chat。本仓库为Qwen-14B-Chat的Int4量化模型的仓库。
-如果您想了解更多关于通义千问-14B开源模型的细节，我们建议您参阅[Github代码库](https://github.com/QwenLM/Qwen)。
 **Qwen-14B** is the 14B-parameter version of Qwen (abbr. Tongyi Qianwen), the large language model series proposed by Alibaba Cloud. Qwen-14B is a Transformer-based large language model pretrained on a large volume of data, including web text, books, and code. Building on the pretrained Qwen-14B, we release Qwen-14B-Chat, a large-model-based AI assistant trained with alignment techniques. This repository hosts the Int4 quantized model of Qwen-14B-Chat.
-For more details about the open-source model of Qwen-14B, please refer to the [Github](https://github.com/QwenLM/Qwen) code repository.
 <br>
@@ -56,15 +56,14 @@ pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_s
 pip install auto-gptq optimum
 ```
-另外，推荐安装`flash-attention`库，以实现更高的效率和更低的显存占用。
-In addition, it is recommended to install the `flash-attention` library for higher efficiency and lower memory usage.
 ```bash
-git clone -b v1.0.8 https://github.com/Dao-AILab/flash-attention
 cd flash-attention && pip install .
 # 下方安装可选，安装可能比较缓慢。
-# Below are optional. Installing them might be slow.
 # pip install csrc/layer_norm
 # pip install csrc/rotary
 ```
@@ -94,9 +93,9 @@ print(response)
 # 你好！很高兴为你提供帮助。
 ```
-关于更多的使用说明，请参考我们的[Github repo](https://github.com/QwenLM/Qwen)获取更多信息。
-For more information, please refer to our [Github repo](https://github.com/QwenLM/Qwen) for more information.
 <br>
@@ -567,6 +566,22 @@ Qwen-Chat also has the capability to be used as a [HuggingFace Agent](https://hu
 If you run into problems, please check the [FAQ](https://github.com/QwenLM/Qwen/blob/main/FAQ.md) and the existing issues for a solution before opening a new issue.
 <br>
 ## 使用协议（License Agreement）
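 我们的代码和模型权重对学术研究完全开放，并支持商用。请查看[LICENSE](https://github.com/QwenLM/Qwen/blob/main/LICENSE)了解具体的开源协议细节。如需商用，请填写[问卷](https://dashscope.console.aliyun.com/openModelApply/qianwen)申请。
 Our code and model weights are fully open for academic research, and commercial use is also supported. Please see the [LICENSE](https://github.com/QwenLM/Qwen/blob/main/LICENSE) for the details of the open-source license. For commercial use, please fill out the [form](https://dashscope.console.aliyun.com/openModelApply/qianwen) to apply.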

 <br>
 <p align="center">
+        🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/qwen">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/abs/2309.16609">Paper</a>&nbsp&nbsp ｜ &nbsp&nbsp🖥️ <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a>
 <br>
 <a href="https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp ｜ &nbsp&nbsp DingTalk (钉钉) &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp
 </p>
 **通义千问-14B（Qwen-14B）**是阿里云研发的通义千问大模型系列的140亿参数规模的模型。Qwen-14B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样，覆盖广泛，包括大量网络文本、专业书籍、代码等。同时，在Qwen-14B的基础上，我们使用对齐机制打造了基于大语言模型的AI助手Qwen-14B-Chat。本仓库为Qwen-14B-Chat的Int4量化模型的仓库。
+如果您想了解更多关于通义千问-14B开源模型的细节，我们建议您参阅[GitHub代码库](https://github.com/QwenLM/Qwen)。
 **Qwen-14B** is the 14B-parameter version of Qwen (abbr. Tongyi Qianwen), the large language model series proposed by Alibaba Cloud. Qwen-14B is a Transformer-based large language model pretrained on a large volume of data, including web text, books, and code. Building on the pretrained Qwen-14B, we release Qwen-14B-Chat, a large-model-based AI assistant trained with alignment techniques. This repository hosts the Int4 quantized model of Qwen-14B-Chat.
+For more details about the open-source model of Qwen-14B, please refer to the [GitHub](https://github.com/QwenLM/Qwen) code repository.
 <br>
 pip install auto-gptq optimum
 ```
+另外，推荐安装`flash-attention`库（**当前已支持flash attention 2**），以实现更高的效率和更低的显存占用。
+In addition, we recommend installing the `flash-attention` library (**flash attention 2 is now supported**) for higher efficiency and lower memory usage.
 ```bash
+git clone https://github.com/Dao-AILab/flash-attention
 cd flash-attention && pip install .
 # 下方安装可选，安装可能比较缓慢。 (The installs below are optional and may be slow.)
 # pip install csrc/layer_norm
 # pip install csrc/rotary
 ```
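
As a quick check after the installation above, the following minimal sketch reports whether the library can be imported. It assumes the package installs under the module name `flash_attn`; the helper `flash_attn_status` is a hypothetical name used for illustration and is not part of Qwen or flash-attention.

```python
import importlib.metadata
import importlib.util


def flash_attn_status(package: str = "flash_attn") -> str:
    """Describe whether `package` is importable in this environment.

    The default module name `flash_attn` is an assumption about how the
    flash-attention library is installed.
    """
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(package) is None:
        return f"{package}: not installed"
    try:
        # Look up the installed distribution's version, if it has metadata.
        version = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        version = "unknown version"
    return f"{package}: installed ({version})"


print(flash_attn_status())
```

If the check reports that the package is not installed, the model can still be used without it, since the library is described above as a recommendation for efficiency and memory usage rather than a requirement.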
 # 你好！很高兴为你提供帮助。
 ```
+关于更多的使用说明，请参考我们的[GitHub repo](https://github.com/QwenLM/Qwen)获取更多信息。
+For more usage instructions, please refer to our [GitHub repo](https://github.com/QwenLM/Qwen).
 <br>
 If you run into problems, please check the [FAQ](https://github.com/QwenLM/Qwen/blob/main/FAQ.md) and the existing issues for a solution before opening a new issue.
 <br>
+## 引用 (Citation)
+如果你觉得我们的工作对你有帮助，欢迎引用！
+If you find our work helpful, feel free to cite us.
+```
+@article{qwen,
+  title={Qwen Technical Report},
+  author={Jinze Bai and Shuai Bai and Yunfei Chu and Zeyu Cui and Kai Dang and Xiaodong Deng and Yang Fan and Wenbin Ge and Yu Han and Fei Huang and Binyuan Hui and Luo Ji and Mei Li and Junyang Lin and Runji Lin and Dayiheng Liu and Gao Liu and Chengqiang Lu and Keming Lu and Jianxin Ma and Rui Men and Xingzhang Ren and Xuancheng Ren and Chuanqi Tan and Sinan Tan and Jianhong Tu and Peng Wang and Shijie Wang and Wei Wang and Shengguang Wu and Benfeng Xu and Jin Xu and An Yang and Hao Yang and Jian Yang and Shusheng Yang and Yang Yao and Bowen Yu and Hongyi Yuan and Zheng Yuan and Jianwei Zhang and Xingxuan Zhang and Yichang Zhang and Zhenru Zhang and Chang Zhou and Jingren Zhou and Xiaohuan Zhou and Tianhang Zhu},
+  journal={arXiv preprint arXiv:2309.16609},
+  year={2023}
+}
+```
+<br>
 ## 使用协议（License Agreement）
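 我们的代码和模型权重对学术研究完全开放，并支持商用。请查看[LICENSE](https://github.com/QwenLM/Qwen/blob/main/LICENSE)了解具体的开源协议细节。如需商用，请填写[问卷](https://dashscope.console.aliyun.com/openModelApply/qianwen)申请。
 Our code and model weights are fully open for academic research, and commercial use is also supported. Please see the [LICENSE](https://github.com/QwenLM/Qwen/blob/main/LICENSE) for the details of the open-source license. For commercial use, please fill out the [form](https://dashscope.console.aliyun.com/openModelApply/qianwen) to apply.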