Qwen/Qwen-7B-Chat

@@ -93,7 +93,7 @@ print(response)
 # 你好！很高兴为你提供帮助。
 # 第二轮对话 2nd dialogue turn
-response, history = model.chat(tokenizer, "给我讲一个年轻人奋斗创业最终取得成功的故事。", history=history)
 print(response)
 # 这是一个关于一个年轻人奋斗创业最终取得成功的故事。
 # 故事的主人公叫李明，他来自一个普通的家庭，父母都是普通的工人。从小，李明就立下了一个目标：要成为一名成功的企业家。
@@ -262,15 +262,15 @@ We introduce NTK-aware interpolation, LogN attention scaling to extend the conte
 #### ReAct Prompting
-千问支持通过 [ReAct Prompting](https://arxiv.org/abs/2210.03629) 调用插件/工具/API。ReAct 也是 [LangChain](https://python.langchain.com/) 框架采用的主要方式之一。在即将开源的、用于评估工具使用能力的自建评测基准上，千问的表现如下：
-Qwen-7B-Chat supports calling plugins/tools/APIs through [ReAct Prompting](https://arxiv.org/abs/2210.03629). ReAct is also one of the main approaches used by the [LangChain](https://python.langchain.com/) framework. In the soon-to-be-released evaluation benchmark for assessing tool usage capabilities, Qwen-7B-Chat's performance is as follows:
 | Model            | Tool Selection (Acc.↑) | Tool Input (Rouge-L↑) | False Positive Error↓ |
 |:-----------------|:----------------------:|:---------------------:|:---------------------:|
 | GPT-4            | 95%                    | **0.90**              | 15%                   |
 | GPT-3.5          | 85%                    | 0.88                  | 75%                   |
-| **Qwen-7B-Chat** | **99%**                | 0.89                  | **8.5%**              |
 > 评测基准中出现的插件均没有出现在千问的训练集中。该基准评估了模型在多个候选插件中选择正确插件的准确率、传入插件的参数的合理性、以及假阳率。假阳率（False Positive）定义：在处理不该调用插件的请求时，错误地调用了插件。
@@ -357,4 +357,3 @@ Our code and checkpoints are open to research purpose, and they are allowed for
 如果你想给我们的研发团队和产品团队留言，请通过邮件（[email protected]）联系我们。
 If you are interested to leave a message to either our research team or product team, feel free to send an email to [email protected].

 # 你好！很高兴为你提供帮助。
 # 第二轮对话 2nd dialogue turn
+response, history = model.chat(tokenizer, "给我讲一个年轻人奋斗创业最终取得成功的故事。", history=history)
 print(response)
 # 这是一个关于一个年轻人奋斗创业最终取得成功的故事。
 # 故事的主人公叫李明，他来自一个普通的家庭，父母都是普通的工人。从小，李明就立下了一个目标：要成为一名成功的企业家。
 #### ReAct Prompting
+千问支持通过 [ReAct Prompting](https://arxiv.org/abs/2210.03629) 调用插件/工具/API。ReAct 也是 [LangChain](https://python.langchain.com/) 框架采用的主要方式之一。在我们开源的、用于评估工具使用能力的评测基准上，千问的表现如下：
+Qwen-7B-Chat supports calling plugins/tools/APIs through [ReAct Prompting](https://arxiv.org/abs/2210.03629). ReAct is also one of the main approaches used by the [LangChain](https://python.langchain.com/) framework. In our evaluation benchmark for assessing tool usage capabilities, Qwen-7B-Chat's performance is as follows:
 | Model            | Tool Selection (Acc.↑) | Tool Input (Rouge-L↑) | False Positive Error↓ |
 |:-----------------|:----------------------:|:---------------------:|:---------------------:|
 | GPT-4            | 95%                    | **0.90**              | 15%                   |
 | GPT-3.5          | 85%                    | 0.88                  | 75%                   |
+| **Qwen-7B-Chat** | **99%**                | 0.89                  | **9.7%**              |
 > 评测基准中出现的插件均没有出现在千问的训练集中。该基准评估了模型在多个候选插件中选择正确插件的准确率、传入插件的参数的合理性、以及假阳率。假阳率（False Positive）定义：在处理不该调用插件的请求时，错误地调用了插件。
 如果你想给我们的研发团队和产品团队留言，请通过邮件（[email protected]）联系我们。
 If you are interested to leave a message to either our research team or product team, feel free to send an email to [email protected].

modeling_qwen.py CHANGED

@@ -1153,9 +1153,9 @@ class RotaryEmbedding(torch.nn.Module):
                     / self.dim
                 )
             )
-            self._seq_len_cached = seqlen
             self._ntk_alpha_cached = ntk_alpha
-            seq = torch.arange(seqlen, device=self.inv_freq.device)
             freqs = torch.outer(seq.type_as(self.inv_freq), self.inv_freq)
             emb = torch.cat((freqs, freqs), dim=-1)
             from einops import rearrange

                     / self.dim
                 )
             )
+            self._seq_len_cached = max(2 * seqlen, 16)
             self._ntk_alpha_cached = ntk_alpha
+            seq = torch.arange(self._seq_len_cached, device=self.inv_freq.device)
             freqs = torch.outer(seq.type_as(self.inv_freq), self.inv_freq)
             emb = torch.cat((freqs, freqs), dim=-1)
             from einops import rearrange
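
The `modeling_qwen.py` hunk above changes the cached length from `seqlen` to `max(2 * seqlen, 16)`, so the rotary table is built for more positions than are currently needed. The sketch below illustrates the likely effect during token-by-token generation: far fewer cache rebuilds. It is a simplified pure-Python stand-in, not the actual `RotaryEmbedding` class. The name `RotaryCacheSketch`, the `overallocate` switch, the `rebuilds` counter, and the guard condition `seqlen > cached or ntk_alpha changed` are all assumptions made for illustration, since the hunk does not show the surrounding `if` statement or how `base` is computed.

```python
class RotaryCacheSketch:
    """Toy stand-in for the cache logic in the hunk (hypothetical, not the real class)."""

    def __init__(self, dim, base=10000.0, overallocate=True):
        self.dim = dim
        self.base = base
        self.overallocate = overallocate  # True mimics the "+" lines, False the "-" lines
        self.seq_len_cached = 0
        self.ntk_alpha_cached = 1.0
        self.angles = []   # angles[pos][i] = pos * inv_freq[i]
        self.rebuilds = 0  # illustration only: counts how often the table is rebuilt

    def update(self, seqlen, ntk_alpha=1.0):
        # Assumed guard: rebuild only if the cache is too short or ntk_alpha changed.
        if seqlen <= self.seq_len_cached and ntk_alpha == self.ntk_alpha_cached:
            return
        # Assumed NTK-style base scaling; the hunk only shows the division by self.dim.
        base = self.base * ntk_alpha ** (self.dim / (self.dim - 2))
        inv_freq = [1.0 / base ** (i / self.dim) for i in range(0, self.dim, 2)]
        # New behavior: cache twice the requested length, with a floor of 16 positions.
        cached = max(2 * seqlen, 16) if self.overallocate else seqlen
        self.angles = [[pos * f for f in inv_freq] for pos in range(cached)]
        self.seq_len_cached = cached
        self.ntk_alpha_cached = ntk_alpha
        self.rebuilds += 1


def count_rebuilds(overallocate, steps=100):
    """Simulate decoding one token at a time and count table rebuilds."""
    cache = RotaryCacheSketch(dim=8, overallocate=overallocate)
    for seqlen in range(1, steps + 1):
        cache.update(seqlen)
    return cache.rebuilds


if __name__ == "__main__":
    print("old policy:", count_rebuilds(overallocate=False))
    print("new policy:", count_rebuilds(overallocate=True))
```

Under these assumptions, growing the sequence from 1 to 100 tokens rebuilds the table 100 times with the old policy and 4 times with the new one (at lengths 1, 17, 35, and 71), at the cost of holding a table up to twice as long as needed.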