---
license: other
license_name: qwen
license_link: >-
  https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT
language:
- en
- zh
library_name: transformers
pipeline_tag: text-generation
inference: false
tags:
- mistral
- qwen
- qwen1.5
- qwen2
---
This is the Mistral-architecture version of the Qwen1.5-14B-Chat model by Alibaba Cloud. The conversion is based on the LLaMA-Factory script at https://github.com/hiyouga/LLaMA-Factory/blob/main/tests/llamafy_qwen.py, which I modified to make it compatible with Qwen1.5. This model was converted with https://github.com/Minami-su/character_AI_open/blob/main/mistral_qwen2.py
Special
1. Before using this model, you need to modify modeling_mistral.py in the transformers library.
2. Open the file in an editor, for example: vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py (the exact path depends on your environment).
3. Find the MistralAttention class.
4. In the q, k, v, and o projection layers, change bias=False to bias=config.attention_bias.
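The manual edit in the steps above can also be sketched as a small text substitution. This is only an illustration under assumptions: `patch_attention_bias` and `PROJ_PATTERN` are hypothetical names, and the sample lines merely imitate how the projections might look in MistralAttention; the real file's layout can differ between transformers versions, so check the result by hand.

```python
import re

# Hypothetical helper (not part of transformers): rewrites only the four
# attention projections so their bias flag follows config.attention_bias.
PROJ_PATTERN = re.compile(
    r"(self\.(?:q_proj|k_proj|v_proj|o_proj)\s*=\s*nn\.Linear\(.*?)bias=False"
)


def patch_attention_bias(source: str) -> str:
    """Return `source` with bias=False replaced in the q/k/v/o projections."""
    return PROJ_PATTERN.sub(r"\1bias=config.attention_bias", source)


# Illustrative input shaped like the attention projections, plus one MLP
# layer that must stay untouched. This is an assumption about the layout.
sample = "\n".join([
    "self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)",
    "self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)",
    "self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)",
    "self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)",
    "self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False)",
])

patched = patch_attention_bias(sample)
print(patched)
```

To apply the same idea to a real installation, you would read the file's text, pass it through the helper, and write it back after making a backup. The installed file's location is available as `transformers.models.mistral.modeling_mistral.__file__`.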
Differences between qwen2 mistral and qwen2 llamafy
Compared to qwen2 llamafy, qwen2 mistral can use sliding window attention, runs faster, and copes better with long contexts.
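For readers unfamiliar with sliding window attention, here is a minimal conceptual sketch of the mask it implies. It is not the transformers implementation: `sliding_window_mask` is a hypothetical helper, and the convention used (each token sees itself plus the `window - 1` tokens before it) is an assumption chosen for illustration.

```python
from typing import List


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Build a causal mask where position i may attend to position j
    only if j is not in the future and lies within the last `window` tokens."""
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(seq_len=5, window=3)
for row in mask:
    # Print 1 for "may attend" and 0 for "masked out".
    print(" ".join("1" if allowed else "0" for allowed in row))
```

With full causal attention every row would be filled up to the diagonal. With a window, each row holds at most `window` allowed positions, so the per-token attention cost stays bounded as the sequence grows.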
Usage:
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and the converted model; device_map="auto" places the weights automatically.
tokenizer = AutoTokenizer.from_pretrained("Minami-su/Qwen1.5-14B-Chat_mistral")
model = AutoModelForCausalLM.from_pretrained("Minami-su/Qwen1.5-14B-Chat_mistral", torch_dtype="auto", device_map="auto")
# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
messages = [
    {"role": "user", "content": "Who are you?"}
]
# Build the chat prompt and move it to the same device as the model.
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
inputs = inputs.to(model.device)
generate_ids = model.generate(inputs, max_length=32768, streamer=streamer)
Test
Loaded in 4-bit:
hf-causal (pretrained=Qwen1.5-14B-Chat), limit: None, provide_description: False, num_fewshot: 0, batch_size: 16
| Task |Version| Metric |Value | |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge| 0|acc |0.4437|± |0.0145|
| | |acc_norm|0.4718|± |0.0146|
|truthfulqa_mc| 1|mc1 |0.4468|± |0.0174|
| | |mc2 |0.6310|± |0.0157|
|winogrande | 0|acc |0.6788|± |0.0131|
Loaded in 4-bit:
hf-causal (pretrained=Qwen1.5-14B-Chat_mistral), limit: None, provide_description: False, num_fewshot: 0, batch_size: 16
| Task |Version| Metric |Value | |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge| 0|acc |0.4445|± |0.0145|
| | |acc_norm|0.4718|± |0.0146|
|truthfulqa_mc| 1|mc1 |0.4468|± |0.0174|
| | |mc2 |0.6310|± |0.0157|
|winogrande | 0|acc |0.6788|± |0.0131|
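To make the comparison between the two runs explicit, the sketch below takes the values reported in the tables above and checks whether each metric of the converted model lies within the reported standard error of the original. The dictionary names and the `compare` helper are illustrative and not part of the evaluation harness.

```python
# (value, stderr) pairs copied from the two result tables above.
original = {
    ("arc_challenge", "acc"): (0.4437, 0.0145),
    ("arc_challenge", "acc_norm"): (0.4718, 0.0146),
    ("truthfulqa_mc", "mc1"): (0.4468, 0.0174),
    ("truthfulqa_mc", "mc2"): (0.6310, 0.0157),
    ("winogrande", "acc"): (0.6788, 0.0131),
}
converted = {
    ("arc_challenge", "acc"): (0.4445, 0.0145),
    ("arc_challenge", "acc_norm"): (0.4718, 0.0146),
    ("truthfulqa_mc", "mc1"): (0.4468, 0.0174),
    ("truthfulqa_mc", "mc2"): (0.6310, 0.0157),
    ("winogrande", "acc"): (0.6788, 0.0131),
}


def compare(a, b):
    """Return (key, delta, within_stderr) for every metric in `a`."""
    rows = []
    for key, (val_a, err_a) in a.items():
        val_b, err_b = b[key]
        delta = round(val_b - val_a, 4)
        rows.append((key, delta, abs(delta) <= max(err_a, err_b)))
    return rows


results = compare(original, converted)
for (task, metric), delta, within in results:
    print(f"{task:14s} {metric:9s} delta={delta:+.4f} within_stderr={within}")
```

Only the arc_challenge acc value differs between the two runs (0.4437 versus 0.4445), and that difference is well inside the reported standard error of 0.0145.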