Minami-su/Qwen1.5-0.5B-Chat_mistral


Minami-su committed on Feb 25

Commit f13ee8a • 1 Parent(s): 0a60f61

Update README.md

Files changed (1)

README.md +12 -1


@@ -15,11 +15,22 @@ tags:
 - qwen1.5
 - qwen2
 ---
-This is the Mistral version of [Qwen1.5-7B-Chat](https://huggingface.co/Qwen/Qwen1.5-7B-Chat) model by Alibaba Cloud.
 The original codebase can be found at: (https://github.com/hiyouga/LLaMA-Factory/blob/main/tests/llamafy_qwen.py).
 I have made modifications to make it compatible with qwen1.5.
 This model is converted with https://github.com/Minami-su/character_AI_open/blob/main/mistral_qwen2.py
 Usage:
 ```python

 - qwen1.5
 - qwen2
 ---
+This is the Mistral version of [Qwen1.5-0.5B-Chat](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat) model by Alibaba Cloud.
 The original codebase can be found at: (https://github.com/hiyouga/LLaMA-Factory/blob/main/tests/llamafy_qwen.py).
 I have made modifications to make it compatible with qwen1.5.
 This model is converted with https://github.com/Minami-su/character_AI_open/blob/main/mistral_qwen2.py
+## Special
+Before using this model, you need to modify modeling_mistral.py in the transformers library (the path below is from my environment; adjust it to yours):
+vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py
+Find the MistralAttention class,
+then in the q, k, v, o projections change bias=False to bias=config.attention_bias
+Before:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/AKj_fwEoLUKWZ4mViYW-q.png)
+After:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/A2gSwq9l6Zx8X1qegtgvE.png)
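The manual edit described above can also be sketched as a small text substitution. This is only a minimal sketch: the helper name `patch_attention_bias`, the regular expression, and the sample source are hypothetical, and the real layout of `modeling_mistral.py` varies between transformers versions, so check the result by hand before relying on it.

```python
import re

# Matches a q/k/v/o projection assignment that hard-codes bias=False.
# Assumption: each projection is defined on a single line as
# `self.<x>_proj = nn.Linear(..., bias=False)`; other layouts are not handled.
_PROJ_LINE = re.compile(
    r"^(\s*self\.(?:q|k|v|o)_proj\s*=\s*nn\.Linear\(.*)bias=False(.*)$"
)


def patch_attention_bias(source: str) -> str:
    """Return `source` with bias=False replaced on the q/k/v/o projection lines."""
    patched = []
    for line in source.splitlines():
        # Lines that do not match the pattern are passed through unchanged.
        patched.append(_PROJ_LINE.sub(r"\1bias=config.attention_bias\2", line))
    return "\n".join(patched)


# Illustrative snippet only; not copied from any particular transformers release.
sample = """\
class MistralAttention(nn.Module):
    def __init__(self, config):
        self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(self.hidden_size, self.num_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(self.hidden_size, self.num_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)
        self.gate = nn.Linear(4, 4, bias=False)
"""

print(patch_attention_bias(sample))
```

To apply it to a real installation, read the contents of your own `modeling_mistral.py`, pass them through `patch_attention_bias`, review the output, and write it back. Layers other than the four projections (such as the made-up `self.gate` line in the sample) are left untouched.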
 Usage:
 ```python