- qwen1.5
- qwen2
---

This is the Mistral version of the [Qwen1.5-0.5B-Chat](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat) model by Alibaba Cloud.
The original codebase can be found at https://github.com/hiyouga/LLaMA-Factory/blob/main/tests/llamafy_qwen.py. I have made modifications to make it compatible with Qwen1.5.

This model was converted with https://github.com/Minami-su/character_AI_open/blob/main/mistral_qwen2.py.

## special

Before using this model, you need to modify `modeling_mistral.py` in the transformers library:

vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py
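The path above is specific to one conda environment. A quick way to locate the file in whatever environment you are actually using (assuming transformers is installed there) is:

```python
# Locate modeling_mistral.py in the active environment so you can edit it.
import importlib.util

spec = None
try:
    spec = importlib.util.find_spec("transformers.models.mistral.modeling_mistral")
except ModuleNotFoundError:
    pass  # transformers is not installed in this environment

print(spec.origin if spec else "transformers not found")
```

Then open the printed path with vim (or any editor).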

Find `MistralAttention` and change the q, k, v, and o projections from `bias=False` to `bias=config.attention_bias`.
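Concretely, the edit in `MistralAttention.__init__` looks roughly like this (line layout taken from a recent transformers release; your local copy may differ slightly):

```diff
-        self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)
-        self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)
-        self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)
-        self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)
+        self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=config.attention_bias)
+        self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=config.attention_bias)
+        self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=config.attention_bias)
+        self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=config.attention_bias)
```

This lets the converted checkpoint load its q/k/v bias tensors (which Qwen1.5 uses but stock Mistral does not).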

Before:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/AKj_fwEoLUKWZ4mViYW-q.png)

After:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/A2gSwq9l6Zx8X1qegtgvE.png)

Usage:

```python