Minami-su committed
Commit 7d96ca0
1 Parent(s): f13ee8a

Update README.md

Files changed (1): README.md (+14 -4)
README.md CHANGED
@@ -22,15 +22,25 @@ This model is converted with https://github.com/Minami-su/character_AI_open/blob
 
 ## special
 
-Before using this model, you need to modify modeling_mistral.py in transformers library
-vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py
-find MistralAttention,
-modify q,k,v,o bias=False ----->, bias=config.attention_bias
+1. Before using this model, you need to modify modeling_mistral.py in the transformers library:
+
+2. vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py
+
+3. Find the MistralAttention class.
+
+4. In the q, k, v, and o projection layers, change bias=False to bias=config.attention_bias.
+
 Before:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/AKj_fwEoLUKWZ4mViYW-q.png)
 After:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/A2gSwq9l6Zx8X1qegtgvE.png)
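The intent of the edit shown in the screenshots can be sketched in plain Python. The stand-in `MistralConfig` and `Linear` classes below are illustrative only (not the real transformers or torch classes): the point is that each projection reads its bias flag from the config instead of a hard-coded `bias=False`, so checkpoints with biased q/k/v weights can load.

```python
class MistralConfig:
    """Stand-in for the transformers config; attention_bias is the flag the patch adds."""
    def __init__(self, hidden_size=32, attention_bias=True):
        self.hidden_size = hidden_size
        self.attention_bias = attention_bias

class Linear:
    """Stand-in for torch.nn.Linear that only records whether a bias was requested."""
    def __init__(self, in_features, out_features, bias=True):
        self.in_features = in_features
        self.out_features = out_features
        self.has_bias = bias

class MistralAttention:
    def __init__(self, config):
        h = config.hidden_size
        # Before the patch every projection used bias=False; after the patch
        # the flag comes from the config, matching the edit described above.
        self.q_proj = Linear(h, h, bias=config.attention_bias)
        self.k_proj = Linear(h, h, bias=config.attention_bias)
        self.v_proj = Linear(h, h, bias=config.attention_bias)
        self.o_proj = Linear(h, h, bias=config.attention_bias)

attn = MistralAttention(MistralConfig(attention_bias=True))
print(attn.q_proj.has_bias)  # True
```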
 
+## Differences between qwen2 mistral and qwen2 llamafy
+
+Compared to qwen2 llamafy, qwen2 mistral can use sliding window attention, which makes it faster and gives it better long-context handling.
+
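To make the sliding-window claim concrete, here is a minimal sketch of how a sliding-window causal mask differs from a full causal mask. The sequence length and window size are illustrative values, not Qwen2's actual settings; the point is that each token attends to at most `window` previous positions, so per-token attention cost stays constant as the context grows instead of growing linearly.

```python
def causal_mask(n):
    # Full causal attention: token i may attend to every token j <= i.
    return [[j <= i for j in range(n)] for i in range(n)]

def sliding_window_mask(n, window):
    # Sliding-window attention: token i may attend only to the last
    # `window` tokens, i.e. positions j with i - window < j <= i.
    return [[i - window < j <= i for j in range(n)] for i in range(n)]

full = causal_mask(6)
sw = sliding_window_mask(6, window=3)
print(sum(map(sum, full)))  # 21 attended pairs (quadratic in length)
print(sum(map(sum, sw)))    # 15 attended pairs (linear in length)
```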
 Usage:
 
  ```python