Update README.md

Phi doesn't like `device_map="auto"`, so you should specify the device explicitly, as in one of the following examples (each assumes `from transformers import AutoModelForCausalLM` and, for the FP32 options, `import torch`):
1. FP16 / Flash-Attention / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype="auto", flash_attn=True, flash_rotary=True, fused_dense=True, device_map="cuda", trust_remote_code=True)
```
2. FP16 / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype="auto", device_map="cuda", trust_remote_code=True)
```
3. FP32 / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype=torch.float32, device_map="cuda", trust_remote_code=True)
```
4. FP32 / CPU:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype=torch.float32, device_map="cpu", trust_remote_code=True)
```
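If you'd rather pick between these options at runtime, a minimal sketch (not from the model card; it simply reuses the kwargs from options 2 and 4 above) might look like:

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch (assumption, not from the README): use FP16 on CUDA when a GPU is
# available, otherwise fall back to FP32 on CPU, mirroring options 2 and 4.
if torch.cuda.is_available():
    model = AutoModelForCausalLM.from_pretrained(
        "SE6446/Phasmid-2_v2", torch_dtype="auto",
        device_map="cuda", trust_remote_code=True)
else:
    model = AutoModelForCausalLM.from_pretrained(
        "SE6446/Phasmid-2_v2", torch_dtype=torch.float32,
        device_map="cpu", trust_remote_code=True)
```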

And then use the following snippet:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SE6446/Phasmid-2_v2", trust_remote_code=True)
inputs = tokenizer('''SYSTEM: You are a helpful assistant. Please answer truthfully and politely. {custom_prompt}\n
USER: {{userinput}}\n
ASSISTANT: {{character name if applicable}}:''', return_tensors="pt", return_attention_mask=False)
```
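The snippet above stops after tokenizing the prompt. As a minimal sketch of the remaining step (assumed here, not shown in the original; it uses the standard `generate`/`batch_decode` API with an arbitrary `max_new_tokens` budget):

```python
# Assumed continuation (not in the original README): move the tokenized prompt
# to the model's device, generate a completion, and decode it back to text.
outputs = model.generate(**inputs.to(model.device), max_new_tokens=256)
print(tokenizer.batch_decode(outputs)[0])
```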