add text-generation pipeline example with autocast
README.md
CHANGED
@@ -132,6 +132,22 @@ from transformers import AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')
 ```
 
+The model can then be used, for example, within a text-generation pipeline.
+Note: when running Torch modules in lower precision, it is best practice to use the [torch.autocast context manager](https://pytorch.org/docs/stable/amp.html).
+
+```python
+from transformers import pipeline
+
+pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, device='cuda:0')
+
+with torch.autocast('cuda', dtype=torch.bfloat16):
+    print(
+        pipe('Here is a recipe for vegan banana bread:\n',
+            max_new_tokens=100,
+            do_sample=True,
+            use_cache=True))
+```
+
 ## Model Description
 
 The architecture is a modification of a standard decoder-only transformer.
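
The added README snippet relies on `torch` and `model` being defined earlier in the README, so it does not run in isolation. As a rough, self-contained illustration of what the `torch.autocast` context manager does, here is a minimal sketch that runs on CPU. The small `nn.Linear` layer and the random input are hypothetical stand-ins for the real model and prompt, not part of this commit.

```python
import torch
from torch import nn

# Hypothetical stand-in for a real model: one linear layer with float32 weights.
layer = nn.Linear(8, 4)
x = torch.randn(2, 8)

# Outside autocast, the layer computes in its parameter dtype (float32).
full_precision = layer(x)

# Inside autocast, eligible ops such as linear run in bfloat16.
# The stored weights themselves are not converted.
with torch.autocast('cpu', dtype=torch.bfloat16):
    out = layer(x)

print(full_precision.dtype, out.dtype, layer.weight.dtype)
```

The same pattern applies on GPU by passing `'cuda'` as the device type, as the README example in this commit does.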