---
license: apache-2.0
datasets:
- artemsnegirev/ru-word-games
language:
- ru
metrics:
- exact_match
pipeline_tag: text2text-generation
---
The model was trained on the companion [ru-word-games](https://huggingface.co/datasets/artemsnegirev/ru-word-games) dataset. Minibob guesses a word from its description, modeling the well-known Alias word game.
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

prefix = "guess word:"

def predict_word(prompt, model, tokenizer):
    # "..." in the prompt marks the gap to fill; map it to T5's sentinel token
    prompt = prompt.replace("...", "<extra_id_0>")
    prompt = f"{prefix} {prompt}"

    input_ids = tokenizer([prompt], return_tensors="pt").input_ids

    # Beam search returns several distinct candidate words
    outputs = model.generate(
        input_ids.to(model.device),
        num_beams=5,
        max_new_tokens=8,
        do_sample=False,
        num_return_sequences=5
    )

    candidates = set()

    for tokens in outputs:
        candidate = tokenizer.decode(tokens, skip_special_tokens=True)
        candidate = candidate.strip().lower()
        candidates.add(candidate)

    return candidates

model_name = "artemsnegirev/minibob"

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# "an animal with hooves that people ride on"
prompt = "это животное с копытами на нем ездят"

print(predict_word(prompt, model, tokenizer))
# {'верблюд', 'конь', 'коня', 'лошадь', 'пони'}
```
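A prompt may also contain an explicit gap written as `...`, which is rewritten to T5's `<extra_id_0>` sentinel before generation. A minimal sketch of just that preprocessing step, with an illustrative helper name (`build_prompt` is not part of the model card):

```python
prefix = "guess word:"

def build_prompt(prompt: str) -> str:
    # Replace the human-friendly gap marker with T5's sentinel token,
    # then prepend the task prefix the model was trained with.
    prompt = prompt.replace("...", "<extra_id_0>")
    return f"{prefix} {prompt}"

# "это ... с копытами" — "this is a ... with hooves"
print(build_prompt("это ... с копытами"))
# guess word: это <extra_id_0> с копытами
```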
See the detailed GitHub-based [tutorial](https://github.com/artemsnegirev/minibob) with the pipeline and source code for building Minibob.