---
language:
- en
- zh
- multilingual
license: apache-2.0
tags:
- GENIUS
- conditional text generation
- sketch-based text generation
- data augmentation
datasets:
- c4
- beyond/chinese_clean_passages_80m
widget:
- text: <mask> Conference on Empirical Methods <mask> submission of research papers <mask> Deep Learning <mask>
  example_title: Example 1
- text: <mask> machine learning <mask> my research interest <mask> data science <mask>
  example_title: Example 2
- text: <mask> play basketball <mask> a strong team <mask> Shanghai University of Finance and Economics <mask> last Sunday <mask>
  example_title: Example 3
- text: 'Good news: <mask> the European Union <mask> month by EU <mask> Farm Commissioner Franz <mask>'
  example_title: Example with a prompt 1
- text: 'Bad news: <mask> the European Union <mask> month by EU <mask> Farm Commissioner Franz <mask>'
  example_title: Example with a prompt 2
inference:
  parameters:
    max_length: 200
    num_beams: 3
    do_sample: true
---

# GENIUS: generating text using sketches!

- **Paper: [GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation](https://arxiv.org/abs/2211.10330)**
- **GitHub: [GENIUS, Pre-training/Data Augmentation Tutorial](https://github.com/beyondguo/genius)**

You can use this model directly with a `text2text-generation` pipeline to generate text from a sketch:

```python
from transformers import pipeline

# 1. Load the model with the Hugging Face `pipeline`.
#    device=0 selects the first GPU; drop the argument to run on CPU.
genius = pipeline("text2text-generation", model='beyond/genius-large', device=0)

# 2. Provide a sketch: key fragments joined by <mask> tokens.
sketch = "<mask> Conference on Empirical Methods <mask> submission of research papers <mask> Deep Learning <mask>"

# 3. Generate the full text from the sketch.
generated_text = genius(sketch, num_beams=3, do_sample=True, max_length=200)[0]['generated_text']
print(generated_text)
```
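
A sketch is just a string of key fragments separated by `<mask>` tokens, optionally preceded by a prompt such as `Good news:` as in the widget examples. The helper below is a minimal illustration of how such a string could be assembled. `build_sketch` and its `prefix` and `mask_token` parameters are hypothetical names introduced here, not part of the GENIUS code or of `transformers`.

```python
def build_sketch(fragments, prefix="", mask_token="<mask>"):
    """Join key fragments into a sketch string (hypothetical helper)."""
    # Drop empty fragments and strip surrounding whitespace.
    parts = [f.strip() for f in fragments if f and f.strip()]
    if not parts:
        raise ValueError("at least one non-empty fragment is required")
    # Place a mask token before, between, and after the fragments.
    sketch = f"{mask_token} " + f" {mask_token} ".join(parts) + f" {mask_token}"
    # An optional prompt such as "Good news:" goes in front of the sketch.
    prefix = prefix.strip()
    return f"{prefix} {sketch}" if prefix else sketch


sketch = build_sketch(
    ["Conference on Empirical Methods", "submission of research papers", "Deep Learning"]
)
print(sketch)
```

The resulting string can be passed to the pipeline in place of a hand-written sketch.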

If you find our paper/code/demo useful, please cite our paper:

```
@article{guo2022genius,
  title={GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation},
  author={Guo, Biyang and Gong, Yeyun and Shen, Yelong and Han, Songqiao and Huang, Hailiang and Duan, Nan and Chen, Weizhu},
  journal={arXiv preprint arXiv:2211.10330},
  year={2022}
}
```