---
language: en
tags:
- data augmentation
- keywords-to-text generation
- sketch-to-text generation
license: apache-2.0
datasets:
- C4
widget:
- text: "<mask> Conference on Empirical Methods <mask> submission of research papers <mask> Deep Learning <mask>"
  example_title: "Example 1"
- text: "<mask> machine learning <mask> my research interest <mask> data science <mask>"
  example_title: "Example 2"
- text: "<mask> play basketball <mask> a strong team <mask> Shanghai University of Finance and Economics <mask> last Sunday <mask>"
  example_title: "Example 3"
- text: "Good news: <mask> the European Union <mask> month by EU <mask> Farm Commissioner Franz <mask>"
  example_title: "Example with a prompt 1"
- text: "Bad news: <mask> the European Union <mask> month by EU <mask> Farm Commissioner Franz <mask>"
  example_title: "Example with a prompt 2"
inference:
  parameters:
    max_length: 200
    num_beams: 3
    do_sample: True
---

# SEGA-large model

SEGA: SkEtch-based Generative Augmentation

SEGA is a general text augmentation model that can be used for data augmentation in various NLP tasks, including sentiment analysis, topic classification, NER, and QA. It uses an encoder-decoder structure (based on the BART architecture) and is pre-trained on the C4-realnewslike corpus.

- Paper: [this paper](to_be_added)
- Github: [this repository](to_be_added)

### How to use
```python
from transformers import pipeline

# 1. load the model with the huggingface `pipeline`
sega = pipeline("text2text-generation", model='beyond/sega-large', device=0)

# 2. provide a sketch (keywords joined by <mask> tokens)
sketch = "<mask> Conference on Empirical Methods <mask> submission of research papers <mask> Deep Learning <mask>"

# 3. just do it!
generated_text = sega(sketch, num_beams=3, do_sample=True, max_length=200)[0]['generated_text']
print(generated_text)
```

```shell
'The Conference on Empirical Methods welcomes the submission of research papers. Abstracts should be in the form of a paper or presentation. Please submit abstracts to the following email address: eemml.stanford.edu. The conference will be held at Stanford University on April 16-18, 2019. The theme of the conference is Deep Learning.'
```
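
In practice, SEGA is used for augmentation by taking a few keywords or phrases from an existing example, joining them with `<mask>` tokens to form a sketch, and sampling several generations from it. The snippet below is a minimal sketch of that workflow; the `build_sketch` helper and the keyword list are illustrative, not part of any SEGA release.

```python
from transformers import pipeline

sega = pipeline("text2text-generation", model='beyond/sega-large', device=0)

def build_sketch(keywords):
    # Join keywords with <mask> tokens, with a leading and trailing <mask>
    # so the model can expand the text on both ends.
    return "<mask> " + " <mask> ".join(keywords) + " <mask>"

# Illustrative keywords, e.g. extracted from a training example.
keywords = ["machine learning", "my research interest", "data science"]
sketch = build_sketch(keywords)

# Sample several augmented texts from a single sketch.
outputs = sega(sketch, num_beams=3, do_sample=True, max_length=200,
               num_return_sequences=3)
for out in outputs:
    print(out['generated_text'])
```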

## Model variations

| Model                    | #params | Language |
|--------------------------|---------|----------|
| [`sega-large`]()         | xM      | English  |
| [`sega-base`]()          | xM      | English  |
| [`sega-small`]()         | xM      | English  |
| [`sega-large-chinese`]() | xM      | Chinese  |
| [`sega-base-chinese`]()  | xM      | Chinese  |
| [`sega-small-chinese`]() | xM      | Chinese  |
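
The variant checkpoints above are not yet linked. Assuming they follow the same `beyond/` naming on the Hugging Face Hub as `sega-large`, switching variants should only require changing the model id; a hedged sketch:

```python
from transformers import pipeline

# Assumed repo id following the `beyond/sega-large` naming convention;
# check the Hub for the exact name before use.
sega_base = pipeline("text2text-generation", model='beyond/sega-base')
```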

## Intended uses & limitations

### Limitations and bias

## Training data

## Training procedure

### Preprocessing

### Pretraining

## Evaluation results

### BibTeX entry and citation info