metadata

language: en
tags:
  - augmentation
license: apache-2.0
datasets:
  - C4
widget:
  - text: >-
      <mask> machine learning <mask> my research interest <mask> data science
      <mask>

SEGA-large model

SEGA: SkEtch-based Generative Augmentation

SEGA is a general-purpose text augmentation model for a range of NLP tasks, including sentiment analysis, topic classification, NER, and QA. It uses an encoder-decoder structure based on the BART architecture and is pre-trained on the C4-realnewslike corpus.

Paper: this paper
GitHub: this repository
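
The sketch format shown in the widget example (key fragments separated by `<mask>` tokens) can be illustrated with a minimal sketch. It assumes the checkpoint is published on the Hugging Face Hub under the ID `beyond/sega-large` and loads through the standard `transformers` seq2seq classes; the helper names `build_sketch` and `augment`, the model ID, and the generation settings are illustrative assumptions, not taken from the authors' code.

```python
MASK = "<mask>"  # BART-style mask token, as used in the widget example


def build_sketch(fragments, mask=MASK):
    """Join key fragments into a sketch with a mask before, between, and after them."""
    cleaned = [f.strip() for f in fragments if f.strip()]
    if not cleaned:
        return mask
    return f"{mask} " + f" {mask} ".join(cleaned) + f" {mask}"


def augment(sketch, model_id="beyond/sega-large", max_length=200):
    """Generate one text from a sketch.

    `model_id` is an assumed Hub ID; the model is downloaded on first use.
    """
    # Imported lazily so build_sketch works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(sketch, return_tensors="pt")
    # Sampling gives varied outputs, which is usually what augmentation wants.
    output_ids = model.generate(**inputs, max_length=max_length, do_sample=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


sketch = build_sketch(["machine learning", "my research interest", "data science"])
print(sketch)
```

Calling `augment(sketch)` several times should then yield different full texts built around the same fragments, which could be added to a training set as augmented samples.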

Model description

Model variations

| Model | #params | Language |
|---|---|---|
| `sega-large` | xM | English |
| `sega-base` | xM | English |
| `sega-small` | xM | English |
| `sega-large-chinese` | xM | Chinese |
| `sega-base-chinese` | xM | Chinese |
| `sega-small-chinese` | xM | Chinese |

