---
language: en
tags:
  - augmentation
license: apache-2.0
datasets:
  - C4
widget:
  - text: >-
      <mask> Conference on Empirical Methods <mask> submission of research
      papers <mask> Deep Learning <mask>
    example_title: Example 1
  - text: >-
      <mask> machine learning <mask> my research interest <mask> data science
      <mask>
    example_title: Example 2
  - text: >-
      <mask> play basketball <mask> a strong team <mask> Shanghai University of
      Finance and Economics <mask> last Sunday <mask>
    example_title: Example 3
  - text: >-
      Good news: <mask> the European Union <mask> month by EU <mask> Farm
      Commissioner Franz <mask>
    example_title: Example with a prompt 1
  - text: >-
      Bad news: <mask> the European Union <mask> month by EU <mask> Farm
      Commissioner Franz <mask>
    example_title: Example with a prompt 2
inference:
  parameters:
    max_new_tokens: 200
    top_k: 3
    do_sample: true
---

# SEGA-large model

**SEGA: SkEtch-based Generative Augmentation**

SEGA is a general text augmentation model: given a sketch consisting of key words or phrases joined by `<mask>` tokens, it generates complete, fluent text around them. This makes it useful for data augmentation across a range of NLP tasks, including sentiment analysis, topic classification, NER, and QA. SEGA uses an encoder-decoder structure (based on the BART architecture) and is pre-trained on the C4-realnewslike corpus.

## Model description

## Model variations

| Model | #params | Language |
|---|---|---|
| `sega-large` | xM | English |
| `sega-base` | xM | English |
| `sega-small` | xM | English |
| `sega-large-chinese` | xM | Chinese |
| `sega-base-chinese` | xM | Chinese |
| `sega-small-chinese` | xM | Chinese |

## Intended uses & limitations

### How to use
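
This card does not yet include usage code, so the snippet below is a minimal sketch rather than an official example. It assumes the checkpoint can be loaded through the Hugging Face Transformers `text2text-generation` pipeline (SEGA is BART-based) and that the repository ID matches this repo (`beyond/genius-large`); the input sketch and the generation settings are taken from the widget configuration above.

```python
from transformers import pipeline

# Load SEGA as a text-to-text generation pipeline.
# NOTE: the repo ID is an assumption based on this repository; adjust as needed.
sega = pipeline("text2text-generation", model="beyond/genius-large")

# A "sketch" lists the spans to keep, joined by <mask> tokens for the model to fill in.
sketch = (
    "<mask> Conference on Empirical Methods <mask> submission of "
    "research papers <mask> Deep Learning <mask>"
)

# Generation settings mirror the widget configuration above.
outputs = sega(sketch, max_new_tokens=200, top_k=3, do_sample=True)
print(outputs[0]["generated_text"])
```

Because sampling is enabled (`do_sample=True`), repeated calls produce different passages built around the same sketch, which is what makes the model usable for data augmentation.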

### Limitations and bias

## Training data

## Training procedure

### Preprocessing

### Pretraining

## Evaluation results

### BibTeX entry and citation info