metadata

license: apache-2.0
language:
  - zh
library_name: transformers
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.7
    top_p: 0.6
    repetition_penalty: 1.1
    max_new_tokens: 128
    num_return_sequences: 3
    do_sample: true
tags:
  - art
widget:
  - 笔底江山助磅礴
  - (唐诗：秋思)风
  - (宋词：浣溪沙)秋
  - (对联)冬

Chinese Poem and Couplet Small GPT2 Model

Model description

This model generates classical Chinese poems and couplets. It is based on IDEA-CCNL/Wenzhong-GPT2-110M.

How to use

You can use the model directly with a text-generation pipeline. The example below shows the output when the parameter skip_special_tokens is True:

>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> # The checkpoint is loaded with BertTokenizer, not a GPT-2 tokenizer
>>> tokenizer = BertTokenizer.from_pretrained("snzhang/GPT2-Poem-Small")
>>> model = GPT2LMHeadModel.from_pretrained("snzhang/GPT2-Poem-Small")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> # do_sample=True draws tokens at random, so the output differs between runs
>>> text_generator("笔底江山助磅礴", max_length=50, do_sample=True)
    [{'generated_text': '笔底江山助磅礴，万卷诗书见成章。'}]

You can also prepend a prefix to steer generation toward a specific form: "(唐诗：your title)" for a Tang poem, "(宋词：your title)" for a Song ci, or "(对联)" for a couplet.
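
The prefix convention can be wrapped in a small helper, sketched below. The names build_prompt, GENERATION_KWARGS and generate are hypothetical and not part of this model or of transformers. The prefix layout (ASCII parentheses, full-width colon, opening characters placed directly after the prefix) is inferred from the widget examples in this card, and the sampling values are copied from the inference parameters in the card's metadata.

```python
# Sampling settings copied from the "inference: parameters" metadata of this card.
GENERATION_KWARGS = {
    "temperature": 0.7,
    "top_p": 0.6,
    "repetition_penalty": 1.1,
    "max_new_tokens": 128,
    "num_return_sequences": 3,
    "do_sample": True,
}


def build_prompt(form: str, start: str = "", title: str = "") -> str:
    """Build a prefixed prompt (hypothetical helper).

    form  -- "唐诗" (Tang poem), "宋词" (Song ci) or "对联" (couplet)
    start -- optional opening characters for the model to continue
    title -- poem title; required for 唐诗 and 宋词, ignored for 对联
    """
    if form not in ("唐诗", "宋词", "对联"):
        raise ValueError(f"unsupported form: {form}")
    if form == "对联":
        return f"(对联){start}"
    if not title:
        raise ValueError("唐诗 and 宋词 prompts need a title")
    # Full-width colon, matching the widget examples.
    return f"({form}：{title}){start}"


def generate(prompt: str):
    """Load the model and sample continuations for a prompt.

    transformers is imported lazily so build_prompt can be used without it.
    """
    from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

    tokenizer = BertTokenizer.from_pretrained("snzhang/GPT2-Poem-Small")
    model = GPT2LMHeadModel.from_pretrained("snzhang/GPT2-Poem-Small")
    text_generator = TextGenerationPipeline(model, tokenizer)
    return text_generator(prompt, **GENERATION_KWARGS)
```

For example, generate(build_prompt("宋词", "秋", title="浣溪沙")) sends the prompt "(宋词：浣溪沙)秋" to the model and returns three sampled continuations, since num_return_sequences is 3.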

Training data

The training data contains 71,334 classical Chinese poems and couplets collected from the Chinese Poetry and Couplet Dataset.

More Details

More details are available in GPT2-Poem-Small.