---
language:
- zh
tags:
- songnet
- pytorch
- zh
- Text2Text-Generation
license: "apache-2.0"
widget:
- text: "丹枫江冷人初去"
---

# SongNet for Chinese Couplet (songnet-base-chinese-couplet) Model

SongNet model for Chinese couplet generation.

Evaluation of `songnet-base-chinese-couplet` on the couplet test data:

The overall performance of the model on the couplet **test** set:

|input_text|target_text|pred|
|:--- |:--- |:-- |
|春回大地,对对黄莺鸣暖树|日照神州,群群紫燕衔新泥|福至人间,家家紫燕舞和风|

On the couplet test set, the generated results satisfy the requirements of equal character count, part-of-speech alignment, word-level alignment, and similar form. With its purpose-built network structure, SongNet performs markedly better than models such as T5 and GPT2 on semantic parallelism and on compliance with tonal patterns (平仄).

SongNet network structure:

![arch](songnet-network.png)

## Usage

This model is open-sourced as part of the text generation project [textgen](https://github.com/shibing624/textgen), which supports SongNet models. It can be called as follows.

Install package:

```shell
pip install -U textgen
```

```python
import sys

# Only needed when running from inside a checkout of the textgen repository
sys.path.append('..')
from textgen.language_modeling import SongNetModel

model = SongNetModel(model_type='songnet', model_name='songnet-base-chinese-couplet')

# Generate continuations for complete inputs
sentences = [
    "严蕊如梦令道是梨花不是。道是杏花不是。白白与红红,别是东风情味。曾记。曾记。人在武陵微醉。",
    "一句相思吟岁月千杯美酒醉风情",
    "几树梅花数竿竹一潭秋水半屏山",
    "未舍东江开口咏且施妙手点睛来",
    "一去二三里烟村四五家",
]
print("inputs:", sentences)
print("outputs:", model.generate(sentences))

# Fill in the positions marked with underscores
sentences = [
    "一句____月千杯美酒__情",
    "一去二三里烟村__家亭台__座八__枝花",
]
print("inputs:", sentences)
print("outputs:", model.fill_mask(sentences))
```

Model files:

```
songnet-base-chinese-couplet
├── pytorch_model.bin
└── vocab.txt
```

### Training dataset

#### Chinese couplet dataset

- Data: [couplet dataset (GitHub)](https://github.com/wb14123/couplet-dataset), [cleaned couplet dataset (GitHub)](https://github.com/v-zich/couplet-clean-dataset)
- Related resources
  - [Huggingface](https://huggingface.co/)
  - [SongNet paper](https://aclanthology.org/2020.acl-main.68/)
  - [textgen](https://github.com/shibing624/textgen)

Data format:

```text
head -n 1 couplet_files/couplet/train/in.txt
晚 风 摇 树 树 还 挺
head -n 1 couplet_files/couplet/train/out.txt
晨 露 润 花 花 更 红
```

To train a SongNet model, see [https://github.com/shibing624/textgen/blob/main/examples/language_generation/training_zh_songnet_demo.py](https://github.com/shibing624/textgen/blob/main/examples/language_generation/training_zh_songnet_demo.py).

## Citation

```latex
@software{textgen,
  author = {Xu Ming},
  title = {textgen: Implementation of Text Generation models},
  year = {2022},
  url = {https://github.com/shibing624/textgen},
}
```
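
The data format shown earlier stores each couplet across two parallel files: line *i* of `in.txt` holds the first line and line *i* of `out.txt` holds the second, with characters separated by spaces. The sketch below is one way to read such files and check the equal-character-count requirement mentioned in the evaluation notes. The helper names `parse_line`, `load_pairs`, and `same_length` are hypothetical and are not part of textgen; the sketch assumes UTF-8 files with one couplet line per row.

```python
from pathlib import Path
from typing import List, Tuple


def parse_line(line: str) -> str:
    """Join space-separated characters back into one string (hypothetical helper)."""
    return "".join(line.split())


def load_pairs(in_path: str, out_path: str) -> List[Tuple[str, str]]:
    """Pair line i of in_path with line i of out_path (assumes UTF-8 text files)."""
    first_lines = Path(in_path).read_text(encoding="utf-8").splitlines()
    second_lines = Path(out_path).read_text(encoding="utf-8").splitlines()
    if len(first_lines) != len(second_lines):
        raise ValueError("both files must contain the same number of lines")
    return [(parse_line(a), parse_line(b)) for a, b in zip(first_lines, second_lines)]


def same_length(first: str, second: str) -> bool:
    """Return True when both couplet lines have the same number of characters."""
    return len(first) == len(second)


if __name__ == "__main__":
    # The sample rows shown in the data format section
    first = parse_line("晚 风 摇 树 树 还 挺")
    second = parse_line("晨 露 润 花 花 更 红")
    print(first, second, same_length(first, second))
```

A length check of this kind only covers the simplest of the listed constraints; part-of-speech alignment and tonal patterns would need additional linguistic resources.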