chinese-kenlm-klm / README.md
metadata
language:
  - zh
tags:
  - chatglm
  - pytorch
  - zh
  - Text2Text-Generation
license: bigscience-bloom-rail-1.0
widget:
  - text: |-
      对下面中文拼写纠错:
      少先队员因该为老人让坐。
      答:

Chinese Language Model (kenlm)

kenlm language models:

  • big model: zh_giga.no_cna_cmn.prune01244.klm
  • small model: people2014corpus_chars.klm
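As a rough illustration of what these files contain, the sketch below scores sentences directly with the `kenlm` Python module, assuming it is installed and that the small model file has been downloaded to the working directory. It also assumes the model is character-level (as the `_chars` suffix suggests), so the input is split into space-separated characters before scoring. The helper names `to_char_tokens`, `score_sentence`, and `load_model` are hypothetical and not part of kenlm or pycorrector.

```python
import os


def to_char_tokens(sentence: str) -> str:
    """Join the non-whitespace characters of a sentence with single spaces.

    Assumption: the model was trained on space-separated characters.
    """
    return " ".join(ch for ch in sentence if not ch.isspace())


def score_sentence(model, sentence: str) -> float:
    """Return the log10 probability of a sentence under a kenlm-style model."""
    return model.score(to_char_tokens(sentence), bos=True, eos=True)


def load_model(path: str):
    """Load a binary .klm file; kenlm is imported lazily so the helpers above work without it."""
    import kenlm

    return kenlm.Model(path)


if __name__ == "__main__":
    # Assumed local path to the downloaded small model.
    model_path = "people2014corpus_chars.klm"
    if os.path.exists(model_path):
        lm = load_model(model_path)
        for text in ["少先队员因该为老人让坐", "少先队员应该为老人让座"]:
            # A higher (less negative) score means the model finds the sentence more likely.
            print(text, score_sentence(lm, text))
```

If the character-level assumption holds, the corrected sentence would be expected to score higher than the misspelled one, which is the signal a corrector can use to rank candidate replacements.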

Usage

These models are open-sourced as part of the pycorrector project, which supports kenlm models. They can be used as follows:

Install the package:

pip install -U pycorrector

Then load a model file and correct a sentence:

from pycorrector import Corrector

# Path to a local copy of the small model file
model = Corrector(language_model_path='people2014corpus_chars.klm')
print(model.correct('少先队员因该为老人让坐')) # ['少先队员应该为老人让座。']

To train a text error correction model, see https://github.com/shibing624/pycorrector

Citation

@software{pycorrector,
  author = {Ming Xu},
  title = {pycorrector: Text Error Correction Tool},
  year = {2023},
  url = {https://github.com/shibing624/pycorrector},
}