---
language:
- zh
tags:
- chatglm
- pytorch
- zh
- Text2Text-Generation
license: bigscience-bloom-rail-1.0
widget:
- text: "对下面中文拼写纠错:\n少先队员因该为老人让坐。\n答:"
---

# Chinese Language Model (kenlm)

kenlm language models:

- big model: zh_giga.no_cna_cmn.prune01244.klm
- small model: people2014corpus_chars.klm

## Usage

These models are open-sourced as part of the [pycorrector](https://github.com/shibing624/pycorrector) project, which supports kenlm models. Use them as follows:

Install the package:

```shell
pip install -U pycorrector
```

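```python
from pycorrector import Corrector

# Point the corrector at the small kenlm model listed above (local file path)
model = Corrector(language_model_path='people2014corpus_chars.klm')

# Correct a sentence containing spelling errors
print(model.correct('少先队员因该为老人让坐'))  # ['少先队员应该为老人让座。']
```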
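
The `.klm` files can also be queried directly as plain n-gram language models. The following is a minimal sketch, not part of pycorrector: it assumes the `kenlm` Python bindings are installed (`pip install kenlm`) and that the small model expects space-separated characters as input, which is an assumption based on the `_chars` suffix in its file name. The helper names `to_char_tokens`, `pick_most_fluent`, and `load_kenlm_scorer` are hypothetical and exist only for illustration.

```python
from typing import Callable, Iterable


def to_char_tokens(sentence: str) -> str:
    """Split a sentence into space-separated characters.

    Assumption: a character-level model is queried one character per token.
    """
    return " ".join(ch for ch in sentence if not ch.isspace())


def pick_most_fluent(candidates: Iterable[str],
                     score_fn: Callable[[str], float]) -> str:
    """Return the candidate whose tokenized form gets the highest score."""
    return max(candidates, key=lambda s: score_fn(to_char_tokens(s)))


def load_kenlm_scorer(model_path: str) -> Callable[[str], float]:
    """Build a scoring function backed by a kenlm model file."""
    import kenlm  # imported lazily so the helpers above work without kenlm

    model = kenlm.Model(model_path)
    # score() returns the log10 probability of the whole sentence
    return lambda text: model.score(text, bos=True, eos=True)
```

With a model file available locally, `pick_most_fluent(['少先队员因该为老人让坐', '少先队员应该为老人让座'], load_kenlm_scorer('people2014corpus_chars.klm'))` returns whichever of the two sentences the language model scores as more probable.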

To train a text error correction model, see [https://github.com/shibing624/pycorrector](https://github.com/shibing624/pycorrector).

## Citation

```latex
@software{pycorrector,
  author = {Ming Xu},
  title = {pycorrector: Text Error Correction Tool},
  year = {2023},
  url = {https://github.com/shibing624/pycorrector},
}
```