This model is based on `Langboat/mengzi-t5-base` and was pre-trained on 11 Chinese dialogue datasets. It was trained on 8 NVIDIA A100 GPUs for 2 days.

To load this model:

```python
from collections import OrderedDict

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Build the base architecture and tokenizer from the original Mengzi checkpoint.
model_path = 'Langboat/mengzi-t5-base'
model = T5ForConditionalGeneration.from_pretrained(model_path)
tokenizer = T5Tokenizer.from_pretrained(model_path)

# Load the dialogue checkpoint onto the CPU.
ckp_path = 'mengzi-t5-base-chinese-dialogue/pytorch_model.ckpt'
ckpt = torch.load(ckp_path, map_location="cpu")

# Strip the 'model.' prefix from the checkpoint keys so they line up with
# the parameter names of T5ForConditionalGeneration.
old_state_dict = ckpt['state_dict']
new_state_dict = OrderedDict()
for k, v in old_state_dict.items():
    new_state_dict[k.replace('model.', '')] = v

# strict=False skips keys that do not match instead of raising an error.
model.load_state_dict(new_state_dict, strict=False)
```
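
The key-renaming step can be checked on its own before loading any weights. The following is a minimal sketch, not part of the released code: `strip_prefix` is a hypothetical helper, and the toy keys stand in for the real contents of `ckpt['state_dict']`. Unlike a plain `str.replace`, it only removes `model.` when it appears at the start of a key.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="model."):
    """Return a copy of state_dict with a leading prefix removed from each key.

    Hypothetical helper for illustration; not part of the original checkpoint code.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        # Only strip when the key actually starts with the prefix, so a
        # "model." substring elsewhere in a key is left untouched.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        renamed[new_key] = value
    return renamed


# Toy stand-in for ckpt['state_dict']; real values would be tensors.
example = OrderedDict([
    ("model.shared.weight", 0),
    ("model.encoder.final_layer_norm.weight", 1),
    ("lm_head.weight", 2),
])
cleaned = strip_prefix(example)
print(list(cleaned))
```

Because `strict=False` does not raise on mismatched names, it may be worth inspecting the value returned by `model.load_state_dict(...)`: its `missing_keys` and `unexpected_keys` lists show which parameters were not matched.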