DialoT5
This model is based on Langboat/mengzi-t5-base and further pre-trained on 11 Chinese dialogue datasets. Training took 2 days on 8 Tesla A100 GPUs.
To load this model:
```python
import torch
from collections import OrderedDict
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the base Mengzi-T5 architecture and tokenizer
model_path = 'Langboat/mengzi-t5-base'
model = T5ForConditionalGeneration.from_pretrained(model_path)
tokenizer = T5Tokenizer.from_pretrained(model_path)

# Load the dialogue-pretrained checkpoint from this repository
ckpt_path = 'mengzi-t5-base-chinese-dialogue/pytorch_model.ckpt'
ckpt = torch.load(ckpt_path, map_location="cpu")

# The checkpoint stores the weights under 'state_dict' with a 'model.' prefix;
# strip the prefix so the keys match the Hugging Face model.
old_state_dict = ckpt['state_dict']
new_state_dict = OrderedDict()
for k, v in old_state_dict.items():
    new_state_dict[k.replace('model.', '')] = v
model.load_state_dict(new_state_dict, strict=False)
```
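Once the weights are loaded, the model can be used like any other `T5ForConditionalGeneration`. Below is a minimal generation sketch; the prompt and the generation settings (`max_new_tokens`, `num_beams`) are illustrative assumptions, not the exact input format used during training.
```python
# Minimal generation sketch; the prompt below is only illustrative.
input_text = "你好，最近怎么样？"  # "Hi, how have you been lately?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,  # cap the length of the generated reply
    num_beams=4,        # simple beam search; tune as needed
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```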