This model is based on Langboat/mengzi-t5-base and further pre-trained on 11 Chinese dialogue datasets. Training took 2 days on 8 NVIDIA A100 GPUs.

To load this model:

    import torch
    from collections import OrderedDict
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # Start from the base model and tokenizer
    model_path = 'Langboat/mengzi-t5-base'
    model = T5ForConditionalGeneration.from_pretrained(model_path)
    tokenizer = T5Tokenizer.from_pretrained(model_path)

    # Load the dialogue checkpoint on CPU
    ckp_path = 'mengzi-t5-base-chinese-dialogue/pytorch_model.ckpt'
    ckpt = torch.load(ckp_path, map_location="cpu")

    # Checkpoint keys carry a 'model.' prefix; remove it so they match the T5 parameter names
    old_state_dict = ckpt['state_dict']
    new_state_dict = OrderedDict()
    for k, v in old_state_dict.items():
        new_state_dict[k.replace('model.', '')] = v

    # strict=False skips keys that do not match instead of raising an error
    model.load_state_dict(new_state_dict, strict=False)
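
Because `strict=False` silently skips keys that do not match, it can be worth checking that the key renaming did what was intended. The following is a minimal sketch, not part of the original loading code: `strip_prefix` is a hypothetical helper that removes only a leading `model.` from each key (rather than every occurrence), and the toy dictionary uses placeholder values with illustrative key names instead of real tensors.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="model."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper for illustration; keys without the prefix are kept as-is.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Toy stand-in for ckpt['state_dict']: placeholder values, illustrative key names.
example = {
    "model.shared.weight": 0,
    "model.encoder.final_layer_norm.weight": 1,
    "lm_head.weight": 2,
}
cleaned = strip_prefix(example)
print(list(cleaned))
```

With the real checkpoint, the same helper could be applied as `result = model.load_state_dict(strip_prefix(ckpt['state_dict']), strict=False)`. The returned object exposes `missing_keys` and `unexpected_keys`; a long `unexpected_keys` list would suggest that the prefix assumption does not hold for this checkpoint.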