summary
This model is bigcode/starcoder fine-tuned on code-generation datasets and natural-language instruction datasets (Chinese and English).
dataset
- codegen-instruct
- zirui3/TSSB-3M-instructions (Python code bug fixes)
- FLAN (English)
- OIG (Open-Assistant, English)
- zirui3/zhihu_qa (Chinese)
- COIG (Chinese)
- pCLUE (Chinese)
- zirui3/cMedQA2-instructions (Chinese medical domain)
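
To see at a glance how the mix splits across code, English, and Chinese, the list above can be written as a small Python structure. This is only an illustrative sketch: the names `DATASETS` and `group_by_language` are hypothetical, the "code" tag for the first two entries is an assumption made for grouping, and nothing here reflects the actual training configuration or sampling weights, which this card does not state.

```python
from collections import defaultdict

# (name, language tag, note) triples transcribed from the dataset list.
# The "code" tag is an assumption used only for grouping; the card does not
# state a natural language for the code datasets.
DATASETS = [
    ("codegen-instruct", "code", ""),
    ("zirui3/TSSB-3M-instructions", "code", "Python code bug fixes"),
    ("FLAN", "english", ""),
    ("OIG", "english", "Open-Assistant"),
    ("zirui3/zhihu_qa", "chinese", ""),
    ("COIG", "chinese", ""),
    ("pCLUE", "chinese", ""),
    ("zirui3/cMedQA2-instructions", "chinese", "medical domain"),
]


def group_by_language(datasets):
    """Return a dict mapping each language tag to its dataset names, in list order."""
    groups = defaultdict(list)
    for name, language, _note in datasets:
        groups[language].append(name)
    return dict(groups)


if __name__ == "__main__":
    for language, names in group_by_language(DATASETS).items():
        print(f"{language}: {', '.join(names)}")
```

Grouped this way, the mix contains two code datasets, two English datasets, and four Chinese datasets.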