How to convert a Megatron model to Hugging Face?
#6 opened by cdj0311
Hi,
I want to convert a Megatron model (trained by myself with the bigcode-project/Megatron-LM repo) to Hugging Face format. Can you provide a script to convert it?
You can clone this repo: https://github.com/bigcode-project/transformers/ and use the conversion code there.
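For context, the end-to-end flow with that fork looks roughly like the sketch below. This is a minimal sketch, not verified against the fork: CONVERT_SCRIPT, CKPT_DIR, and HF_OUTPUT_DIR are placeholders, and the flag names are illustrative. Mainline transformers ships a similar converter as src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py; look for the GPTBigCode equivalent inside the cloned repo and check its --help for the real arguments.

# Clone the fork and install it so the GPTBigCode model code is available.
git clone https://github.com/bigcode-project/transformers.git
cd transformers
pip install -e .

# CONVERT_SCRIPT stands for the Megatron -> Hugging Face conversion script in
# the fork; CKPT_DIR is your Megatron checkpoint directory and HF_OUTPUT_DIR
# the output directory. Flag names here are placeholders, not verified.
python CONVERT_SCRIPT \
    --load_path CKPT_DIR \
    --save_path HF_OUTPUT_DIR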
Thanks,
but this code only supports converting models with 1-way tensor/pipeline parallelism. How do I convert a model when tensor/pipeline parallelism > 1?
You need to merge the model-parallel partitions with Megatron-LM before the conversion, with something like this:
python Megatron-LM/tools/checkpoint_util.py \
--model-type GPT \
--load-dir CKPT_DIR \
--save-dir OUTPUT_PATH \
--target-tensor-parallel-size 1 \
--target-pipeline-parallel-size 1
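Once the merge has written a single TP=1/PP=1 checkpoint to OUTPUT_PATH, run the conversion script from the fork against that directory instead of the original sharded one. As a quick sanity check afterwards, the converted folder should load with the standard transformers API; a minimal sketch, where HF_OUTPUT_DIR is a placeholder for whatever directory the conversion produced:

# Load the converted checkpoint with plain transformers and print its config
# and parameter count. HF_OUTPUT_DIR is a placeholder for the conversion output.
python -c "
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained('HF_OUTPUT_DIR')
print(model.config)
print(f'{sum(p.numel() for p in model.parameters()):,} parameters')
"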
cdj0311 changed discussion status to closed