Smaller versions of StarCoder
StarCoder is indeed state-of-the-art in my experience using it on several tasks, but the 15.5B model is too large for some personal use cases.
Would you consider pre-training and releasing smaller versions, say 3B or 7B? I would really appreciate that.
Have you tried loading it with load_in_8bit?
https://huggingface.co/docs/transformers/main_classes/quantization
There is a TinyStarCoder: https://huggingface.co/bigcode/tiny_starcoder_py, but at 164M parameters it seems too small to perform well.
Yes we'll probably release smaller checkpoints in the range of 3B-7B in the upcoming months.
It would be great to have smaller checkpoints in the 3B-7B range.
The 1B, 3B, and 7B models were released a few months ago.
Thanks, I found these:
1B: https://huggingface.co/bigcode/starcoderbase-1b
3B: https://huggingface.co/bigcode/starcoderbase-3b
7B: https://huggingface.co/bigcode/starcoderbase-7b
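For picking a checkpoint that fits a given machine, a rough rule of thumb is parameters × 2 bytes for fp16 weights (× 1 byte for int8). A small helper sketching that estimate (weight-only; it ignores activations and KV cache, so treat the numbers as lower bounds):

```python
def weight_memory_gb(n_params_billion, bytes_per_param=2):
    """Approximate weight-only memory in GB; 2 bytes/param = fp16, 1 = int8."""
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

# Released StarCoderBase sizes plus the original 15.5B model
for size in (1, 3, 7, 15.5):
    fp16 = weight_memory_gb(size)
    int8 = weight_memory_gb(size, bytes_per_param=1)
    print(f"{size}B params: ~{fp16:.1f} GB fp16, ~{int8:.1f} GB int8")
```

By this estimate the 7B checkpoint fits a 24 GB GPU in fp16 with headroom, while 15.5B needs int8 quantization or multiple devices.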
Are the Megatron weights planned to be released as well?
Or is there a way to convert these checkpoints to Megatron format, so we could fine-tune them using bigCode/Megatron-LM?