Large Multilingual Model

by stevenhillis - opened

Any plans to release an mdeberta-v3-large model? I'd really like to see a large multilingual variant!

I'm unclear on whether mdeberta-v3-base was trained on all of CC100's languages or some subset (there are only 16 specifically mentioned languages). Could someone please clarify?

@Peterr According to the paper, the model was trained on the 2.5T CC100 multilingual dataset, the same data used for XLM-R, so that would be all CC100 languages. However, the model is evaluated on the XNLI dataset, which has 16 languages.

