
cosmo2-tokenizer

Tokenizer used for training cosmo2. It was trained on 1M samples drawn from the following mixture:

  • FineWeb-Edu 70%
  • Cosmopedia v2 15%
  • StarCoderData 8%
  • OpenWebMath 5%
  • Stack Overflow 2%
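
To make the mixture concrete, the percentages can be turned into approximate per-source sample counts. This is only an illustrative sketch: it assumes that "1M" means exactly 1,000,000 samples and that the percentages apply to sample counts rather than tokens or bytes, neither of which is stated in this card. The function name `samples_per_source` is hypothetical.

```python
# Mixture weights in percent, copied from the list above.
MIXTURE_PERCENT = {
    "FineWeb-Edu": 70,
    "Cosmopedia v2": 15,
    "StarCoderData": 8,
    "OpenWebMath": 5,
    "Stack Overflow": 2,
}


def samples_per_source(total_samples, mixture_percent):
    """Split total_samples across sources according to integer percentages.

    Assumption: the percentages are fractions of the sample count,
    which the model card does not state explicitly.
    """
    if sum(mixture_percent.values()) != 100:
        raise ValueError("mixture percentages must sum to 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return {
        name: total_samples * pct // 100
        for name, pct in mixture_percent.items()
    }


# Assumption: "1M samples" is taken as exactly 1,000,000.
counts = samples_per_source(1_000_000, MIXTURE_PERCENT)
for name, n in counts.items():
    print(f"{name}: {n:,}")
```

Under these assumptions, FineWeb-Edu would contribute about 700,000 samples and Stack Overflow about 20,000.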