bigscience-catalogue-data-dev/byte-level-bpe-tokenizer-no-norm-250k-whitespace-and-eos-regex-alpha-v3-dedup-lines-articles
Branch: main
2 contributors, 3 commits
Latest commit: SaulLu, "Create README.md" (91b871b), over 2 years ago
File                     Size           Scan  Last commit       Updated
.gitattributes           1.23 kB        Safe  Add tokenizer     over 2 years ago
README.md                565 Bytes      Safe  Create README.md  over 2 years ago
special_tokens_map.json  85 Bytes       Safe  Add tokenizer     over 2 years ago
tokenizer.json           14.5 MB (LFS)  Safe  Add tokenizer     over 2 years ago
tokenizer_config.json    131 Bytes      Safe  Add tokenizer     over 2 years ago