SO0529 committed
Commit 7bea384
Parent(s): 39f56ae
modify: Readme contents
README.md CHANGED
@@ -76,7 +76,7 @@ for gen_text in tokenizer.batch_decode(gen_tokens, skip_special_tokens=True):
 The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz), [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch), and [Japanese OSCAR](https://huggingface.co/datasets/oscar).
 
 # Tokenization
-The model uses a [special sub-word tokenizer](https://github.com/tanreinama/Japanese-BPEEncoder_V2). Please refer the original repository or [GPT-
+The model uses a [special sub-word tokenizer](https://github.com/tanreinama/Japanese-BPEEncoder_V2). Please refer the original repository or [GPT-NoeX-Japanese](https://huggingface.co/docs/transformers/model_doc/gpt_neox_japanese) in detail.
 
 # Licenese
 [The MIT license](https://opensource.org/licenses/MIT)
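For context, a minimal sketch of how the tokenizer referenced in the changed line can be used through the `GPTNeoXJapanese` classes that the newly added link documents. The Hub model id below is a placeholder assumption; the commit itself does not name the repository:

```python
# Minimal usage sketch, not part of the commit. Assumes the model is
# published on the Hugging Face Hub; the id below is hypothetical.
from transformers import GPTNeoXJapaneseForCausalLM, GPTNeoXJapaneseTokenizer

model_id = "your-org/gpt-neox-japanese-model"  # hypothetical id, replace with the real repo

# The tokenizer class wraps the special sub-word tokenizer
# (Japanese-BPEEncoder_V2) that the README links to.
tokenizer = GPTNeoXJapaneseTokenizer.from_pretrained(model_id)
model = GPTNeoXJapaneseForCausalLM.from_pretrained(model_id)

# Encode a short Japanese prompt, generate, then decode, mirroring the
# batch_decode call visible in the hunk context above.
input_ids = tokenizer("人とAIが協調するためには、", return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=50)
for gen_text in tokenizer.batch_decode(gen_tokens, skip_special_tokens=True):
    print(gen_text)
```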