Commit 6191180 (1 parent: f8b286f), committed by hiroshi-matsuda-rit
Update README.md
README.md CHANGED
@@ -8,4 +8,4 @@ datasets:
 # BERT base Japanese (character-level tokenization with whole word masking, jawiki-20200831)
 
 This pretrained model is almost the same as [cl-tohoku/bert-base-japanese-char-v2](https://huggingface.co/cl-tohoku/bert-base-japanese-char-v2) but does not need `fugashi` or `unidic_lite`.
-The only difference is in `
+The only difference is in the `word_tokenizer_type` property (specify `basic` instead of `mecab`) in `tokenizer_config.json`.
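As a rough illustration of the change the README line describes, the sketch below switches the word tokenizer setting in a `tokenizer_config.json`-style dictionary from `mecab` to `basic`. Only the key name and the two values come from the README text; the starting dictionary and the helper `use_basic_word_tokenizer` are hypothetical, and a real config file would contain other keys that are not shown here.

```python
import json

# Hypothetical starting config: only the key name and the values
# "mecab" / "basic" are taken from the README; a real
# tokenizer_config.json has additional keys that are omitted here.
original_config = {"word_tokenizer_type": "mecab"}


def use_basic_word_tokenizer(config: dict) -> dict:
    """Return a copy of `config` with the word tokenizer set to `basic`."""
    updated = dict(config)  # shallow copy so the input is left untouched
    updated["word_tokenizer_type"] = "basic"
    return updated


updated_config = use_basic_word_tokenizer(original_config)

# Serialize the way the setting would appear in a JSON config file.
print(json.dumps(updated_config, indent=2))
```

Because `basic` replaces `mecab`, a config like this would no longer call for the MeCab-related packages (`fugashi`, `unidic_lite`) that the README says are not needed.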