Update README.md
README.md CHANGED
@@ -10,7 +10,7 @@ This BERT model was trained using the Megatron-LM library.
 The size of the model is a regular BERT-large with 340M parameters.
 The model was trained on about 70GB of data, consisting mostly of OSCAR and Swedish newspaper text curated by the National Library of Sweden.
 
-Training was done for
+Training was done for 165k training steps using a batch size of 8k; the total number of training steps is set to 500k, so this version is an intermediate checkpoint.
 The hyperparameters for training followed the settings for RoBERTa.
 
 
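Since the README describes a standard BERT-large checkpoint, a minimal usage sketch along the following lines would apply once the Megatron-LM weights are converted to the Hugging Face format. Nothing here comes from this commit: the repository id, the example sentence, and the conversion step are all assumptions.

```python
# A minimal usage sketch, not taken from this repository: it assumes the
# Megatron-LM weights have already been converted to the Hugging Face
# format, and the repository id below is a hypothetical placeholder.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "org/megatron-bert-large-swedish"  # hypothetical placeholder id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask probe on a Swedish sentence, using BERT's standard [MASK] token.
inputs = tokenizer("Huvudstaden i Sverige är [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring token at the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = logits[0, mask_pos].argmax(-1).item()
print(tokenizer.decode([predicted_id]))
```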