---
language:
- sv
---
|
|
|
# Megatron-BERT-base Swedish 125k |
|
|
|
This BERT model was trained with the Megatron-LM library. It is a standard BERT-base model with 110M parameters, trained on about 70 GB of data consisting mostly of OSCAR and Swedish newspaper text curated by the National Library of Sweden.
|
|
|
Training ran for 125k steps. Its [sister model](https://huggingface.co/KBLab/megatron-bert-base-swedish-cased-600k) used the same setup but was trained for 600k steps.
|
|
|
|
|
The model has three sister models trained on the same dataset: |
|
- [🤗 BERT Swedish](https://huggingface.co/KBLab/bert-base-swedish-cased-new) |
|
- [Megatron-BERT-base-600k](https://huggingface.co/KBLab/megatron-bert-base-swedish-cased-600k) |
|
- [Megatron-BERT-large-110k](https://huggingface.co/KBLab/megatron-bert-large-swedish-cased-110k) |
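
A minimal masked-language-modeling sketch with 🤗 Transformers is shown below. The repository id `KBLab/megatron-bert-base-swedish-cased-125k` is an assumption based on the naming of the sister models; substitute the actual id of this model card.

```python
# Minimal masked-language-modeling sketch with 🤗 Transformers.
# NOTE: the model id below is an assumption following the sister models' naming.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "KBLab/megatron-bert-base-swedish-cased-125k"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Swedish example sentence with a masked token.
text = "Huvudstaden i Sverige är [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring token at the [MASK] position.
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```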