---
license: mit
language: de
widget:
- text: "In einer schockierenden Entdeckung fanden Wissenschaftler eine Herde Einhörner, die in einem abgelegenen, zuvor unerforschten Tal in den Anden lebten."
---
# Replication of [gpt2-wechsel-german](https://huggingface.co/benjamin/gpt2-wechsel-german)
- Trained with [BigScience's Megatron-DeepSpeed code base](https://github.com/bigscience-workshop/Megatron-DeepSpeed)
- Training time: 22 hours on 4x A100 GPUs (~80 TFLOP/s per GPU)
- Stopped after 100k steps
- Less than a single epoch on `oscar_unshuffled_deduplicated_de` (excluding the validation set; the original model was trained for 75 epochs on less data)
- Precision: bf16
- DeepSpeed ZeRO stage 1
- Tensor parallelism (tp) and pipeline parallelism (pp) both set to 1
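
The precision and ZeRO settings listed above could be expressed in a DeepSpeed configuration roughly as follows. This is a minimal sketch, not the configuration actually used for this model: the micro-batch size is a hypothetical placeholder (the model card does not state it), and in Megatron-style launchers the tensor and pipeline parallel sizes are typically passed as command-line arguments rather than set in this file.

```python
import json

# Sketch of a DeepSpeed config matching the settings listed in this card.
# Only "bf16" and "zero_optimization.stage" come from the card itself.
ds_config = {
    # Hypothetical placeholder: the batch size is not given in the model card.
    "train_micro_batch_size_per_gpu": 1,
    # bfloat16 mixed-precision training.
    "bf16": {"enabled": True},
    # ZeRO stage 1 partitions optimizer states across data-parallel ranks.
    "zero_optimization": {"stage": 1},
}

# Serialize to JSON, the format DeepSpeed reads from a config file.
ds_config_json = json.dumps(ds_config, indent=2)
print(ds_config_json)
```

The printed JSON can be written to a file and handed to the training launcher; every value other than the bf16 and ZeRO stage settings would need to be adapted to the actual run.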
## Evaluation
| Model | Perplexity (PPL) |
|---|---|
| `gpt2-wechsel-german-ds-meg` | **26.4** |
| `gpt2-wechsel-german` | 26.8 |
| `gpt2` (retrained from scratch) | 27.63 |
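
For readers unfamiliar with the metric, perplexity is the exponential of the mean negative log-likelihood per token, so lower is better. The sketch below shows that definition only; the model card does not describe the exact evaluation setup (dataset split, tokenization, context length), and the helper name `perplexity` and the example log-probabilities are made up for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical example: a model that assigns probability 0.25 to each of
# four tokens is as uncertain as a uniform choice among four options.
example_log_probs = [math.log(0.25)] * 4
example_ppl = perplexity(example_log_probs)
print(example_ppl)
```

In practice the log-probabilities would come from the language model's output on held-out text, and scores are only comparable between models that share the same tokenizer and evaluation data.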
## License
MIT