Add Swiss German adapter
- README.md +22 -2
- config.json +3 -2
- pytorch_model.bin +2 -2
README.md
CHANGED
@@ -5,6 +5,7 @@ language:
 - fr
 - it
 - rm
+- gsw
 - multilingual
 inference: false
 ---
@@ -19,6 +20,9 @@ In addition, we used a Switzerland-specific subword vocabulary.
 
 The pre-training code and usage examples are available [here](https://github.com/ZurichNLP/swissbert). We also release a version that was fine-tuned on named entity recognition (NER): https://huggingface.co/ZurichNLP/swissbert-ner
 
+## Update 2024-01: Support for Swiss German
+We added a Swiss German adapter to the model.
+
 ## Languages
 
 SwissBERT contains the following language adapters:
@@ -29,6 +33,7 @@ SwissBERT contains the following language adapters:
 | 1 | `fr_CH` | French |
 | 2 | `it_CH` | Italian |
 | 3 | `rm_CH` | Romansh Grischun |
+| 4 | `gsw` | Swiss German |
 
 ## License
 Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
@@ -87,6 +92,10 @@ SwissBERT is not designed for generating text.
 - Training data: German, French, Italian and Romansh documents in the [Swissdox@LiRI](https://t.uzh.ch/1hI) database, until 2022.
 - Training procedure: Masked language modeling
 
+The Swiss German adapter was trained on the following two datasets of written Swiss German:
+1. [SwissCrawl](https://icosys.ch/swisscrawl) ([Linder et al., LREC 2020](https://aclanthology.org/2020.lrec-1.329)), a collection of Swiss German web text (forum discussions, social media).
+2. A custom dataset of Swiss German tweets.
+
 ## Environmental Impact
 - Hardware type: RTX 2080 Ti.
 - Hours used: 10 epochs × 18 hours × 8 devices = 1440 hours
@@ -95,7 +104,7 @@ SwissBERT is not designed for generating text.
 - Carbon efficiency: 0.0016 kg CO2e/kWh ([source](https://t.uzh.ch/1rU))
 - Carbon emitted: 0.6 kg CO2e ([source](https://mlco2.github.io/impact#compute))
 
-## Citation
+## Citations
 ```bibtex
 @article{vamvas-etal-2023-swissbert,
 title={Swiss{BERT}: The Multilingual Language Model for Switzerland},
@@ -106,4 +115,15 @@ SwissBERT is not designed for generating text.
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2303.13310}
 }
-```
+```
+
+Swiss German adapter:
+```bibtex
+@inproceedings{vamvas-etal-2024-modular,
+title={Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect},
+author={Jannis Vamvas and No{\"e}mi Aepli and Rico Sennrich},
+booktitle={First Workshop on Modular and Open Multilingual NLP},
+year={2024},
+}
+```
+
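With the adapter merged, selecting it follows the usual X-MOD pattern. The sketch below is a minimal illustration, assuming the standard `transformers` X-MOD API (`set_default_language`) and the `gsw` adapter code from the table in the diff; the Swiss German sentence and the `<mask>` token are illustrative assumptions, not taken from the model card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# SwissBERT is an X-MOD model: one shared transformer plus per-language adapters.
tokenizer = AutoTokenizer.from_pretrained("ZurichNLP/swissbert")
model = AutoModelForMaskedLM.from_pretrained("ZurichNLP/swissbert")

# Activate the newly added Swiss German adapter ("gsw", per the adapter table).
model.set_default_language("gsw")

# Illustrative masked-LM query in written Swiss German (assumed <mask> token).
inputs = tokenizer("Wie gaht's dir hüt am <mask>?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the highest-scoring token at the mask position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```

Switching the argument of `set_default_language` to `de_CH`, `fr_CH`, `it_CH`, or `rm_CH` routes the same input through a different adapter without reloading the shared weights.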
config.json
CHANGED
@@ -18,7 +18,8 @@
     "de_CH",
     "fr_CH",
     "it_CH",
-    "rm_CH"
+    "rm_CH",
+    "gsw"
   ],
   "layer_norm_eps": 1e-05,
   "ln_before_adapter": true,
@@ -30,7 +31,7 @@
   "position_embedding_type": "absolute",
   "pre_norm": false,
   "torch_dtype": "float32",
-  "transformers_version": "4.
+  "transformers_version": "4.33.2",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50262
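After updating, it is worth checking that the downloaded config registers the new adapter. A small sketch, assuming the language list edited above is exposed as the `languages` attribute of the X-MOD config (the key name itself is cut off in this diff):

```python
from transformers import AutoConfig

# Load the updated config from the Hub (or point to a local checkout).
config = AutoConfig.from_pretrained("ZurichNLP/swissbert")

# Assumed attribute name: `languages`, the list edited in config.json above.
print(config.languages)  # expected: ['de_CH', 'fr_CH', 'it_CH', 'rm_CH', 'gsw']
assert "gsw" in config.languages, "Swiss German adapter is not registered"
```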
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3621abd43ac00e35367a180626eccb4091493178ed6f922fc78717e2a4c06fed
+size 640768013
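The pytorch_model.bin change is only a Git LFS pointer update: a new SHA-256 and size for the weights that now include the `gsw` adapter parameters. A minimal sketch for verifying a downloaded copy against the pointer; the local file path is hypothetical:

```python
import hashlib
import os

# Values taken from the new LFS pointer above.
EXPECTED_SHA256 = "3621abd43ac00e35367a180626eccb4091493178ed6f922fc78717e2a4c06fed"
EXPECTED_SIZE = 640768013

path = "pytorch_model.bin"  # hypothetical path to the downloaded weights

# Cheap size check first, then a chunked hash to keep memory use bounded.
assert os.path.getsize(path) == EXPECTED_SIZE, "size mismatch"
digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
assert digest.hexdigest() == EXPECTED_SHA256, "hash mismatch"
print("pytorch_model.bin matches its LFS pointer")
```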