Commit cc5c549 by jvamvas
1 parent: c0a8f01

Add Swiss German adapter

Files changed (3)
  1. README.md +22 -2
  2. config.json +3 -2
  3. pytorch_model.bin +2 -2
README.md CHANGED
@@ -5,6 +5,7 @@ language:
  - fr
  - it
  - rm
+ - gsw
  - multilingual
  inference: false
  ---
@@ -19,6 +20,9 @@ In addition, we used a Switzerland-specific subword vocabulary.

  The pre-training code and usage examples are available [here](https://github.com/ZurichNLP/swissbert). We also release a version that was fine-tuned on named entity recognition (NER): https://huggingface.co/ZurichNLP/swissbert-ner

+ ## Update 2024-01: Support for Swiss German
+ We added a Swiss German adapter to the model.
+
  ## Languages

  SwissBERT contains the following language adapters:
@@ -29,6 +33,7 @@ SwissBERT contains the following language adapters:
  | 1 | `fr_CH` | French |
  | 2 | `it_CH` | Italian |
  | 3 | `rm_CH` | Romansh Grischun |
+ | 4 | `gsw` | Swiss German |

  ## License
  Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
@@ -87,6 +92,10 @@ SwissBERT is not designed for generating text.
  - Training data: German, French, Italian and Romansh documents in the [Swissdox@LiRI](https://t.uzh.ch/1hI) database, until 2022.
  - Training procedure: Masked language modeling

+ The Swiss German adapter was trained on the following two datasets of written Swiss German:
+ 1. [SwissCrawl](https://icosys.ch/swisscrawl) ([Linder et al., LREC 2020](https://aclanthology.org/2020.lrec-1.329)), a collection of Swiss German web text (forum discussions, social media).
+ 2. A custom dataset of Swiss German tweets
+
  ## Environmental Impact
  - Hardware type: RTX 2080 Ti.
  - Hours used: 10 epochs × 18 hours × 8 devices = 1440 hours
@@ -95,7 +104,7 @@ SwissBERT is not designed for generating text.
  - Carbon efficiency: 0.0016 kg CO2e/kWh ([source](https://t.uzh.ch/1rU))
  - Carbon emitted: 0.6 kg CO2e ([source](https://mlco2.github.io/impact#compute))

- ## Citation
+ ## Citations
  ```bibtex
  @article{vamvas-etal-2023-swissbert,
  title={Swiss{BERT}: The Multilingual Language Model for Switzerland},
@@ -106,4 +115,15 @@ SwissBERT is not designed for generating text.
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2303.13310}
  }
- ```
+ ```
+
+ Swiss German adapter:
+ ```bibtex
+ @inproceedings{vamvas-etal-2024-modular,
+ title={Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect},
+ author={Jannis Vamvas and No{\"e}mi Aepli and Rico Sennrich},
+ booktitle={First Workshop on Modular and Open Multilingual NLP},
+ year={2024},
+ }
+ ```
+
config.json CHANGED
@@ -18,7 +18,8 @@
  "de_CH",
  "fr_CH",
  "it_CH",
- "rm_CH"
+ "rm_CH",
+ "gsw"
  ],
  "layer_norm_eps": 1e-05,
  "ln_before_adapter": true,
@@ -30,7 +31,7 @@
  "position_embedding_type": "absolute",
  "pre_norm": false,
  "torch_dtype": "float32",
- "transformers_version": "4.27.1",
+ "transformers_version": "4.33.2",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50262
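
As a rough illustration of what the `config.json` change means for downstream code, the sketch below parses a small fragment mirroring the updated values and looks up the position of the new adapter. The key name `languages` is an assumption based on the X-MOD configuration in Transformers, since the hunk starts below that line; `has_adapter` is a hypothetical helper and not part of the repository.

```python
import json

# Fragment mirroring the updated config.json. The "languages" key name is an
# assumption (taken from the Transformers X-MOD configuration); the diff hunk
# only shows the list entries, not the key itself.
config_text = """
{
  "languages": ["de_CH", "fr_CH", "it_CH", "rm_CH", "gsw"],
  "layer_norm_eps": 1e-05,
  "ln_before_adapter": true,
  "transformers_version": "4.33.2"
}
"""


def has_adapter(config: dict, language: str) -> bool:
    """Return True if `language` is listed among the adapter languages."""
    return language in config.get("languages", [])


config = json.loads(config_text)

# Position of the Swiss German adapter in the list of languages.
adapter_index = config["languages"].index("gsw")

print(has_adapter(config, "gsw"), adapter_index)
```

A check like this may help a caller fail early when an older revision of the model, without the `gsw` entry, has been loaded.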
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2f7329a691419166342728cda32d7e8a5fd72b83f6b290168d0b3057dd9c51eb
- size 612385785
+ oid sha256:3621abd43ac00e35367a180626eccb4091493178ed6f922fc78717e2a4c06fed
+ size 640768013
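
The two `size` fields in the Git LFS pointer give a quick sense of how much the checkpoint grew with the new adapter. The sketch below works through that arithmetic. It assumes the whole increase consists of `float32` weights (4 bytes each, matching `torch_dtype` in `config.json`) and ignores any serialization overhead, so the parameter count is only a rough upper estimate.

```python
# Sizes in bytes, copied from the old and new Git LFS pointers.
OLD_SIZE_BYTES = 612_385_785
NEW_SIZE_BYTES = 640_768_013

# Assumption: the added bytes are float32 tensors (4 bytes per value).
BYTES_PER_FLOAT32 = 4


def size_delta(old: int, new: int) -> int:
    """Return the growth of the checkpoint in bytes."""
    return new - old


delta_bytes = size_delta(OLD_SIZE_BYTES, NEW_SIZE_BYTES)
delta_mb = delta_bytes / 1_000_000  # decimal megabytes
approx_params = delta_bytes // BYTES_PER_FLOAT32

print(f"Checkpoint grew by {delta_bytes} bytes (~{delta_mb:.1f} MB)")
print(f"Rough estimate of added float32 parameters: {approx_params}")
```

Under these assumptions the checkpoint grew by about 28.4 MB, which corresponds to roughly 7.1 million additional parameters.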