ufal/byt5-small-multilexnorm2021-iden

Add multilingual to the language tag

#1

by lbourdois - opened Jan 7, 2023

base: refs/heads/main ← from: refs/pr/1

README.md +6 -6


@@ -2,14 +2,14 @@
 language:
 - id
 - en
 datasets:
 - mc4
 - wikipedia
 - multilexnorm
-tags:
-- lexical normalization
-license: apache-2.0
 ---
 # Fine-tuned ByT5-small for MultiLexNorm (Indonesian-English version)
@@ -23,14 +23,14 @@ Our system is based on [ByT5](https://arxiv.org/abs/2105.13626), which we first
 ## How to use
-The model was *not* fine-tuned in a standard sentence-to-sentence setting – instead, it was tailored to the token-to-token definition of MultiLexNorm data. Please refer to [**the interactive demo on Colab notebook**](https://colab.research.google.com/drive/1rxpI8IlKk-D2crFqi2hdzbTBIezqgsCg?usp=sharing) to learn how to use these models.
 ## How to cite
 ```bibtex
 @inproceedings{wnut-ufal,
-  title= "{ÚFAL} at {MultiLexNorm} 2021: Improving Multilingual Lexical Normalization by Fine-tuning {ByT5}",
   author = "Samuel, David and Straka, Milan",
   booktitle = "Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021)",
   year = "2021",

 language:
 - id
 - en
+- multilingual
+license: apache-2.0
+tags:
+- lexical normalization
 datasets:
 - mc4
 - wikipedia
 - multilexnorm
 ---
 # Fine-tuned ByT5-small for MultiLexNorm (Indonesian-English version)
 ## How to use
+The model was *not* fine-tuned in a standard sentence-to-sentence setting – instead, it was tailored to the token-to-token definition of MultiLexNorm data. Please refer to [**the interactive demo on Colab notebook**](https://colab.research.google.com/drive/1rxpI8IlKk-D2crFqi2hdzbTBIezqgsCg?usp=sharing) to learn how to use these models.
 ## How to cite
 ```bibtex
 @inproceedings{wnut-ufal,
+  title= "{ÚFAL} at {MultiLexNorm} 2021: Improving Multilingual Lexical Normalization by Fine-tuning {ByT5}",
   author = "Samuel, David and Straka, Milan",
   booktitle = "Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021)",
   year = "2021",