---
language:
- de
- en
- es
- fr
- it
- ja
- ru
- uk
- multilingual
license: cc-by-sa-4.0
tags:
- translation
---
# TakoMT
This is a translation model using Marian-NMT.
For more details, please see [my repository](https://github.com/s-taka/fugumt).
In addition to the data listed in the repository, I also used [ParaCrawl](https://paracrawl.eu/).
* source languages: de, en, es, fr, it, ru, uk
* target language: ja
### How to use
This model requires the `transformers` and `sentencepiece` packages.
```python
# Notebook cell syntax; when installing from a shell, drop the leading "!"
!pip install transformers sentencepiece
```
You can use this model directly with a pipeline:
```python
from transformers import pipeline
tako_translator = pipeline('translation', model='staka/takomt')
tako_translator('This is a cat.')
```
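The pipeline returns a list of dictionaries, one per input, each with a `translation_text` key. As a minimal sketch of translating several sentences at once, the helpers below wrap that call; `load_translator` and `translate_batch` are hypothetical names introduced here for illustration and are not part of this model or of `transformers`.

```python
from typing import Callable, List


def load_translator(model_name: str = "staka/takomt"):
    """Build the translation pipeline (downloads the model on first use)."""
    # Imported lazily so translate_batch can be reused with any compatible callable
    from transformers import pipeline
    return pipeline("translation", model=model_name)


def translate_batch(translator: Callable, texts: List[str]) -> List[str]:
    """Translate a list of sentences and return only the output strings."""
    # The translation pipeline returns [{'translation_text': ...}, ...] for list input
    results = translator(texts)
    return [item["translation_text"] for item in results]


# Example usage (hypothetical inputs):
# tako_translator = load_translator()
# translate_batch(tako_translator, ["This is a cat.", "Das ist ein Hund."])
```

Passing the translator in as an argument keeps the helper independent of how the pipeline was created, so the same function works with a pipeline moved to a GPU or configured with different generation settings.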
### Eval results
Evaluation results on 500 randomly selected sentences from [tatoeba](https://tatoeba.org/ja) are as follows:
|source |target |BLEU(*1)|
|-------|-------|--------|
|de |ja |27.8 |
|en |ja |28.4 |
|es |ja |32.0 |
|fr |ja |27.9 |
|it |ja |24.3 |
|ru |ja |27.3 |
|uk |ja |29.8 |
(*1) Computed with `sacrebleu --tokenize ja-mecab`.
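A comparable evaluation could be sketched as follows, using the `sacrebleu` Python API instead of the command line. This is not the script that produced the table: the sampling procedure, the seed, and the helper names `sample_pairs` and `bleu_ja` are assumptions made for illustration, so scores obtained this way will not necessarily match the reported numbers.

```python
import random
from typing import List, Sequence, Tuple


def sample_pairs(
    pairs: Sequence[Tuple[str, str]], k: int = 500, seed: int = 0
) -> List[Tuple[str, str]]:
    """Draw up to k (source, reference) pairs without replacement."""
    # Fixed seed (an assumption) so the same subset is drawn on every run
    rng = random.Random(seed)
    return rng.sample(list(pairs), min(k, len(pairs)))


def bleu_ja(hypotheses: List[str], references: List[str]) -> float:
    """Corpus-level BLEU with the ja-mecab tokenizer."""
    # Lazy import; ja-mecab needs the optional extra: pip install "sacrebleu[ja]"
    import sacrebleu
    # corpus_bleu expects a list of reference streams, hence [references]
    return sacrebleu.corpus_bleu(hypotheses, [references], tokenize="ja-mecab").score
```

To use it, sample the sentence pairs for one source language, translate the source side with the pipeline, and pass the translations and the Japanese references to `bleu_ja`.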