# English to Urhobo

Author: Iroro Orife

## Data

- The JW300 English-Urhobo dataset.

## Model

- Default Masakhane Transformer translation model.
- [Link to Google Drive folder with models](https://drive.google.com/open?id=1-0REUw5fg_Y13wrKgE9RFD_iljOsXykr)

## Analysis

The dataset needs further preprocessing to remove special characters and Scripture references (book names with chapter and verse numbers). Removing them should make the model more useful outside religious text translation.

Example 1
```text
	Source:     But freedom from what ?
	Reference:  Ẹkẹvuọvo , ẹdia vọ yen egbomọphẹ na che si ayen nu ?
	Hypothesis: ( 1 Pita 3 : 1 ) Ẹkẹvuọvo , die yen egbomọphẹ 
```

Example 2
```text
	Source:     Today he is serving at Bethel .
	Reference:  Nonẹna , ọ ga vwẹ Bẹtẹl .
	Hypothesis: Nonẹna , ọ ga vwẹ Bẹtẹl asaọkiephana .
```
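
The hypothesis in Example 1 begins with a spurious Scripture reference, `( 1 Pita 3 : 1 )`, which is the kind of artifact the extra preprocessing would target. Below is a minimal sketch of one possible cleaning step, not the preprocessing actually used for these models. It assumes the text is already tokenized as in the examples above and that references are parenthesised in a "book chapter : verse" shape; the `CITATION` pattern and the `strip_citations` helper are hypothetical names introduced for illustration.

```python
import re

# Assumed citation shape (not taken from the original pipeline):
#   "(" [book number] book-name chapter ":" verse [, or - more verses] ")"
# e.g. "( 1 Pita 3 : 1 )" in tokenized text.
CITATION = re.compile(
    r"\(\s*"                  # opening parenthesis
    r"(?:[1-3]\s+)?"          # optional book number, e.g. "1" in "1 Pita"
    r"[^\d()]+?"              # book name: anything except digits and parentheses
    r"\s+\d+\s*:\s*\d+"       # chapter : verse
    r"(?:\s*[-,]\s*\d+)*"     # optional verse range or list
    r"\s*\)"                  # closing parenthesis
)


def strip_citations(line: str) -> str:
    """Remove parenthesised Scripture references and collapse extra spaces."""
    cleaned = CITATION.sub(" ", line)
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(strip_citations("( 1 Pita 3 : 1 ) Ẹkẹvuọvo , die yen egbomọphẹ"))
```

Applied to both sides of the parallel corpus before training, a filter like this would drop the references while leaving ordinary parenthesised text (which lacks the `chapter : verse` pattern) untouched. Citations written in other formats would need additional patterns.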

## Results

Tokenization | BLEU dev | BLEU test
--- | --- | ---
BPE | 15.91 | 28.82
Word-level | 11.80 | 22.39