---
datasets:
- SLPG/Punjabi_Transliteration_Corpus
language:
- pa
metrics:
- bleu
- cer
library_name: fairseq
pipeline_tag: translation
tags:
- punjabi shahmukhi
- punjabi gurmukhi
- transliteration
- punjabi transliteration
- punjabi gur to shahmukhi
- transliteration system
- punjabi transliteration system
---

### Punjabi Gurmukhi to Shahmukhi Transliteration System
Our Punjabi transliteration systems are supervised, bidirectional neural machine translation (NMT) models trained on our unsupervised Punjabi corpus. They convert text between the Gurmukhi and Shahmukhi scripts. The Gurmukhi-to-Shahmukhi model achieves a BLEU score of 98.1 and 99.5% word-level accuracy, while the Shahmukhi-to-Gurmukhi model achieves a BLEU score of 87.7.

## Corpus Details
- **Total Sentences:** 6.3 million
- **Domains Covered:** Various domains, including CCAligned, CCMatrix, TED, QED, OPUS, TICO, Wikimedia, MultiCCAligned, EMILLE, IJCNLP, XLEnt, and ParaCrawl.
- **Test Corpus:** FLORES-101

## Model Details
- **BLEU Score:** 98.1
- **Word-level Accuracy:** 99.5%
- **Character Error Rate (CER):** 99.1%
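
For readers who want to score their own transliteration output, the following is a minimal sketch of how word-level accuracy and character error rate are commonly computed. It is not the evaluation script used for the numbers above: the function names are illustrative, CER is taken as character-level edit distance divided by reference length, and word-level accuracy is assumed to compare whitespace-separated words position by position.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total character edits divided by total reference characters (assumed definition)."""
    edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return edits / chars if chars else 0.0


def word_accuracy(references, hypotheses):
    """Fraction of reference words matched at the same position (assumed definition)."""
    correct = total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        total += len(ref_words)
        correct += sum(r == h for r, h in zip(ref_words, hyp_words))
    return correct / total if total else 0.0
```

Both functions take parallel lists of reference and system sentences, for example the FLORES-101 references and the corresponding model output. The position-wise word comparison relies on transliteration keeping the word count unchanged; an alignment-based comparison would be needed otherwise.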

You may also explore our [Shahmukhi-to-Gurmukhi model](https://huggingface.co/SLPG/Punjabi_Shahmukhi_to_Gurmukhi_Transliteration/), which achieves a BLEU score of 87.7.

## Usage
These resources are intended to facilitate research and development in the field of Punjabi
transliteration. They can be used to train new models or improve existing ones, enabling high-quality
transliteration between Gurmukhi and Shahmukhi scripts.

## Citation

**If you use our model, kindly cite our paper:**
```
@article{Shehzadi2024,
  title={Unsupervised Punjabi Corpus and Neural Machine Transliteration System},
  author={Shehzadi Ambreen and Sadaf Abdul Rauf and MG Abbas Malik and Muhammad Imran},
  journal={Heliyon},
  year={2024},
  note={Under review}
}
```