Update README.md
README.md
CHANGED
@@ -1,8 +1,18 @@
 ---
 license: mit
 ---
-**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures
-
+**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance.
+Please find more details in our [paper](https://arxiv.org/abs/2309.11674).
+```
+@misc{xu2023paradigm,
+title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
+author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
+year={2023},
+eprint={2309.11674},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
+```
 We release four translation models presented in the paper:
 - **ALMA-7B**: Full-weight Fine-tune LLaMA-2-7B on 20B monolingual tokens and then **Full-weight** fine-tune on human-written parallel data
 - **ALMA-7B-LoRA**: Full-weight Fine-tune LLaMA-2-7B on 20B monolingual tokens and then **LoRA** fine-tune on human-written parallel data