Update README.md
README.md CHANGED
@@ -9,6 +9,7 @@ datasets:
 - cis-lmu/Glot500
 ---
 
+# MaLA-500: Massive Language Adaptation of Large Language Models
 
 MaLA-500 is a novel large language model designed to cover an extensive range of 534 languages. This model builds upon LLaMA 2 7B and integrates continued pretraining with vocabulary extension, with an expanded vocabulary size of 260,164, and LoRA low-rank adaptation.
 
@@ -18,6 +19,8 @@ MaLA-500 is a novel large language model designed to cover an extensive range of
 - **Vocabulary Extension:** MaLA-500 boasts an extended vocabulary size of 260,164.
 - **Multilingual Proficiency:** Trained on Glot500-c, covering 534 languages.
 
+Please refer to [our paper](https://arxiv.org/pdf/2401.13303.pdf) for more details.
+
 ## How to Get Started with the Model
 
 Requirements:
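The get-started section in this diff stops at the requirements list, so no loading code is shown. As a minimal sketch, assuming the LoRA adapter and extended tokenizer are published on the Hugging Face Hub (the repository ID `MaLA-LM/mala-500` below is an assumption, not taken from this README), the model could be loaded with `transformers` and `peft` roughly as follows:

```python
# Hypothetical loading sketch: repository IDs and the exact steps are assumptions,
# not confirmed by this README diff.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Tokenizer carrying the extended 260,164-entry vocabulary (repo ID assumed).
tokenizer = AutoTokenizer.from_pretrained("MaLA-LM/mala-500")

# Base LLaMA 2 7B model; its embedding matrices must be resized to the
# extended vocabulary before the LoRA weights can be applied on top.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
base_model.resize_token_embeddings(260164)

# Attach the LoRA (low-rank adaptation) weights to the resized base model.
model = PeftModel.from_pretrained(base_model, "MaLA-LM/mala-500")

# Quick generation check.
inputs = tokenizer("MaLA-500 covers 534 languages.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```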