Commit b45f83d
Committed by: atsuki-yamaguchi
Parent(s): 81feac3
Upload README.md with huggingface_hub
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 base_model: meta-llama/Meta-Llama-3-8B
 library_name: transformers
 ---
-# Llama3 8B for Burmese: 100 target vocabulary size + Mean target vocabulary initialization +
+# Llama3 8B for Burmese: 100 target vocabulary size + Mean target vocabulary initialization + 2x2LS/MTP/512 training
 
 This model is built on top of Llama3 8B adapted for Burmese using 30K target language sentences sampled from CC-100.
 
@@ -14,7 +14,7 @@ This model is built on top of Llama3 8B adapted for Burmese using 30K target lan
 
 * **Vocabulary**: This model has an additional 100 target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding and LM head were initialized using Mean initialization.
-* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the
+* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the 2x2LS/MTP/512 strategies introduced in the paper.
 
 ## Model Description
 
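
The README in this commit says the embedding and LM head weights for the 100 added tokens were initialized using Mean initialization, but does not spell out the procedure. The following is a minimal sketch of one common variant, assuming each new token's vector is the average of the embeddings of the source-tokenizer subtokens it decomposes into. The helper name `mean_init`, the character-level toy tokenizer, and the 2-dimensional vectors are hypothetical and are not taken from the repository.

```python
from typing import Callable, Dict, List

Vector = List[float]


def mean_init(
    new_tokens: List[str],
    source_embeddings: Dict[str, Vector],
    source_tokenize: Callable[[str], List[str]],
) -> Dict[str, Vector]:
    """Return an initial vector for each new token (hypothetical helper).

    Assumption: each new token is split by the *source* tokenizer, and its
    vector is the element-wise mean of the resulting subtoken embeddings.
    """
    all_rows = list(source_embeddings.values())
    dim = len(all_rows[0])
    # Fallback for tokens with no known subtokens: mean of all source rows.
    global_mean = [sum(row[i] for row in all_rows) / len(all_rows) for i in range(dim)]

    initialized: Dict[str, Vector] = {}
    for token in new_tokens:
        rows = [
            source_embeddings[piece]
            for piece in source_tokenize(token)
            if piece in source_embeddings
        ]
        if not rows:
            initialized[token] = list(global_mean)
            continue
        initialized[token] = [sum(r[i] for r in rows) / len(rows) for i in range(dim)]
    return initialized


# Toy example: a 2-dimensional "embedding table" and a character-level tokenizer.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [2.0, 2.0]}
new_vectors = mean_init(["ab", "zz"], toy_embeddings, source_tokenize=list)
print(new_vectors["ab"])  # mean of the rows for "a" and "b"
print(new_vectors["zz"])  # no known subtokens, so the global mean is used
```

Under the same assumption, the procedure would be run twice, once over the input embedding matrix and once over the LM head, since the README states that both were initialized this way.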