---
language:
- jv
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
- sft
datasets:
- afrizalha/Gatra-2-Javanese
---
# Open models for indigenous Indonesian languages

<center>
    <img src="https://imgur.com/PutckEK.png" alt="Bakpia" width="500" height="250">
    <p><em>Bakpia is a family of open language models capable of responding in the Javanese language. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.</em></p>
    <p><em style="color: black; font-weight: bold;">Beta preview</em></p>
</center>
Bakpia V1 is a family of Javanese language models. It is fine-tuned from open base models on a large synthetic dataset of Krama Javanese, in which the prompts were generated by GPT-4o and the responses by Claude 3 Haiku.

This repository contains the 4bit version of Bakpia V1 9B.

| Version | Base Model | URL | Training |
|---------|------------|-----|----------|
| V1 0.5B | Qwen 2 0.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 1.5B | Qwen 2 1.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 9B | Gemma 2 9B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) | Batch = 16\*8, lr = 4e-5, linear schedule|

Training data is accessible [here](https://huggingface.co/datasets/afrizalha/Gatra-2-Javanese).
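
To inspect this data locally, it should be loadable with the standard `datasets` library; a minimal sketch, assuming the default `train` split:

```
from datasets import load_dataset

# Load the synthetic Krama Javanese instruction data
# (the "train" split name is an assumption).
dataset = load_dataset("afrizalha/Gatra-2-Javanese", split="train")
print(dataset[0])  # inspect one prompt/response pair
```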

## Version 1.0

This is the first version of Bakpia.

✨ Training
- 36K input-output pairs
- LoRA rank 64, alpha 128
- Rank-stabilized LoRA (rsLoRA); a configuration sketch follows below
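
The training script itself is not part of this card, but a minimal PEFT `LoraConfig` matching the hyperparameters above might look like the following sketch; the `target_modules` list is an assumption, not taken from the original setup:

```
from peft import LoraConfig

# Hypothetical reconstruction of the LoRA setup listed above:
# rank 64, alpha 128, with rank-stabilized scaling.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    use_rslora=True,  # rsLoRA: scales updates by lora_alpha / sqrt(r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not stated in the card
    task_type="CAUSAL_LM",
)
```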

✨ Features
- Single-turn QA across various domains.
- Ngoko Javanese is not currently supported.

## Generate with template
```
# Update transformers for Gemma 2 compatibility, and install accelerate and
# bitsandbytes for loading the 4bit model
!pip install -q git+https://github.com/huggingface/transformers.git
!pip install -q accelerate bitsandbytes

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-9B-Javanese-4bit")
# device_map="auto" places the quantized weights on the GPU; calling
# .to("cuda") on a 4bit bitsandbytes model is not supported.
model = AutoModelForCausalLM.from_pretrained(
    "afrizalha/Bakpia-V1-9B-Javanese-4bit",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# Gemma 2 chat template
template = """<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
"""

# Prompt: "How can I learn Javanese well?"
inputs = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([inputs], return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    streamer=TextStreamer(tokenizer),
    temperature=0.5,
    do_sample=True,
    use_cache=True,
)
```
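
`TextStreamer` prints tokens to stdout as they are generated. To capture the response as a string instead, drop the `streamer` argument and decode the new tokens; a minimal sketch:

```
# Decode only the newly generated tokens, skipping the echoed prompt.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```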

## Acknowledgments

- **Developed by:** Afrizal Hasbi Azizy
- **License:** Apache-2.0