|
--- |
|
language: |
|
- jv |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- qwen2 |
|
- trl |
|
- sft |
|
datasets: |
|
- afrizalha/Gatra-2-Javanese |
|
--- |
|
<h1 style="font-size: 36px; color: navy; font-family: Tahoma; text-align: center;">Open models for indigenous Indonesian languages</h1>
|
|
|
<center> |
|
<img src="https://imgur.com/PutckEK.png" alt="Bakpia" width="500" height="250"> |
|
<p><em>Bakpia is a family of open language models capable of responding in the Javanese language. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.</em></p>
|
<p><em style="color: black; font-weight: bold;">Beta preview</em></p> |
|
</center> |
|
Bakpia V1 is a family of Javanese language models. It is fine-tuned from openly available base models on a large synthetic dataset of Krama (polite register) Javanese, where the prompts were generated by GPT-4o and the responses by Claude 3 Haiku.
|
|
|
This repository contains the fp16 version of Bakpia V1 1.5B. |
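Because the weights in this repo are stored in fp16, passing `torch_dtype="auto"` to `from_pretrained` keeps them in half precision instead of upcasting to fp32. A minimal sketch:

```python
from transformers import AutoModelForCausalLM

# torch_dtype="auto" loads the weights in their stored dtype (fp16 for this repo)
model = AutoModelForCausalLM.from_pretrained(
    "afrizalha/Bakpia-V1-1.5B-Javanese",
    torch_dtype="auto",
)
```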
|
|
|
| Version | Base Model | URL | Training |
|---------|------------|-----|----------|
| V1 0.5B | Qwen 2 0.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule |
| V1 1.5B | Qwen 2 1.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese/) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule |
| V1 9B | Gemma 2 9B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) | Batch = 16\*8, lr = 4e-5, linear schedule |
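The `trl` and `sft` tags indicate supervised fine-tuning with TRL. As a hedged sketch of how the hyperparameters in the 1.5B row above might translate into training arguments (how 16\*8 splits into per-device batch size versus gradient accumulation steps is an assumption, not taken from the original training script):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the V1 1.5B run from the table above.
# The 16*8 split between batch size and accumulation is an assumption.
args = TrainingArguments(
    output_dir="bakpia-v1-1.5b-javanese",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    lr_scheduler_type="linear",
)
```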
|
|
|
The training data is available at [Gatra-2-Javanese](https://huggingface.co/datasets/afrizalha/Gatra-2-Javanese).
|
|
|
## Version 1.0 |
|
|
|
This is the first version of Bakpia. |
|
|
|
✨ Training |
|
- 36K input-output pairs |
|
- LoRA r/alpha = 64/128 (see the sketch after this list)
- Rank-stabilized LoRA
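A minimal `peft` sketch of the adapter settings listed above; the target modules are an assumption (typical attention projections for Qwen 2 models), not taken from the original training script:

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter settings listed above
lora_config = LoraConfig(
    r=64,                  # LoRA rank
    lora_alpha=128,        # LoRA alpha
    use_rslora=True,       # rank-stabilized LoRA scaling (alpha / sqrt(r))
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```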
|
|
|
✨ Features |
|
- Single-turn QA across various domains. |
|
- Ngoko Javanese not currently supported. |
|
|
|
## Generate with template |
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

# Load the tokenizer and model, then move the model to the GPU
tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model = AutoModelForCausalLM.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model.to("cuda")

# ChatML prompt template inherited from Qwen 2 Instruct (empty system prompt)
template = """<|im_start|>system
<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

# Prompt: "How can I learn the Javanese language well?"
prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")  # renamed to avoid shadowing the `input` builtin

# Sample up to 1024 new tokens, streaming them to stdout as they are generated
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    streamer=TextStreamer(tokenizer),
    temperature=0.5,
    do_sample=True,
    use_cache=False,
)
```
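Since the model inherits Qwen 2's ChatML format, the prompt can likely also be built with the tokenizer's built-in chat template instead of the manual string above. A minimal sketch, reusing the `tokenizer` and `model` loaded above and assuming the repository ships Qwen 2's default template:

```python
# Sketch: same generation as above, via the tokenizer's chat template.
# Assumes the repo ships Qwen 2's default ChatML chat template.
messages = [{"role": "user", "content": "Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # appends the "<|im_start|>assistant" header
    return_tensors="pt",
).to("cuda")
outputs = model.generate(input_ids, max_new_tokens=1024, temperature=0.5, do_sample=True)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```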
|
|
|
## Acknowledgments |
|
|
|
- **Developed by:** Afrizal Hasbi Azizy |
|
- **License:** Apache-2.0 |