---
language:
- jv
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
- sft
datasets:
- afrizalha/Gatra-2-Javanese
---
Open models for indigenous Indonesian languages
Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.
Beta preview
This repository contains the 4bit version of Bakpia V1 9B.
| Version | Base Model | URL | Training |
|---|---|---|---|
| V1 0.5B | Qwen 2 0.5B Instruct | fp16 | Epoch = 1, Batch = 16*8, lr = 5e-5, linear schedule |
| V1 1.5B | Qwen 2 1.5B Instruct | fp16 | Epoch = 1, Batch = 16*8, lr = 5e-5, linear schedule |
| V1 9B | Gemma 2 9B Instruct | fp16/4bit | Batch size = 16*8, lr = 4e-5, linear schedule |
Training data is accessible in the afrizalha/Gatra-2-Javanese dataset.
Version 1.0
This is the first version of Bakpia.
✨ Training
- 36K input-output pairs
- 64/128 lora r/alpha
- Rank-stabilized lora
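The adapter settings above (r = 64, alpha = 128, rank-stabilized LoRA) could be expressed as a `peft` config roughly like the sketch below. The `target_modules` list is an assumption (a common choice for Gemma-style attention layers); the card does not state which modules were adapted.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter config described above.
# target_modules is an assumption, not confirmed by the model card.
lora_config = LoraConfig(
    r=64,                 # LoRA rank, as listed in the training notes
    lora_alpha=128,       # alpha = 2 * r, as listed
    use_rslora=True,      # rank-stabilized LoRA: scales updates by alpha / sqrt(r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```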
✨ Features
- Single-turn QA across various domains.
- Ngoko Javanese is not currently supported.
Generate with template
```python
# Update transformers for Gemma 2 compatibility + install accelerate and bitsandbytes for loading the 4bit model
!pip install -q git+https://github.com/huggingface/transformers.git
!pip install -q accelerate bitsandbytes

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-9B-Javanese-4bit")
model = AutoModelForCausalLM.from_pretrained(
    "afrizalha/Bakpia-V1-9B-Javanese-4bit",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",  # bitsandbytes places the 4bit weights on GPU; do not call .to("cuda") on the model
)

template = """<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(
    [template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")],
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    streamer=TextStreamer(tokenizer),
    temperature=0.5,
    do_sample=True,
    use_cache=True,
)
```
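The manual template above follows Gemma's chat format: an opened and closed user turn, then an open model turn for generation to continue from. As a quick sanity check (pure Python, no model download needed), the formatted prompt can be inspected directly:

```python
template = """<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
"""

prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")

# The prompt must open a user turn, close it, and leave the model turn open
# so the model generates the assistant reply.
assert prompt.startswith("<start_of_turn>user\n")
assert "<end_of_turn>\n<start_of_turn>model\n" in prompt
print(prompt)
```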
Acknowledgments
- Developed by: Afrizal Hasbi Azizy
- License: Apache-2.0