⭐My custom LLM 13B⭐
Model Details
Model Developers
- Kyujin Han (kyujinpy)
Model Architecture
- My custom LLM 13B is an auto-regressive language model based on the LLaMA2 transformer architecture.
Base Model
Training Dataset
Model comparisons
Ko-LLM leaderboard (as of 11/27)
Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
---|---|---|---|---|---|---|
⭐My custom LLM 13B-v1⭐ | 50.19 | 45.99 | 56.93 | 41.78 | 41.66 | 64.58 |
⭐My custom LLM 13B-v2⭐ | 48.28 | 45.73 | 56.97 | 38.77 | 38.75 | 61.16 |
⭐My custom LLM 13B-v4⭐ | 49.89 | 45.05 | 57.06 | 41.83 | 42.93 | 62.57 |
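The Average column appears to be the plain arithmetic mean of the five benchmark scores. The leaderboard's exact formula is not stated here, so this is a minimal check under that assumption. The names `scores`, `reported_average`, and `mean_score` are illustrative only.

```python
# Benchmark scores copied from the table above, in column order:
# Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, Ko-CommonGen V2
scores = {
    "13B-v1": [45.99, 56.93, 41.78, 41.66, 64.58],
    "13B-v2": [45.73, 56.97, 38.77, 38.75, 61.16],
    "13B-v4": [45.05, 57.06, 41.83, 42.93, 62.57],
}

# Values from the Average column of the same table.
reported_average = {"13B-v1": 50.19, "13B-v2": 48.28, "13B-v4": 49.89}


def mean_score(values):
    """Arithmetic mean rounded to two decimals (assumed leaderboard rule)."""
    return round(sum(values) / len(values), 2)


for name, values in scores.items():
    print(f"{name}: computed={mean_score(values)} reported={reported_average[name]}")
```

Under this assumption, the computed means match the reported averages for all three versions.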
Model comparisons 2
AI-Harness evaluation
Model | Copa (0-shot) | Copa (5-shot) | HellaSwag (0-shot) | HellaSwag (5-shot) | BoolQ (0-shot) | BoolQ (5-shot) | Sentineg (0-shot) | Sentineg (5-shot) |
---|---|---|---|---|---|---|---|---|
⭐My custom LLM 13B-v1⭐ | 0.7987 | 0.8269 | 0.4994 | 0.5660 | 0.3343 | 0.5060 | 0.6984 | 0.9723 |
⭐My custom LLM 13B-v2⭐ | 0.7938 | 0.8209 | 0.4978 | 0.4893 | 0.3343 | 0.5614 | 0.6283 | 0.9773 |
⭐My custom LLM 13B-v4⭐ | 0.7988 | 0.8279 | 0.4995 | 0.4953 | 0.3343 | 0.3558 | 0.7825 | 0.9698 |
beomi/llama-2-koen-13b | 0.7768 | 0.8128 | 0.4999 | 0.5127 | 0.3988 | 0.7038 | 0.5870 | 0.9748 |
Implementation Code
# Load the model and tokenizer from the Hugging Face Hub
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "PracticeLLM/Custom-KoLLM-13B-v4"
model = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,  # load weights in half precision
    device_map='auto'  # place layers on available devices (requires accelerate)
)
tokenizer = AutoTokenizer.from_pretrained(repo)
Hyperparameters
- learning_rate: 4e-4
- batch_size: 16
- epoch: 1
- lora_target_modules: [gate_proj, down_proj, up_proj, q_proj, k_proj, v_proj, o_proj]
- cutoff_len: 4096
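The `lora_target_modules` list covers all seven linear projections of a LLaMA-style decoder block (four in attention, three in the MLP). The sketch below estimates how many trainable parameters LoRA would add for such a target list. Both the LoRA rank and the layer dimensions are assumptions, not values from this card: the rank is not listed above, so `rank=8` is purely hypothetical, and the dimensions are those of the standard LLaMA-2 13B configuration.

```python
# Assumed LLaMA-2 13B dimensions (not stated in this card).
HIDDEN = 5120
INTERMEDIATE = 13824
NUM_LAYERS = 40

# (in_features, out_features) of each linear projection in one decoder block.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Target modules as listed in the hyperparameters above.
lora_target_modules = [
    "gate_proj", "down_proj", "up_proj",
    "q_proj", "k_proj", "v_proj", "o_proj",
]


def lora_trainable_params(rank, targets):
    """LoRA adds A (rank x in) and B (out x rank) per adapted module."""
    per_layer = sum(
        rank * (MODULE_SHAPES[name][0] + MODULE_SHAPES[name][1])
        for name in targets
    )
    return per_layer * NUM_LAYERS


# rank=8 is a hypothetical value chosen only for illustration.
print(lora_trainable_params(rank=8, targets=lora_target_modules))
```

The count scales linearly with the rank, so doubling the rank doubles the number of adapter parameters.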