metadata
license: apache-2.0
Model
base_model : beomi/OPEN-SOLAR-KO-10.7B
Dataset
- Collected from publicly available data
- Deduplicated using the algorithm from "Deduplicating Training Data Makes Language Models Better"
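The card does not say which deduplication variant or settings were applied, so the following is only a minimal sketch of the general idea of near-duplicate filtering. It compares word n-gram sets with exact Jaccard similarity rather than the approximate hashing a large corpus would need. The function names, the `n=5` shingle size, and the `0.8` threshold are illustrative assumptions, not details of this model's pipeline.

```python
from typing import List, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 5) -> Set[Shingle]:
    """Return the set of word n-grams in a whitespace-tokenized text."""
    tokens = text.split()
    if len(tokens) < n:
        # Texts shorter than n tokens are represented by one shingle.
        return {tuple(tokens)}
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_near_duplicates(
    docs: List[str], n: int = 5, threshold: float = 0.8
) -> List[str]:
    """Keep a document only if it is below the threshold against all kept ones.

    n and threshold are hypothetical defaults chosen for illustration.
    """
    kept: List[str] = []
    kept_shingles: List[Set[Shingle]] = []
    for doc in docs:
        current = shingles(doc, n)
        if all(jaccard(current, seen) < threshold for seen in kept_shingles):
            kept.append(doc)
            kept_shingles.append(current)
    return kept


docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",
    "an entirely different sentence about language models",
]
unique_docs = drop_near_duplicates(docs, n=3)
```

This pairwise comparison is quadratic in the number of documents, so it only suits small samples; at corpus scale an approximate method such as MinHash or a suffix-array-based exact substring search would be the usual substitute.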
Code
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Hugging Face Hub repository ID of this model
model_name = "jingyeom/SOLAR_KO_1.3_deup"

# Load the model weights and the matching tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
Benchmark
Ko-LLM-Leaderboard (ranked 11th on the leaderboard as of 24.01.29)
Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
---|---|---|---|---|---|
53.63 | 52.65 | 60.92 | 50.9 | 45.14 | 58.56 |
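
As a quick consistency check, the Average column can be reproduced from the five task scores, assuming it is their unweighted mean (the card does not state how it is computed). The dictionary below just restates the table row.

```python
# Task scores copied from the benchmark table above.
scores = {
    "Ko-ARC": 52.65,
    "Ko-HellaSwag": 60.92,
    "Ko-MMLU": 50.9,
    "Ko-TruthfulQA": 45.14,
    "Ko-CommonGen V2": 58.56,
}

# Unweighted mean, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 53.63
```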