Update!

  • [2024.10.08] The Bllossom-3B model was released for the first time.

Bllossom | Demo | Homepage | Github

Our Bllossom team is releasing the Bllossom-3B model.
llama3.2-3B is out, but it does not include Korean?? Bllossom-3B takes that base model, which does not support Korean, and strengthens it into a Korean-English model.
 - It was further pretrained with 100% full tuning on 150GB of refined Korean data. (We burned a lot of GPU.)
 - It went through carefully curated instruction tuning.
 - It is a fully bilingual model whose English performance has not been degraded at all.
 - It recorded the highest LogicKor score among models of 5B parameters or fewer, scoring in the low 6s.
 - Only instruction tuning was performed. Try tuning it further with DPO or other methods that raise performance.
 - We did not use answer data or train against benchmarks such as MT-Bench or LogicKor to obtain good scores. (Training that targets these benchmarks can reach a score of 8...)
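
The list suggests trying DPO on top of the instruction-tuned checkpoint. As a rough illustration of what that objective computes for one preference pair, here is a minimal sketch in plain Python. It is not the Bllossom team's training code: the function name `dpo_loss`, the default `beta`, and the example log-probabilities are all assumptions made for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin))."""
    # How much more the policy likes each response than the frozen reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) == log(1 + exp(-x)); log1p keeps small values accurate.
    return math.log1p(math.exp(-margin))

# Hypothetical sequence log-probabilities for one (chosen, rejected) pair.
same = dpo_loss(-12.0, -15.0, -12.0, -15.0)    # policy identical to the reference
better = dpo_loss(-10.0, -18.0, -12.0, -15.0)  # policy favors the chosen answer more
print(same, better)
```

The loss falls as the policy raises the chosen response and lowers the rejected one relative to the reference model. In practice the log-probabilities would come from the model being tuned and a frozen copy of it.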

As always, this model is available for commercial use.

1. Bllossom has been presented at AAAI2024, NAACL2024, and LREC-COLING2024 (oral).
2. We will keep releasing better language models!! Anyone interested in joint research on strengthening Korean (especially on papers) is always welcome!!

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = 'Bllossom/llama-3.2-Korean-Bllossom-3B'

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Korean prompt. In English: "Cheolsu had 20 pencils. If Younghee took half and
# Minsu took the remaining 5, how many pencils does Cheolsu have left?"
instruction = "철수가 20개의 연필을 가지고 있었는데 영희가 절반을 가져가고 민수가 남은 5개를 가져갔으면 철수에게 남은 연필의 갯수는 몇개인가요?"

messages = [
    {"role": "user", "content": instruction}
]

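# Render the conversation with the model's chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop generating at either the end-of-text or the end-of-turn token.
terminators = [
    tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]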

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9
)

print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))

Example output (translated from Korean):

Cheolsu had 20 pencils and Younghee took half, so the number of pencils Younghee took is 20 / 2 = 10.

Now let's calculate how many pencils Cheolsu has left. After Younghee took 10, Cheolsu has 20 - 10 = 10 pencils left.

Minsu took the remaining 5, so Cheolsu has 10 - 5 = 5 pencils left.

Therefore, Cheolsu has 5 pencils left.
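
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, which is why the example decodes only `outputs[0][input_ids.shape[-1]:]`. The slicing can be illustrated without loading the model. In this sketch the token ids are made up and `strip_prompt` is a hypothetical helper, not part of transformers.

```python
def strip_prompt(output_ids, prompt_len, stop_ids):
    """Return only the newly generated token ids, cut at the first stop token."""
    # Same slice as outputs[0][input_ids.shape[-1]:] in the example.
    new_tokens = output_ids[prompt_len:]
    for i, tok in enumerate(new_tokens):
        if tok in stop_ids:
            return new_tokens[:i]  # drop the terminator and anything after it
    return new_tokens

prompt_ids = [101, 7, 8, 9]                 # made-up prompt token ids
output_ids = prompt_ids + [42, 43, 44, 2]   # prompt echoed back, then new tokens
stop_ids = {2, 3}                           # stand-ins for <|end_of_text|> and <|eot_id|>

completion = strip_prompt(output_ids, len(prompt_ids), stop_ids)
print(completion)
```

In the real example the terminator is not cut by hand. Passing `skip_special_tokens=True` to `tokenizer.decode` removes special tokens from the decoded text.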

Supported by

  • AICA

Citation

Language Model

@misc{bllossom,
  author = {ChangSu Choi and Yongbin Jeong and Seoyoon Park and InHo Won and HyeonSeok Lim and SangMin Kim and Yejee Kang and Chanhyuk Yoon and Jaewan Park and Yiseul Lee and HyeJin Lee and Younggyun Hahm and Hansaem Kim and KyungTae Lim},
  title = {Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean},
  year = {2024},
  journal = {LREC-COLING 2024},
  paperLink = {\url{https://arxiv.org/pdf/2403.10882}},
}

Vision-Language Model

@misc{bllossom-V,
  author = {Dongjae Shin and Hyunseok Lim and Inho Won and Changsu Choi and Minjun Kim and Seungwoo Song and Hangyeol Yoo and Sangmin Kim and Kyungtae Lim},
  title = {X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment},
  year = {2024},
  publisher = {GitHub},
  journal = {NAACL 2024 findings},
  paperLink = {\url{https://arxiv.org/pdf/2403.11399}},
}
