---
language:
- ko
- en
license: cc-by-nc-sa-4.0
library_name: transformers
---
# Llama3-Chat_Vector-kor_Instruct
I have implemented a Korean Llama 3 model by applying the chat vector method from the Chat Vector paper (https://arxiv.org/abs/2310.04799) to the models created by Beomi; the weight arithmetic is sketched after the reference list below.
Reference Models:
- [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
- [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
- [beomi/Llama-3-KoEn-8B](https://huggingface.co/beomi/Llama-3-KoEn-8B)
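For reference, the chat vector recipe is plain weight arithmetic: subtract Meta-Llama-3-8B from Meta-Llama-3-8B-Instruct to isolate the instruction-following direction, then add that difference onto the Korean base beomi/Llama-3-KoEn-8B. The sketch below illustrates the idea; it is not the exact script used for this checkpoint, and skipping the token embeddings and LM head (a common convention in chat-vector merges) as well as the output directory name are assumptions of mine.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the three reference models; all share the Llama 3 architecture.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
)
inst = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16
)
ko = AutoModelForCausalLM.from_pretrained(
    "beomi/Llama-3-KoEn-8B", torch_dtype=torch.bfloat16
)

# Assumption: leave token embeddings and the LM head untouched.
SKIP = ("embed_tokens", "lm_head")

with torch.no_grad():
    for name, p_ko in ko.named_parameters():
        if any(key in name for key in SKIP):
            continue
        # Chat vector: (Instruct - Base), added onto the Korean weights.
        p_ko.add_(inst.get_parameter(name) - base.get_parameter(name))

ko.save_pretrained("Llama3-Chat_Vector-kor")  # hypothetical output path
```

Note that this holds three 8B models in memory at once; a production script would typically process the weights shard by shard.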
## Citation

```bibtex
@misc{Llama3-Chat_Vector-kor_Instruct,
  author    = {nebchi},
  title     = {Llama3-Chat_Vector-kor_Instruct},
  year      = {2024},
  url       = {https://huggingface.co/nebchi/Llama3-Chat_Vector-kor_llava},
  publisher = {Hugging Face}
}
```
## Running the model on GPU
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained(
    "nebchi/Llama3-Chat_Vector-kor",
)
model = AutoModelForCausalLM.from_pretrained(
    "nebchi/Llama3-Chat_Vector-kor",
    torch_dtype=torch.bfloat16,
    device_map='auto',
)

# Stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)

messages = [
    # "You are an AI assistant. Answer questions kindly and accurately."
    {"role": "system", "content": "당신은 인공지능 어시스턴트입니다. 묻는 말에 친절하고 정확하게 답변하세요."},
    # "Tell me about the capital of South Korea."
    {"role": "user", "content": "대한민국의 수도에 대해 알려줘"},
]

# Build the Llama 3 chat prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Llama 3 ends assistant turns with <|eot_id|>, so stop on either token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=False,          # greedy decoding for reproducible output
    repetition_penalty=1.05,
    streamer=streamer,
)

# Decode only the newly generated tokens, stripping the prompt.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
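The model can also be run through the high-level `pipeline` API. This is a minimal sketch, assuming a `transformers` release recent enough to accept chat-style message lists in the text-generation pipeline; `max_new_tokens=256` is an arbitrary choice.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="nebchi/Llama3-Chat_Vector-kor",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input; the pipeline applies the chat template internally.
messages = [{"role": "user", "content": "대한민국의 수도에 대해 알려줘"}]  # "Tell me about the capital of South Korea."
out = pipe(messages, max_new_tokens=256)

# The returned conversation includes the generated assistant turn last.
print(out[0]["generated_text"][-1]["content"])
```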
## Results
```
대한민국의 수도는 서울특별시입니다.
서울특별시에는 청와대, 국회의사당, 대법원 등 대한민국의 주요 정부기관이 위치해 있습니다.
또한 서울시는 대한민국의 경제, 문화, 교육, 교통의 중심지로써 대한민국의 수도이자 대표 도시입니다. 제가 도움이 되었길 바랍니다. 더 궁금한 점이 있으시면 언제든지 물어보세요!
```

(Translation: "The capital of South Korea is Seoul. Major government institutions of the Republic of Korea, such as the Blue House, the National Assembly building, and the Supreme Court, are located in Seoul. Seoul is also the center of the country's economy, culture, education, and transportation, making it both the capital and the nation's representative city. I hope this was helpful. Feel free to ask anytime if you have further questions!")