T3Q-LLM-sft1.0-dpo1.0

This model is a version of T3Q-LLM/T3Q-LLM-solar10.8-sft-v1.0 that has been fine-tuned with DPO.

Model Developers: Chihoon Lee (chihoonlee10), T3Q

Prompt Template

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:
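
The template above can also be filled in programmatically. The following is a minimal sketch; the names `SYSTEM_PROMPT` and `build_prompt` are illustrative and are not part of the model repository. It assumes a single-turn exchange and keeps the trailing newline after `Assistant:` that the usage code in this card uses.

```python
# Illustrative helper (hypothetical names, not shipped with the model).
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(user_message: str) -> str:
    """Wrap a single user message in the Human/Assistant template."""
    # An f-string is used so that braces inside the user message are left untouched.
    return f"{SYSTEM_PROMPT}\nHuman: {user_message}\nAssistant:\n"


if __name__ == "__main__":
    print(build_prompt("What is the capital of Korea?"))
```

The returned string can be passed directly to the tokenizer in place of `prompt_template.format(prompt=text)`.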

How to Use It

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the DPO-tuned model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")
tokenizer = AutoTokenizer.from_pretrained("T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0")

# Same template as in the Prompt Template section, with a trailing newline after "Assistant:".
prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"
# Korean multiple-choice question: "What is the capital of Korea? Please choose from
# the options below. (A) Gyeongseong (B) Busan (C) Pyongyang (D) Seoul (E) Jeonju"
text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주'
model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')

# Generate up to 256 new tokens; the decoded text contains the prompt followed by the answer.
outputs = model.generate(**model_inputs, max_new_tokens=256)
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(output_text)

Example Output

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
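Human: 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.

(A) 경성
(B) 부산
(C) 평양
(D) 서울
(E) 전주
Assistant:
(D) 서울이 한국의 수도입니다. 서울은 나라의 북동부에 위치해 있으며, 정치, 경제, 문화의 중심지입니다. 약 1,000만 명이 넘는 인구를 가진 세계에서 가장 큰 도시 중 하나입니다. 서울은 높은 빌딩, 현대적인 인프라, 활기 문화 장면으로 유명합니다. 또한, 많은 역사적 명소와 박물관이 있어 방문객들에게 풍부한 문화 체험을 제공합니다.

In English, the question reads "What is the capital of Korea? Please choose from the options below." with the options (A) Gyeongseong, (B) Busan, (C) Pyongyang, (D) Seoul, and (E) Jeonju. The model answers: "(D) Seoul is the capital of Korea. Seoul is located in the northeastern part of the country and is the center of politics, economy, and culture. With a population of over about 10 million, it is one of the largest cities in the world. Seoul is famous for its tall buildings, modern infrastructure, and lively cultural scene. It also has many historical sites and museums, offering visitors a rich cultural experience."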
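
Because `generate` returns the prompt tokens followed by the newly generated tokens, the decoded text above repeats the whole template. A minimal sketch for keeping only the reply, assuming the reply is everything after the last `Assistant:` marker (the helper name `extract_reply` is hypothetical):

```python
def extract_reply(decoded_text: str, marker: str = "Assistant:") -> str:
    """Return the text after the last occurrence of `marker`, stripped.

    Falls back to the full (stripped) text if the marker is absent.
    """
    _, found, reply = decoded_text.rpartition(marker)
    return reply.strip() if found else decoded_text.strip()


if __name__ == "__main__":
    sample = "A chat ...\nHuman: question\nAssistant:\n(D) answer"
    print(extract_reply(sample))
```

Splitting on the last marker misfires if the reply itself contains `Assistant:`. Slicing the output tensor by the prompt's token length before decoding avoids that case.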

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| kobest_boolq | 0 | acc | 0.9387 | ± 0.0064 |
| | | macro_f1 | 0.9387 | ± 0.0064 |
| kobest_copa | 0 | acc | 0.7590 | ± 0.0135 |
| | | macro_f1 | 0.7585 | ± 0.0135 |
| kobest_hellaswag | 0 | acc | 0.5080 | ± 0.0224 |
| | | acc_norm | 0.5580 | ± 0.0222 |
| | | macro_f1 | 0.5049 | ± 0.0224 |
| kobest_sentineg | 0 | acc | 0.8489 | ± 0.0180 |
| | | macro_f1 | 0.8483 | ± 0.0180 |

hf-causal-experimental (pretrained=nlpai-lab/KULLM3,use_accelerate=true,trust_remote_code=true), limit: None, provide_description: False, num_fewshot: 0, batch_size: 8

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| kobest_boolq | 0 | acc | 0.8896 | ± 0.0084 |
| | | macro_f1 | 0.8888 | ± 0.0084 |
| kobest_copa | 0 | acc | 0.6930 | ± 0.0146 |
| | | macro_f1 | 0.6925 | ± 0.0147 |
| kobest_hellaswag | 0 | acc | 0.4640 | ± 0.0223 |
| | | acc_norm | 0.5240 | ± 0.0224 |
| | | macro_f1 | 0.4612 | ± 0.0223 |
| kobest_sentineg | 0 | acc | 0.6297 | ± 0.0243 |
| | | macro_f1 | 0.6255 | ± 0.0244 |
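
To make the comparison between the two tables concrete, the sketch below computes the per-task accuracy gap. It assumes the first table reports T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0 and the second reports nlpai-lab/KULLM3 (the first table carries no label in this card). The values are copied from the `acc` rows above, and the variable and function names are illustrative.

```python
# Accuracy (acc) values copied from the two tables above.
# Assumption: the first table belongs to T3Q-LLM-sft1.0-dpo1.0,
# the second to nlpai-lab/KULLM3.
t3q_acc = {
    "kobest_boolq": 0.9387,
    "kobest_copa": 0.7590,
    "kobest_hellaswag": 0.5080,
    "kobest_sentineg": 0.8489,
}
kullm3_acc = {
    "kobest_boolq": 0.8896,
    "kobest_copa": 0.6930,
    "kobest_hellaswag": 0.4640,
    "kobest_sentineg": 0.6297,
}


def accuracy_gaps(a: dict, b: dict) -> dict:
    """Return a[task] - b[task] for every task in `a`, rounded to 4 decimals."""
    return {task: round(a[task] - b[task], 4) for task in a}


if __name__ == "__main__":
    for task, gap in accuracy_gaps(t3q_acc, kullm3_acc).items():
        print(f"{task}: {gap:+.4f}")
```

Under that assumption, the largest gap is on kobest_sentineg and the smallest on kobest_hellaswag.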

Model size: 10.8B params (Safetensors). Tensor type: BF16.
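
As a rough guide to hardware requirements, the weight memory follows from the parameter count and tensor type listed above, since BF16 stores each parameter in 2 bytes. This back-of-the-envelope sketch covers the weights only; activations and the KV cache add to it. The function name is illustrative.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 10.8B parameters stored in BF16 (2 bytes each).
    print(f"{weight_memory_gb(10.8e9):.1f} GB")
```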