nhn_dpo_v3_nox-solar-10.7b-v4_DPO

Our Team

Youjin Chung
Jingyeom Kim

Model

Base Model

davidkim205/nox-solar-10.7b-v4

Hardware and Software

Hardware: 8 × A100 GPUs for training
Software: DeepSpeed library and Hugging Face TRL Trainer

Dataset

DPO_dataset
- In-house DPO dataset (built from AI-hub datasets)
- Translations of English datasets such as OpenOrca DPO (translated with our in-house model, ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024)
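
For readers unfamiliar with DPO data, the sketch below shows the usual shape of one preference record (`prompt`, `chosen`, `rejected`, the column names the TRL DPO trainer conventionally expects) and the standard DPO loss for a single pair. The sample text, the log-probability values, and `beta=0.1` are illustrative assumptions, not values taken from this model's training run.

```python
import math

# Hypothetical preference record; the text is invented for illustration only.
example_record = {
    "prompt": "Explain what a preference dataset is.",
    "chosen": "It pairs each prompt with a preferred and a less preferred answer.",
    "rejected": "I don't know.",
}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference model.
    beta is an assumed value, not the authors' setting.
    """
    # How much more the policy prefers "chosen" over "rejected"
    # compared with the reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin
    # -log(sigmoid(x)) written as a numerically stable softplus(-x).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0 and the loss is log(2).
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
    # Policy favors the chosen answer more than the reference does: lower loss.
    print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

The loss falls as the policy widens the chosen-versus-rejected gap relative to the reference model, which is the behavior a DPO trainer optimizes over the whole dataset.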

Training Method

Benchmark

Ko LM Eval Harness

0-shot (macro F1)

kobest_boolq	kobest_copa	kobest_hellaswag	kobest_sentineg
0.931613	0.740751	0.468602	0.488465
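
The scores above are macro F1 values. As a reminder of what that metric measures, here is a minimal pure-Python sketch: F1 is computed per class and then averaged with equal weight. The toy labels are made up, and the evaluation harness itself may compute the metric through a library rather than this way.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy binary example (hypothetical labels, not benchmark data).
    gold = [1, 1, 0, 0]
    pred = [1, 0, 0, 0]
    print(round(macro_f1(gold, pred), 4))  # class 0: 0.8, class 1: 0.6667
```

Because each class counts equally, a model that always predicts the majority class scores poorly on macro F1 even when its raw accuracy looks acceptable.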

Safetensors

Model size

10.7B params

Tensor type

BF16
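
Given the parameter count and tensor type listed above, the following sketch estimates the memory needed for the weights alone and shows one way the checkpoint might be loaded in BF16 with the Transformers library. The repo id `ENERGY-DRINK-LOVE/nox_DPOv3` is the one shown on the model page; `device_map="auto"` (which needs the `accelerate` package) and the helper names are assumptions, and the estimate ignores activations and the KV cache.

```python
MODEL_ID = "ENERGY-DRINK-LOVE/nox_DPOv3"  # repo id as shown on the model page
PARAMS = 10.7e9        # parameter count from the card
BYTES_PER_BF16 = 2     # bfloat16 stores each value in 16 bits


def weight_memory_gib(n_params, bytes_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


def load_model(model_id=MODEL_ID):
    """Load tokenizer and model in BF16.

    Imports are kept inside the function so the estimate above runs
    without torch or transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # match the checkpoint's tensor type
        device_map="auto",           # assumption: let accelerate place layers
    )
    return tokenizer, model


if __name__ == "__main__":
    print(f"{weight_memory_gib(PARAMS, BYTES_PER_BF16):.1f} GiB")  # 19.9 GiB
```

At roughly 20 GiB for weights, the model should fit on a single 24 GB or larger GPU for inference with short contexts, though actual headroom depends on sequence length and batch size.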
