nhn_dpo_v3_nox-solar-10.7b-v4_DPO

Our Team

  • Youjin Chung
  • Jingyeom Kim

Model

Base Model

Hardware and Software

  • Hardware: 8 × A100 GPUs, used for training
  • Software: DeepSpeed library and Hugging Face TRL trainer
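
The model name and dataset label point to Direct Preference Optimization (DPO), but the card gives no hyperparameters or training script. For orientation only, here is a minimal sketch of the per-example DPO objective in plain Python. The function name `dpo_loss`, the `beta` value, and the log-probabilities in the usage lines are illustrative assumptions, not the authors' settings.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    under the trained policy or the frozen reference model.
    beta=0.1 is an illustrative value, not taken from the model card.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # measured relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) == softplus(-z), written in a numerically stable form.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Made-up log-probabilities: the policy has shifted toward the chosen answer.
no_preference = dpo_loss(0.0, 0.0, 0.0, 0.0)
prefers_chosen = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

When the policy and the reference agree, the loss equals log 2. It falls as the policy assigns relatively more probability to the chosen response than the reference does.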

Dataset

  • DPO_dataset
    • In-house DPO dataset (built from AI-hub datasets)
    • Translations of English datasets such as OpenOrca DPO (translated with our own model, ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024)
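
The schema of the dataset is not given in this card. As a hedged illustration, the sketch below builds one preference record with the `prompt` / `chosen` / `rejected` fields that TRL's DPO trainer accepts. The helper name `make_preference_record` and the sample strings are hypothetical.

```python
from typing import Dict

# Field names follow the common TRL preference format; that the authors'
# dataset uses exactly these columns is an assumption.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def make_preference_record(prompt: str, chosen: str, rejected: str) -> Dict[str, str]:
    """Build one preference pair and reject obviously unusable rows."""
    record = {
        "prompt": prompt.strip(),
        "chosen": chosen.strip(),
        "rejected": rejected.strip(),
    }
    for key in REQUIRED_KEYS:
        if not record[key]:
            raise ValueError(f"empty field: {key}")
    # A pair with identical responses carries no preference signal.
    if record["chosen"] == record["rejected"]:
        raise ValueError("chosen and rejected responses must differ")
    return record


# Hypothetical example row.
row = make_preference_record(
    prompt=" What is the capital of Korea? ",
    chosen="The capital of South Korea is Seoul.",
    rejected="I am not sure.",
)
```

Validating rows this way before training is one simple guard against translated pairs whose two responses collapsed to the same text.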

Training Method

Benchmark

Ko LM Eval Harness

0-shot (macro F1)

kobest_boolq  kobest_copa  kobest_hellaswag  kobest_sentineg
0.931613      0.740751     0.468602          0.488465
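
The scores above are macro-averaged F1 values. Here is a minimal sketch of how macro F1 is computed from labels and predictions. The function name `macro_f1` and the toy labels are illustrative, and this is not the evaluation harness's own code.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy binary example: class 1 has F1 = 2/3, class 0 has F1 = 0.8.
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Because every class counts equally, macro F1 penalizes a model that does well only on the majority class.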

Model size: 10.7B params (Safetensors)
Tensor type: BF16