# nhn_dpo_v3_nox-solar-10.7b-v4_DPO

## Our Team
- Youjin Chung
- Jingyeom Kim
## Model

### Base Model

- davidkim205/nox-solar-10.7b-v4
## Hardware and Software

- Hardware: 8× NVIDIA A100 GPUs for training our model
- Software: DeepSpeed library & Hugging Face TRL Trainer
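The card names DeepSpeed but does not publish the configuration used. As illustration only, a minimal ZeRO-3 config of the kind commonly paired with the Hugging Face Trainer looks like the following (all values are hypothetical, not the authors' actual settings; `"auto"` entries are filled in by the Trainer integration):

```json
{
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "train_batch_size": "auto"
}
```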
## Dataset

- DPO dataset
  - Self-built DPO dataset (built from AI-Hub datasets)
  - Translations of English datasets such as OpenOrca DPO (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024, translated with our own model)
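The datasets above feed a DPO trainer, which expects preference pairs: one prompt with a preferred and a rejected response. A minimal sketch of that record shape (the sample text and the `is_valid_pair` helper are illustrative, not from the actual dataset):

```python
# Fields a preference-pair record needs for DPO-style training.
REQUIRED_KEYS = {"prompt", "chosen", "rejected"}

def is_valid_pair(record: dict) -> bool:
    """Check that a record has all three fields as non-empty strings."""
    return (REQUIRED_KEYS <= record.keys()
            and all(isinstance(record[k], str) and record[k] for k in REQUIRED_KEYS))

# A hypothetical translated preference pair.
sample = {
    "prompt": "Where is the capital of South Korea?",
    "chosen": "The capital of South Korea is Seoul.",
    "rejected": "I don't know.",
}
```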
## Training Method
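The card states only that DPO was used. As background, the standard per-example DPO objective penalizes the policy when the chosen response's log-probability margin over the reference model falls behind the rejected response's margin. A self-contained sketch of that loss (not the authors' training code):

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each argument is a summed token log-probability of a full response.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # Numerically plain logistic loss; frameworks use a stable logsigmoid instead.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When policy and reference agree on both responses, the margins cancel and the loss is log 2; increasing the policy's preference for the chosen response drives the loss toward zero.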
## Benchmark

### 0-shot (macro F1)
| kobest_boolq | kobest_copa | kobest_hellaswag | kobest_sentineg |
|---|---|---|---|
| 0.931613 | 0.740751 | 0.468602 | 0.488465 |
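The scores above are macro F1: the unweighted mean of per-class F1 scores, so minority classes count as much as majority ones. A small reference implementation of the metric (illustrative; the actual numbers come from a 0-shot evaluation harness):

```python
def macro_f1(y_true: list, y_pred: list) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)
```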
## Model tree for ENERGY-DRINK-LOVE/nox_DPOv3

- Base model: davidkim205/nox-solar-10.7b-v4