---
license: cc-by-4.0
base_model: davidkim205/komt-solar-10.7b-sft-v5
tags:
- trl
- dpo
- generated_from_trainer
model-index:
- name: nhn_dpo_v3_komt-solar-10.7b-sft-v5_DPO
  results: []
---

# ENERGY-DRINK-LOVE/komt_DPOv3

### Our Team
* Youjin Chung
* Jingyeom Kim

## Model

### Base Model
* [davidkim205/komt-solar-10.7b-sft-v5](https://huggingface.co/davidkim205/komt-solar-10.7b-sft-v5)

### Hardware and Software
* Hardware: 8 × A100 GPUs for training
* Software: DeepSpeed library and Hugging Face TRL trainer

### Dataset
* DPO_dataset
  * In-house DPO dataset (built using AI-hub datasets)
  * Translations of English datasets such as OpenOrca DPO (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024, using our in-house model)

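DPO training data is usually organized as preference pairs. The sketch below shows one such record in the `prompt` / `chosen` / `rejected` layout that the Hugging Face TRL DPO trainer accepts. The example text, the `REQUIRED_KEYS` constant, and the `is_valid_preference_record` helper are hypothetical illustrations; the actual schema and contents of DPO_dataset are not documented in this card.

```python
# Hypothetical preference record; the real dataset's fields and contents may differ.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")

example_record = {
    "prompt": "What is the capital of South Korea?",
    "chosen": "The capital of South Korea is Seoul.",
    "rejected": "The capital of South Korea is Busan.",
}


def is_valid_preference_record(record: dict) -> bool:
    """Check that every required key holds a non-empty string
    and that the chosen and rejected responses differ."""
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            return False
    return record["chosen"] != record["rejected"]


print(is_valid_preference_record(example_record))
```

A check like this can be run over a dataset before training to filter out pairs that carry no preference signal.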
### Training Method
* [DPO](https://arxiv.org/abs/2305.18290)

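For readers unfamiliar with DPO, the per-example loss from the linked paper can be sketched in plain Python as follows. This is an illustration of the objective only, not the training code used for this model. The function name `dpo_loss` and the default `beta=0.1` are assumptions; the card does not state the value of beta that was used.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    Inputs are summed log-probabilities of each response under the policy
    and the frozen reference model. beta=0.1 is an assumed default.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy favors the chosen response more than the reference does,
# so the loss falls below log(2), the value at zero margin.
print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```

The loss shrinks as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference model.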
## Benchmark

**[Ko LM Eval Harness](https://github.com/Beomi/ko-lm-evaluation-harness)**

**[Ko-LLM-Leaderboard](https://www.aihub.or.kr/leaderboard/view.do?currMenu=500&topMenu=102)**
* Ranked 4th as of 2024-03-16
* ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6551c0e37bbfce18781a8748/xKS2X4hfrs100mpr4Jv89.png)

| Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
| ------: | -----: | -----------: | ------: | ------------: | --------------: |
|   61.20 |  57.51 |        70.33 |   53.34 |         68.49 |           56.32 |
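
The Average column can be reproduced from the five task scores, assuming it is their unweighted mean (an assumption that is consistent with the reported numbers). A minimal check:

```python
# Task scores copied from the leaderboard table above.
scores = {
    "Ko-ARC": 57.51,
    "Ko-HellaSwag": 70.33,
    "Ko-MMLU": 53.34,
    "Ko-TruthfulQA": 68.49,
    "Ko-CommonGen V2": 56.32,
}

# Assumption: the leaderboard average is the unweighted mean of the tasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 61.20
```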