---
license: mit
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
- jondurbin/truthy-dpo-v0.1
- argilla/distilabel-math-preference-dpo
- argilla/distilabel-capybara-dpo-7k-binarized
language:
- en
library_name: adapter-transformers
base_model: Technoculture/MT7Bi-sft
---

# Technoculture/MT7Bi-alpha-dpo-v-0.2

## Open LLM Leaderboard

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63486df1f8f01fcc4b23e97d/luE7VGrwNmRediBuPGtfI.png)

| Model Name           | ARC      | HellaSwag | MMLU     | TruthfulQA | Winogrande | GSM8K    |
| -------------------- | -------- | --------- | -------- | ---------- | ---------- | -------- |
| Orca-2-7b            | **78.4** | 76.1      | **53.7** | **52.4**   | **74.2**   | **47.2** |
| LLAMA-2-7b           | 43.2     | **77.1**  | 44.4     | 38.7       | 69.5       | 16.0     |
| MT7Bi-sft            | 54.1     | 75.11     | -        | 43.08      | 72.14      | 15.54    |
| MT7Bi-alpha-dpo-v0.2 | 54.69    | 75.89     | 52.82    | 45.48      | 71.58      | 25.93    |

## Training Details

- **GPU:** Nvidia A100 Tensor Core GPU
- **Total Batches:** 4266
- **Epochs:** 3
- **Duration:** 3 hours, 59 minutes, and 55 seconds
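
As a quick sanity check, the figures above imply a training throughput of roughly 0.3 batches per second. This is a plain arithmetic sketch, assuming "Total Batches" counts every batch processed across all 3 epochs:

```python
# Back-of-envelope throughput from the training details above.
# Assumption: 4266 is the total batch count over all 3 epochs.
total_batches = 4266
duration_s = 3 * 3600 + 59 * 60 + 55  # 3 h 59 min 55 s

print(f"{total_batches / duration_s:.2f} batches/s")
print(f"{duration_s / total_batches:.2f} s/batch")
```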


## DPO Training Dataset Mixture

| Dataset Name                                 | Original Size (Rows) | Ratio | Size After Ratio (Rows) |
| -------------------------------------------- | -------------------- | ----- | ----------------------- |
| argilla/distilabel-math-preference-dpo       | 2.4k                 | 1.0   | 2.4k                    |
| argilla/distilabel-intel-orca-dpo-pairs      | 12.9k                | 0.5   | 6.45k                   |
| jondurbin/truthy-dpo-v0.1                    | 1.04k                | 1.0   | 1.04k                   |
| argilla/distilabel-capybara-dpo-7k-binarized | 7.5k                 | 0.2   | 1.5k                    |

**Total Size:** 11.38k
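
The sampled sizes follow directly from the original row counts and ratios. The sketch below recomputes them; with the rounded row counts quoted in the table it yields ≈11.39k, so the 11.38k total presumably reflects the exact pre-rounding dataset sizes:

```python
# Recompute the DPO mixture sizes from the table above.
# Row counts are the rounded figures quoted in the table, not exact sizes.
mixture = {
    "argilla/distilabel-math-preference-dpo": (2400, 1.0),
    "argilla/distilabel-intel-orca-dpo-pairs": (12900, 0.5),
    "jondurbin/truthy-dpo-v0.1": (1040, 1.0),
    "argilla/distilabel-capybara-dpo-7k-binarized": (7500, 0.2),
}

sampled = {name: round(rows * ratio) for name, (rows, ratio) in mixture.items()}
total = sum(sampled.values())

for name, n in sampled.items():
    print(f"{name}: {n} rows")
print(f"total ≈ {total / 1000:.2f}k rows")
```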

## Training Loss Plot
![image/png](https://cdn-uploads.huggingface.co/production/uploads/658bed1c8ff537204fbd92a3/CKi7ArBnCyuidJPHo3M5T.png)

## Training Loss Smoothed Plot
![image/png](https://cdn-uploads.huggingface.co/production/uploads/658bed1c8ff537204fbd92a3/tFyGJLw3Vj3m2jaaWk66E.png)

For full details of this DPO training, please go through our notebook:

<a target="_blank" href="https://colab.research.google.com/github/dkshjn/Technoculture/blob/main/MT7Bi_alpha_dpo_v0_2.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>