---
library_name: transformers
license: llama3
datasets:
- 2A2I/argilla-dpo-mix-7k-arabic
language:
- ar
pipeline_tag: text-generation
---

# 👳 Arabic ORPO LLAMA 3
<center>
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6116d0584ef9fdfbf45dc4d9/3ns3O_bWYxKEXmozA073h.png">
</center>


## 👓 Story first

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), trained with [ORPO](https://github.com/xfactlab/orpo) on [2A2I/argilla-dpo-mix-7k-arabic](https://huggingface.co/datasets/2A2I/argilla-dpo-mix-7k-arabic).
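
For readers new to ORPO, here is a minimal sketch of the objective it optimizes on a single (chosen, rejected) pair: a standard SFT term on the chosen answer plus a weighted odds-ratio penalty. It works on toy scalar log-probabilities rather than real model outputs, and the names `log_odds`, `orpo_loss`, and the weight `lam=0.1` are illustrative assumptions, not the settings used to train this model.

```python
import math

def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) given log p, for p strictly between 0 and 1."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """Toy ORPO loss for one preference pair.

    chosen_logp / rejected_logp are length-averaged log-likelihoods of the
    chosen and rejected answers. lam is a hypothetical weight for the
    odds-ratio term (an assumption, not this model's training value).
    """
    # SFT term: negative log-likelihood of the chosen answer
    sft_term = -chosen_logp
    # Log odds ratio between the chosen and rejected answers
    log_or = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(log_or)), written as log(1 + exp(-log_or))
    odds_ratio_term = math.log1p(math.exp(-log_or))
    return sft_term + lam * odds_ratio_term

# The loss is lower when the chosen answer is the more likely one
print(orpo_loss(math.log(0.6), math.log(0.2)))
print(orpo_loss(math.log(0.2), math.log(0.6)))
```

The odds-ratio term is what pushes the model away from rejected answers without needing a separate reference model, which is the main practical difference from DPO.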

I wanted to try ORPO and see whether it could better align an English-biased model like **llama3** to the Arabic language, or whether it would fail.

While the evaluations favour the base llama3 over my fine-tune, in practice I found that my fine-tune was much better at producing coherent (mostly correct) Arabic text, which I find interesting.

I would encourage everyone to try out the model [here](https://huggingface.co/spaces/MohamedRashad/Arabic-Chatbot-Arena) and share their insights with me ^^

## 🤔 Evaluation and Results

These results were produced using [lighteval](https://github.com/huggingface/lighteval) with the __community|arabic_mmlu__ tasks.

| Subject                          | Llama-3-8B-Instruct | Arabic-ORPO-Llama-3-8B-Instruct  |
|----------------------------------|---------------------|----------------------------------|
| **All**                          | **0.348**           | **0.317**                        |
| Abstract Algebra                 | 0.310               | 0.230                            |
| Anatomy                          | 0.385               | 0.348                            |
| Astronomy                        | 0.388               | 0.316                            |
| Business Ethics                  | 0.480               | 0.370                            |
| Clinical Knowledge               | 0.396               | 0.385                            |
| College Biology                  | 0.347               | 0.299                            |
| College Chemistry                | 0.180               | 0.250                            |
| College Computer Science         | 0.250               | 0.190                            |
| College Mathematics              | 0.260               | 0.280                            |
| College Medicine                 | 0.231               | 0.249                            |
| College Physics                  | 0.225               | 0.216                            |
| Computer Security                | 0.470               | 0.440                            |
| Conceptual Physics               | 0.315               | 0.404                            |
| Econometrics                     | 0.263               | 0.272                            |
| Electrical Engineering           | 0.414               | 0.359                            |
| Elementary Mathematics           | 0.320               | 0.272                            |
| Formal Logic                     | 0.270               | 0.214                            |
| Global Facts                     | 0.320               | 0.320                            |
| High School Biology              | 0.332               | 0.335                            |
| High School Chemistry            | 0.256               | 0.296                            |
| High School Computer Science     | 0.350               | 0.300                            |
| High School European History     | 0.224               | 0.242                            |
| High School Geography            | 0.323               | 0.364                            |
| High School Government & Politics| 0.352               | 0.285                            |
| High School Macroeconomics       | 0.290               | 0.285                            |
| High School Mathematics          | 0.237               | 0.278                            |
| High School Microeconomics       | 0.231               | 0.273                            |
| High School Physics              | 0.252               | 0.225                            |
| High School Psychology           | 0.316               | 0.330                            |
| High School Statistics           | 0.199               | 0.176                            |
| High School US History           | 0.284               | 0.250                            |
| High School World History        | 0.312               | 0.274                            |
| Human Aging                      | 0.369               | 0.430                            |
| Human Sexuality                  | 0.481               | 0.321                            |
| International Law                | 0.603               | 0.405                            |
| Jurisprudence                    | 0.491               | 0.370                            |
| Logical Fallacies                | 0.368               | 0.276                            |
| Machine Learning                 | 0.214               | 0.312                            |
| Management                       | 0.350               | 0.379                            |
| Marketing                        | 0.521               | 0.547                            |
| Medical Genetics                 | 0.320               | 0.330                            |
| Miscellaneous                    | 0.446               | 0.443                            |
| Moral Disputes                   | 0.422               | 0.306                            |
| Moral Scenarios                  | 0.248               | 0.241                            |
| Nutrition                        | 0.412               | 0.346                            |
| Philosophy                       | 0.408               | 0.328                            |
| Prehistory                       | 0.429               | 0.349                            |
| Professional Accounting          | 0.344               | 0.273                            |
| Professional Law                 | 0.306               | 0.244                            |
| Professional Medicine            | 0.228               | 0.206                            |
| Professional Psychology          | 0.337               | 0.315                            |
| Public Relations                 | 0.391               | 0.373                            |
| Security Studies                 | 0.469               | 0.335                            |
| Sociology                        | 0.498               | 0.408                            |
| US Foreign Policy                | 0.590               | 0.490                            |
| Virology                         | 0.422               | 0.416                            |
| World Religions                  | 0.404               | 0.304                            |
| Average (All Communities)        | 0.348               | 0.317                            |
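
To see where the fine-tune gains or loses the most, the per-subject difference can be computed directly. The sketch below uses only a handful of rows copied from the table above (not the full benchmark), and the helper name `delta` is just for illustration.

```python
# (subject, base Llama-3-8B-Instruct, Arabic ORPO fine-tune), copied from the table
scores = [
    ("All", 0.348, 0.317),
    ("College Chemistry", 0.180, 0.250),
    ("Conceptual Physics", 0.315, 0.404),
    ("Human Aging", 0.369, 0.430),
    ("Machine Learning", 0.214, 0.312),
    ("International Law", 0.603, 0.405),
    ("Human Sexuality", 0.481, 0.321),
]

def delta(row):
    """Fine-tune score minus base score, rounded to the table's precision."""
    _, base, finetuned = row
    return round(finetuned - base, 3)

# Subjects in this sample where the fine-tune scores higher than the base model
gains = [name for name, base, finetuned in scores if finetuned > base]

# Print from the largest drop to the largest gain
for row in sorted(scores, key=delta):
    print(f"{row[0]:<20} {delta(row):+.3f}")
```

In this sample the largest drops are in International Law and Human Sexuality, while Machine Learning and Conceptual Physics show the largest gains.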