---
library_name: transformers
license: llama3
datasets:
- 2A2I/argilla-dpo-mix-7k-arabic
language:
- ar
pipeline_tag: text-generation
---
# 👳 Arabic ORPO LLAMA 3
<center>
<img src="https://cdn-uploads.huggingface.co/production/uploads/6116d0584ef9fdfbf45dc4d9/3ns3O_bWYxKEXmozA073h.png">
</center>
## 👓 Story first
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), trained with [ORPO](https://github.com/xfactlab/orpo) on [2A2I/argilla-dpo-mix-7k-arabic](https://huggingface.co/datasets/2A2I/argilla-dpo-mix-7k-arabic).
I wanted to try ORPO and see whether it would better align an English-biased model like **llama3** to the Arabic language, or whether it would fail.
While the evaluations favour the base llama3 over my fine-tune, in practice I found that my fine-tune was much better at producing coherent (and mostly correct) Arabic text, which I find interesting.
I would encourage everyone to try out the model [here](https://huggingface.co/spaces/MohamedRashad/Arabic-Chatbot-Arena) and share their insights with me ^^
## 🤔 Evaluation and Results
These results were obtained using [lighteval](https://github.com/huggingface/lighteval) with the __community|arabic_mmlu__ tasks.
| Community | Llama-3-8B-Instruct | Arabic-ORPO-Llama-3-8B-Instruct |
|----------------------------------|---------------------|----------------------------------|
| **All** | **0.348** | **0.317** |
| Abstract Algebra | 0.310 | 0.230 |
| Anatomy | 0.385 | 0.348 |
| Astronomy | 0.388 | 0.316 |
| Business Ethics | 0.480 | 0.370 |
| Clinical Knowledge | 0.396 | 0.385 |
| College Biology | 0.347 | 0.299 |
| College Chemistry | 0.180 | 0.250 |
| College Computer Science | 0.250 | 0.190 |
| College Mathematics | 0.260 | 0.280 |
| College Medicine | 0.231 | 0.249 |
| College Physics | 0.225 | 0.216 |
| Computer Security | 0.470 | 0.440 |
| Conceptual Physics | 0.315 | 0.404 |
| Econometrics | 0.263 | 0.272 |
| Electrical Engineering | 0.414 | 0.359 |
| Elementary Mathematics | 0.320 | 0.272 |
| Formal Logic | 0.270 | 0.214 |
| Global Facts | 0.320 | 0.320 |
| High School Biology | 0.332 | 0.335 |
| High School Chemistry | 0.256 | 0.296 |
| High School Computer Science | 0.350 | 0.300 |
| High School European History | 0.224 | 0.242 |
| High School Geography | 0.323 | 0.364 |
| High School Government & Politics| 0.352 | 0.285 |
| High School Macroeconomics | 0.290 | 0.285 |
| High School Mathematics | 0.237 | 0.278 |
| High School Microeconomics | 0.231 | 0.273 |
| High School Physics | 0.252 | 0.225 |
| High School Psychology | 0.316 | 0.330 |
| High School Statistics | 0.199 | 0.176 |
| High School US History | 0.284 | 0.250 |
| High School World History | 0.312 | 0.274 |
| Human Aging | 0.369 | 0.430 |
| Human Sexuality | 0.481 | 0.321 |
| International Law | 0.603 | 0.405 |
| Jurisprudence | 0.491 | 0.370 |
| Logical Fallacies | 0.368 | 0.276 |
| Machine Learning | 0.214 | 0.312 |
| Management | 0.350 | 0.379 |
| Marketing | 0.521 | 0.547 |
| Medical Genetics | 0.320 | 0.330 |
| Miscellaneous | 0.446 | 0.443 |
| Moral Disputes | 0.422 | 0.306 |
| Moral Scenarios | 0.248 | 0.241 |
| Nutrition | 0.412 | 0.346 |
| Philosophy | 0.408 | 0.328 |
| Prehistory | 0.429 | 0.349 |
| Professional Accounting | 0.344 | 0.273 |
| Professional Law | 0.306 | 0.244 |
| Professional Medicine | 0.228 | 0.206 |
| Professional Psychology | 0.337 | 0.315 |
| Public Relations | 0.391 | 0.373 |
| Security Studies | 0.469 | 0.335 |
| Sociology | 0.498 | 0.408 |
| US Foreign Policy | 0.590 | 0.490 |
| Virology | 0.422 | 0.416 |
| World Religions | 0.404 | 0.304 |
| Average (All Communities) | 0.348 | 0.317 |
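
To see where the fine-tune gains or loses ground, the per-task gap can be computed straight from the table. The snippet below is only a minimal sketch: it uses a handful of rows copied from the table above (not the full list), and the helper names `deltas` and `split_by_winner` are made up for illustration and are not part of lighteval.

```python
# A few rows copied from the table above: task -> (base score, fine-tune score).
# This is a subset chosen for illustration, not the whole benchmark.
scores = {
    "College Chemistry": (0.180, 0.250),
    "Conceptual Physics": (0.315, 0.404),
    "Global Facts": (0.320, 0.320),
    "Human Aging": (0.369, 0.430),
    "Human Sexuality": (0.481, 0.321),
    "International Law": (0.603, 0.405),
    "Machine Learning": (0.214, 0.312),
}


def deltas(rows):
    """Return {task: fine-tune score minus base score}, rounded to 3 decimals."""
    return {task: round(ft - base, 3) for task, (base, ft) in rows.items()}


def split_by_winner(rows):
    """Group task names into (improved, regressed, tied) for the fine-tune."""
    d = deltas(rows)
    improved = sorted(task for task, gap in d.items() if gap > 0)
    regressed = sorted(task for task, gap in d.items() if gap < 0)
    tied = sorted(task for task, gap in d.items() if gap == 0)
    return improved, regressed, tied


if __name__ == "__main__":
    # Print the gaps from the largest regression to the largest improvement.
    for task, gap in sorted(deltas(scores).items(), key=lambda kv: kv[1]):
        print(f"{task:<20} {gap:+.3f}")
```

On this subset the largest drops are in International Law and Human Sexuality, while Machine Learning and Conceptual Physics improve. Keep in mind that these are multiple-choice scores, so they do not measure how fluent the generated Arabic text is.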