---
language:
- id
- en
license: cc-by-nc-sa-4.0
datasets:
- wikipedia
- Ichsan2895/OASST_Top1_Indonesian
- Ichsan2895/alpaca-gpt4-indonesian
pipeline_tag: text-generation
model-index:
- name: Merak-7B-v5-PROTOTYPE1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 62.2
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ichsan2895/Merak-7B-v5-PROTOTYPE1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 82.07
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ichsan2895/Merak-7B-v5-PROTOTYPE1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 60.97
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ichsan2895/Merak-7B-v5-PROTOTYPE1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 45.41
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ichsan2895/Merak-7B-v5-PROTOTYPE1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.9
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ichsan2895/Merak-7B-v5-PROTOTYPE1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 37.23
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ichsan2895/Merak-7B-v5-PROTOTYPE1
      name: Open LLM Leaderboard
---
<div style="width: auto; margin-left: auto; margin-right: auto">
<img src="https://huggingface.co/Ichsan2895/Merak-7B-v4/resolve/main/FINAL_LOGO/6.png" alt="MERAK" style="width: 50%; min-width: 100px; display: block; margin: auto;">
</div>

# THIS IS THE 1st PROTOTYPE OF MERAK-7B-v5!

Merak-7B is a Large Language Model for the Indonesian language.

This model is based on [Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca) and fine-tuned on a selection of Indonesian Wikipedia articles that I cleaned beforehand.

Thanks to QLoRA ([QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)), Merak-7B can run on 16 GB of VRAM. We also use `DPOTrainer` from the TRL library for RLHF.
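As a rough illustration of why 4-bit quantization fits in 16 GB of VRAM, the sketch below estimates the memory taken by the weights alone and shows one possible way to load the model in 4-bit with `transformers` and `bitsandbytes`. It is not the exact setup used to train or evaluate Merak-7B: the nominal count of 7 billion parameters (read off the "7B" in the name), the NF4 settings, and the helper names `estimate_weight_gib` and `load_merak_4bit` are assumptions made for this example. The estimate ignores activations and the KV cache, so real usage will be higher.

```python
MODEL_ID = "Ichsan2895/Merak-7B-v5-PROTOTYPE1"  # repo id as listed in this card's metadata
NOMINAL_PARAMS = 7e9  # assumption: "7B" read as roughly 7 billion parameters


def estimate_weight_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def load_merak_4bit(model_id: str = MODEL_ID):
    """Load the tokenizer and a 4-bit quantized model (hypothetical helper).

    Needs a CUDA GPU plus the torch, transformers, accelerate and
    bitsandbytes packages. The quantization settings are illustrative.
    """
    # Imported lazily so the estimate above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )
    return tokenizer, model


if __name__ == "__main__":
    for bits in (16, 4):
        gib = estimate_weight_gib(NOMINAL_PARAMS, bits)
        print(f"{bits}-bit weights: ~{gib:.1f} GiB")
```

Under these assumptions the weights take about 13 GiB at 16 bits and a little over 3 GiB at 4 bits, which leaves room for activations and the KV cache on a 16 GB card.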

Licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), Merak-7B is intended for AI enthusiasts and researchers alike.

Big thanks to all my friends and communities who helped build our first model. Thanks to Axolotl for a great tool designed to streamline the fine-tuning of various AI models.

Feel free to ask me about the model, and please share the news on your social media.

## CITATION
```
@software{lian2023mistralorca1,
  title = {MistralOrca: Mistral-7B Model Instruct-tuned on Filtered OpenOrcaV1 GPT-4 Dataset},
  author = {Wing Lian and Bleys Goodson and Guan Wang and Eugene Pentland and Austin Cook and Chanvichet Vong and "Teknium"},
  year = {2023},
  publisher = {HuggingFace},
  journal = {HuggingFace repository},
  howpublished = {\url{https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca}},
}

@misc{mukherjee2023orca,
      title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4}, 
      author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
      year={2023},
      eprint={2306.02707},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@ONLINE{wikidump,
    author = "Wikimedia Foundation",
    title  = "Wikimedia Downloads",
    url    = "https://dumps.wikimedia.org"
}

@inproceedings{wolf-etal-2020-transformers,
    title = "Transformers: State-of-the-Art Natural Language Processing",
    author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = oct,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
    pages = "38--45"
}

@misc{vonwerra2022trl,
  author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang},
  title = {TRL: Transformer Reinforcement Learning},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/trl}}
}

@article{dettmers2023qlora,
  title   = {QLoRA: Efficient Finetuning of Quantized LLMs},
  author  = {Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  journal = {arXiv preprint arXiv:2305.14314},
  year    = {2023}
}
```
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

## HOW TO CITE THIS PROJECT

If you use the Merak-7B model in your research or project, please cite it as:
```
@article{Merak,
  title={Merak-7B: The LLM for Bahasa Indonesia},
  author={Muhammad Ichsan},
  publisher={Hugging Face},
  journal={Hugging Face Repository},
  year={2023}
}
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Ichsan2895__Merak-7B-v5-PROTOTYPE1).

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |60.96|
|AI2 Reasoning Challenge (25-Shot)|62.20|
|HellaSwag (10-Shot)              |82.07|
|MMLU (5-Shot)                    |60.97|
|TruthfulQA (0-shot)              |45.41|
|Winogrande (5-shot)              |77.90|
|GSM8k (5-shot)                   |37.23|
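
The "Avg." row appears to be the plain arithmetic mean of the six benchmark scores. A minimal check, assuming equal weighting of the benchmarks and using the values copied from the table above:

```python
# Scores copied from the leaderboard table in this card.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 62.20,
    "HellaSwag (10-Shot)": 82.07,
    "MMLU (5-Shot)": 60.97,
    "TruthfulQA (0-shot)": 45.41,
    "Winogrande (5-shot)": 77.90,
    "GSM8k (5-shot)": 37.23,
}

# Unweighted mean across benchmarks (assumed to be how "Avg." is computed).
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```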