--- language: - en license: other library_name: transformers datasets: - psmathur/orca_mini_v1_dataset - ehartford/dolphin pipeline_tag: text-generation model-index: - name: orca_mini_v3_7b results: - task: type: text-generation name: Text Generation dataset: name: AI2 Reasoning Challenge (25-Shot) type: ai2_arc config: ARC-Challenge split: test args: num_few_shot: 25 metrics: - type: acc_norm value: 56.91 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: HellaSwag (10-Shot) type: hellaswag split: validation args: num_few_shot: 10 metrics: - type: acc_norm value: 79.64 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU (5-Shot) type: cais/mmlu config: all split: test args: num_few_shot: 5 metrics: - type: acc value: 52.37 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: TruthfulQA (0-shot) type: truthful_qa config: multiple_choice split: validation args: num_few_shot: 0 metrics: - type: mc2 value: 50.51 source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: Winogrande (5-shot) type: winogrande config: winogrande_xl split: validation args: num_few_shot: 5 metrics: - type: acc value: 74.27 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GSM8k (5-shot) type: gsm8k config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 7.13 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=psmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: IFEval (0-Shot) type: HuggingFaceH4/ifeval args: num_few_shot: 0 metrics: - type: inst_level_strict_acc and prompt_level_strict_acc value: 28.21 name: strict accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: BBH (3-Shot) type: BBH args: num_few_shot: 3 metrics: - type: acc_norm value: 17.84 name: normalized accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MATH Lvl 5 (4-Shot) type: hendrycks/competition_math args: num_few_shot: 4 metrics: - type: exact_match value: 0.3 name: exact match source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GPQA (0-shot) type: Idavidrein/gpqa args: num_few_shot: 0 metrics: - type: acc_norm value: 0.0 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MuSR (0-shot) type: TAUR-Lab/MuSR args: num_few_shot: 0 metrics: - type: acc_norm value: 22.71 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU-PRO (5-shot) type: TIGER-Lab/MMLU-Pro config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 12.04 name: accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=pankajmathur/orca_mini_v3_7b name: Open LLM Leaderboard --- # orca_mini_v3_7b A LLama2-7b model trained on Orca Style datasets.
![orca-mini](https://huggingface.co/psmathur/orca_mini_v3_7b/resolve/main/orca_minis_small.jpeg)
🤔 How good is orca-mini-v3-7b? Do the evaluation results from HuggingFace Open LLM leaderboard translate to real-world use cases? 🔍 Now you can figure it out for yourself! Introducing the orca-mini chatbot powered by the orca-mini-v3-7b model. Dive in and see how the open source 7b model stacks up in the world of massive language models. 🌍 ⏰ Hurry up before I run out of GPU credits! 😉 Check it out here 👉 [https://huggingface.co/spaces/psmathur/psmathur-orca_mini_v3_7b](https://huggingface.co/spaces/psmathur/psmathur-orca_mini_v3_7b)
**P.S. If you're interested to collaborate, please connect with me at www.linkedin.com/in/pankajam.**
### quantized versions Big thanks to [@TheBloke](https://huggingface.co/TheBloke) 1) https://huggingface.co/TheBloke/orca_mini_v3_7B-GGML 2) https://huggingface.co/TheBloke/orca_mini_v3_7B-GPTQ
#### license disclaimer: This model is bound by the license & usage restrictions of the original Llama-2 model. And comes with no warranty or gurantees of any kind.
## evaluation We evaluated orca_mini_v3_7b on a wide range of tasks using [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) from EleutherAI. Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) ||||| |:------:|:--------:|:-------:|:--------:| |**Task**|**Metric**|**Value**|**Stderr**| |*arc_challenge*|acc_norm|0.5717|0.0145| |*hellaswag*|acc_norm|0.7966|0.0043| |*mmlu*|acc_norm|0.5234|0.035| |*truthfulqa_mc*|mc2|0.5029|0.0156| |**Total Average**|-|**0.59865**||
## example esage Here is prompt format ``` ### System: You are an AI assistant that follows instruction extremely well. Help as much as you can. ### User: Tell me about Orcas. ### Assistant: ``` Below shows a code example on how to use this model ```python import torch from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline tokenizer = AutoTokenizer.from_pretrained("psmathur/orca_mini_v3_7b", use_fast=False) model = AutoModelForCausalLM.from_pretrained( "psmathur/orca_mini_v3_7b", torch_dtype=torch.float16, load_in_8bit=True, low_cpu_mem_usage=True, device_map="auto" ) system_prompt = "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n" #generate text steps instruction = "Tell me about Orcas." prompt = f"{system_prompt}### User: {instruction}\n\n### Assistant:\n" inputs = tokenizer(prompt, return_tensors="pt").to("cuda") output = model.generate(**inputs, do_sample=True, top_p=0.95, top_k=0, max_new_tokens=4096) print(tokenizer.decode(output[0], skip_special_tokens=True)) ```
#### limitations & biases: While this model aims for accuracy, it can occasionally produce inaccurate or misleading results. Despite diligent efforts in refining the pretraining data, there remains a possibility for the generation of inappropriate, biased, or offensive content. Exercise caution and cross-check information when necessary.
### citiation: Please kindly cite using the following BibTeX: ``` @misc{orca_mini_v3_7b, author = {Pankaj Mathur}, title = {orca_mini_v3_7b: An explain tuned Llama2-7b model}, year = {2023}, publisher = {GitHub, HuggingFace}, journal = {GitHub repository, HuggingFace repository}, howpublished = {\url{https://https://huggingface.co/psmathur/orca_mini_v3_7b}, } ``` ``` @misc{mukherjee2023orca, title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4}, author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah}, year={2023}, eprint={2306.02707}, archivePrefix={arXiv}, primaryClass={cs.CL} } ``` ``` @software{touvron2023llama, title={LLaMA2: Open and Efficient Foundation Language Models}, author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume}, journal={arXiv preprint arXiv:2302.13971}, year={2023} } ``` # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psmathur__orca_mini_v3_7b) | Metric | Value | |-----------------------|---------------------------| | Avg. | 47.98 | | ARC (25-shot) | 56.91 | | HellaSwag (10-shot) | 79.64 | | MMLU (5-shot) | 52.37 | | TruthfulQA (0-shot) | 50.51 | | Winogrande (5-shot) | 74.27 | | GSM8K (5-shot) | 7.13 | | DROP (3-shot) | 15.06 | # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psmathur__orca_mini_v3_7b) | Metric |Value| |---------------------------------|----:| |Avg. |53.47| |AI2 Reasoning Challenge (25-Shot)|56.91| |HellaSwag (10-Shot) |79.64| |MMLU (5-Shot) |52.37| |TruthfulQA (0-shot) |50.51| |Winogrande (5-shot) |74.27| |GSM8k (5-shot) | 7.13| # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_pankajmathur__orca_mini_v3_7b) | Metric |Value| |-------------------|----:| |Avg. |13.52| |IFEval (0-Shot) |28.21| |BBH (3-Shot) |17.84| |MATH Lvl 5 (4-Shot)| 0.30| |GPQA (0-shot) | 0.00| |MuSR (0-shot) |22.71| |MMLU-PRO (5-shot) |12.04|