--- language: - zh - en license: apache-2.0 datasets: - Azure99/blossom-chat-v3 - Azure99/blossom-math-v4 - Azure99/blossom-wizard-v3 - Azure99/blossom-orca-v3 model-index: - name: blossom-v5-llama3-8b results: - task: type: text-generation name: Text Generation dataset: name: IFEval (0-Shot) type: HuggingFaceH4/ifeval args: num_few_shot: 0 metrics: - type: inst_level_strict_acc and prompt_level_strict_acc value: 43.43 name: strict accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Azure99/blossom-v5-llama3-8b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: BBH (3-Shot) type: BBH args: num_few_shot: 3 metrics: - type: acc_norm value: 18.31 name: normalized accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Azure99/blossom-v5-llama3-8b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MATH Lvl 5 (4-Shot) type: hendrycks/competition_math args: num_few_shot: 4 metrics: - type: exact_match value: 4.0 name: exact match source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Azure99/blossom-v5-llama3-8b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GPQA (0-shot) type: Idavidrein/gpqa args: num_few_shot: 0 metrics: - type: acc_norm value: 2.01 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Azure99/blossom-v5-llama3-8b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MuSR (0-shot) type: TAUR-Lab/MuSR args: num_few_shot: 0 metrics: - type: acc_norm value: 5.31 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Azure99/blossom-v5-llama3-8b name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU-PRO (5-shot) type: TIGER-Lab/MMLU-Pro config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 13.4 name: accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Azure99/blossom-v5-llama3-8b name: Open LLM Leaderboard --- # **BLOSSOM-v5-llama3-8b** [💻Github](https://github.com/Azure99/BlossomLM) • [🚀Blossom Chat Demo](https://blossom-chat.com/) ### What's new? The Blossom V5 series models is fully trained using high-quality data distilled from gpt-4-0125-preview, resulting in significant improvements. ### Introduction Blossom is a conversational large language model, fine-tuned on the Blossom Orca/Wizard/Chat/Math mixed dataset based on the Meta-Llama-3-8B pre-trained model. Blossom possesses robust general capabilities and context comprehension. Additionally, the high-quality Chinese and English datasets used for training have been made open source. Training was conducted in two stages. The first stage used 40K Wizard, 40K Orca, 10K Math single-turn instruction datasets, training for 1 epoch; the second stage used 10K Blossom chat multi-turn dialogue dataset, and 10% randomly sampled data from the first stage, training for 3 epochs. ### Inference Inference is performed in the form of dialogue continuation. Single-turn dialogue ``` A chat between a human and an artificial intelligence bot. The bot gives helpful, detailed, and polite answers to the human's questions. |Human|: hello |Bot|: ``` Multi-turn dialogue ``` A chat between a human and an artificial intelligence bot. The bot gives helpful, detailed, and polite answers to the human's questions. |Human|: hello |Bot|: Hello! How can I assist you today?<|end_of_text|> |Human|: Generate a random number using python |Bot|: ``` Note: At the end of the Bot's output in the historical conversation, append a `<|end_of_text|>`. # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Azure99__blossom-v5-llama3-8b) | Metric |Value| |-------------------|----:| |Avg. |14.41| |IFEval (0-Shot) |43.43| |BBH (3-Shot) |18.31| |MATH Lvl 5 (4-Shot)| 4.00| |GPQA (0-shot) | 2.01| |MuSR (0-shot) | 5.31| |MMLU-PRO (5-shot) |13.40|