This is the WizardLM-13B V1.0 diff weight.

Project Repo: https://github.com/nlpxucan/WizardLM

NOTE: **WizardLM-13B-1.0** and **WizardLM-7B** use different prompts at the beginning of the conversation.

For **WizardLM-13B-1.0**, the prompt should be as follows:

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: hello, who are you? ASSISTANT:
```

For **WizardLM-7B**, the prompt should be as follows:

```
{instruction}\n\n### Response:
```
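The two prompt formats above can be wrapped in a small helper so that the right template is chosen per model. This is a minimal sketch, not code from the WizardLM repository: the function name `build_prompt`, the `model` keys `"13b"` and `"7b"`, and the single-space separators around `USER:` and `ASSISTANT:` are assumptions made for illustration.

```python
# System preamble used by the WizardLM-13B-1.0 prompt (text taken from the note above).
SYSTEM_13B = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(instruction: str, model: str = "13b") -> str:
    """Return a single-turn prompt string for the given model family.

    `model` is a hypothetical selector: "13b" for WizardLM-13B-1.0,
    "7b" for WizardLM-7B. Exact whitespace is an assumption.
    """
    if model == "13b":
        # Chat-style prompt: preamble, then the user turn, then the assistant cue.
        return f"{SYSTEM_13B} USER: {instruction} ASSISTANT:"
    if model == "7b":
        # Instruction-style prompt: instruction, blank line, response cue.
        return f"{instruction}\n\n### Response:"
    raise ValueError(f"unknown model family: {model!r}")


if __name__ == "__main__":
    print(build_prompt("hello, who are you?", "13b"))
    print(build_prompt("hello, who are you?", "7b"))
```

Keeping the templates in one function makes it harder to send a 7B-style prompt to the 13B model by accident, which the note above warns against.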

🤗 HF Repo • 🐦 Twitter • 📃 [WizardLM] • 📃 [WizardCoder] • 📃 [WizardMath]

👋 Join our Discord

| Model | Checkpoint | Paper | HumanEval | MBPP | Demo | License |
| ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| WizardCoder-Python-34B-V1.0 | 🤗 HF Link | 📃 [WizardCoder] | 73.2 | 61.2 | [Demo](http://47.103.63.15:50085/) | Llama2 |
| WizardCoder-15B-V1.0 | 🤗 HF Link | 📃 [WizardCoder] | 59.8 | 50.6 | -- | OpenRAIL-M |
| WizardCoder-Python-13B-V1.0 | 🤗 HF Link | 📃 [WizardCoder] | 64.0 | 55.6 | -- | Llama2 |
| WizardCoder-3B-V1.0 | 🤗 HF Link | 📃 [WizardCoder] | 34.8 | 37.4 | -- | OpenRAIL-M |
| WizardCoder-1B-V1.0 | 🤗 HF Link | 📃 [WizardCoder] | 23.8 | 28.6 | -- | OpenRAIL-M |

| Model | Checkpoint | Paper | GSM8k | MATH | Online Demo | License |
| ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| WizardMath-70B-V1.0 | 🤗 HF Link | 📃 [WizardMath] | **81.6** | **22.7** | [Demo](http://47.103.63.15:50083/) | Llama 2 |
| WizardMath-13B-V1.0 | 🤗 HF Link | 📃 [WizardMath] | **63.9** | **14.0** | [Demo](http://47.103.63.15:50082/) | Llama 2 |
| WizardMath-7B-V1.0 | 🤗 HF Link | 📃 [WizardMath] | **54.9** | **10.7** | [Demo](http://47.103.63.15:50080/) | Llama 2 |

| Model | Checkpoint | Paper | MT-Bench | AlpacaEval | GSM8k | HumanEval | License |
| ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| **WizardLM-70B-V1.0** | 🤗 HF Link | 📃 **Coming Soon** | **7.78** | **92.91%** | **77.6%** | **50.6 pass@1** | Llama 2 License |
| WizardLM-13B-V1.2 | 🤗 HF Link | | 7.06 | 89.17% | 55.3% | 36.6 pass@1 | Llama 2 License |
| WizardLM-13B-V1.1 | 🤗 HF Link | | 6.76 | 86.32% | | 25.0 pass@1 | Non-commercial |
| WizardLM-30B-V1.0 | 🤗 HF Link | | 7.01 | | | 37.8 pass@1 | Non-commercial |
| WizardLM-13B-V1.0 | 🤗 HF Link | | 6.35 | 75.31% | | 24.0 pass@1 | Non-commercial |
| WizardLM-7B-V1.0 | 🤗 HF Link | 📃 [WizardLM] | | | | 19.1 pass@1 | Non-commercial |

**Github Repo**: https://github.com/nlpxucan/WizardLM/tree/main/WizardMath

**Twitter**: https://twitter.com/WizardLM_AI/status/1689998428200112128
**Discord**: https://discord.gg/VZjjHtWrKs

## WizardLM Inference Demo Script

We provide the WizardLM inference demo code [here](https://github.com/nlpxucan/WizardLM/tree/main/demo).

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_victor123__WizardLM-13B-1.0).

| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 25.09 |
| ARC (25-shot) | 28.5 |
| HellaSwag (10-shot) | 25.97 |
| MMLU (5-shot) | 23.12 |
| TruthfulQA (0-shot) | 48.61 |
| Winogrande (5-shot) | 49.41 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 0.0 |
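As a quick sanity check on the table, the `Avg.` row appears to be the unweighted mean of the seven benchmark scores. That reading is an inference from the numbers rather than something stated here, and the dictionary name `scores` below is only illustrative.

```python
# Benchmark scores copied from the leaderboard table (the "Avg." row is excluded).
scores = {
    "ARC (25-shot)": 28.5,
    "HellaSwag (10-shot)": 25.97,
    "MMLU (5-shot)": 23.12,
    "TruthfulQA (0-shot)": 48.61,
    "Winogrande (5-shot)": 49.41,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 0.0,
}

# Assumption: the reported average is a plain arithmetic mean over all seven tasks.
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(f"Avg. = {average:.2f}")
```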