---
license: apache-2.0
datasets:
- databricks/databricks-dolly-15k
pipeline_tag: text-generation
model-index:
- name: Instruct_Mixtral-8x7B-v0.1_Dolly15K
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 69.28
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.59
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 70.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 64.83
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.56
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 59.44
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Brillibits/Instruct_Mixtral-8x7B-v0.1_Dolly15K
      name: Open LLM Leaderboard
---
# Instruct_Mixtral-8x7B-v0.1_Dolly15K
Fine-tuned from Mixtral-8x7B-v0.1 on the Dolly15k dataset, split 85% training, 14.9% validation, and 0.1% test. Trained for 1.0 epoch using QLoRA with a 1024-token context window.
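To give a feel for what the 85% / 14.9% / 0.1% split means in record counts, here is a minimal sketch. It is not the script used for training: the helper name `split_counts` is hypothetical, the total of 15,011 records is an assumption about the size of Dolly15k, and how the records were actually shuffled or assigned is not stated in this card.

```python
from typing import Sequence, Tuple


def split_counts(
    total: int, fractions: Sequence[float] = (0.85, 0.149, 0.001)
) -> Tuple[int, int, int]:
    """Return (train, validation, test) record counts for the given fractions.

    Hypothetical helper for illustration only. Train and test are rounded,
    and validation takes the remainder so the three counts sum to `total`.
    """
    train = round(total * fractions[0])
    test = round(total * fractions[2])
    validation = total - train - test
    return train, validation, test


# Assumed dataset size; adjust to the actual number of records you load.
train_n, val_n, test_n = split_counts(15011)
print(train_n, val_n, test_n)
```

With these fractions the test set is very small (on the order of 15 records), so it is better suited to a quick sanity check than to a reliable quality estimate.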
# Model Details
* **Trained by**: [Brillibits](https://www.youtube.com/@Brillibits).
* **Model type:** **Instruct_Mixtral-8x7B-v0.1_Dolly15K** is an auto-regressive language model based on the Mixtral sparse mixture-of-experts transformer architecture.
* **Language(s)**: English
* **License for Instruct_Mixtral-8x7B-v0.1_Dolly15K**: apache-2.0 license
# Prompting
## Prompt Template With Context
```
Write a 10-line poem about a given topic
Input:
The topic is about racecars
Output:
```
## Prompt Template Without Context
```
Who was the second president of the United States?
Output:
```
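The two templates above can be produced by one small helper. This is a sketch, not code shipped with the model: the function name `build_prompt` is hypothetical, and the exact whitespace (single newlines, a trailing newline after `Output:`) is an assumption based on the templates as shown.

```python
from typing import Optional


def build_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Format an instruction, with optional context, in the card's template.

    Hypothetical helper: line breaks follow the templates shown above and
    may need adjusting to match the formatting used during fine-tuning.
    """
    lines = [instruction.strip()]
    if context:
        # The context block is only emitted when context is provided.
        lines += ["Input:", context.strip()]
    lines.append("Output:")
    return "\n".join(lines) + "\n"


print(build_prompt("Write a 10-line poem about a given topic",
                   "The topic is about racecars"))
print(build_prompt("Who was the second president of the United States?"))
```

The model's completion is then expected to follow the final `Output:` line.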
## Professional Assistance
Models like this one are useful, but LLMs hold the most promise when they are applied to custom data to automate a wide variety of tasks.
If you have a dataset, want to see whether it can be used to automate some tasks, and are looking for professional assistance, contact me [here](mailto:[email protected]).
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Brillibits__Instruct_Mixtral-8x7B-v0.1_Dolly15K).
| Metric |Value|
|---------------------------------|----:|
|Avg. |72.44|
|AI2 Reasoning Challenge (25-Shot)|69.28|
|HellaSwag (10-Shot) |87.59|
|MMLU (5-Shot) |70.96|
|TruthfulQA (0-shot) |64.83|
|Winogrande (5-shot) |82.56|
|GSM8k (5-shot) |59.44|
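
The Avg. row appears to be the unweighted mean of the six benchmark scores. A short check, assuming that is how the leaderboard aggregates them (the variable names are illustrative):

```python
# Benchmark scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 69.28,
    "HellaSwag (10-Shot)": 87.59,
    "MMLU (5-Shot)": 70.96,
    "TruthfulQA (0-shot)": 64.83,
    "Winogrande (5-shot)": 82.56,
    "GSM8k (5-shot)": 59.44,
}

# Unweighted mean across benchmarks (assumed aggregation method).
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```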