deacon-13b / README.md
license: cc-by-nc-4.0
datasets:
  - KnutJaegersberg/facehugger


This model was fine-tuned on AI-filtered subsets of the GPT-4-based subset of the Dolphin dataset and of EvolInstruct V2. It has not been explicitly aligned to positive, negative, or bureaucratically prescribed value systems. It might kill us all! Time to shit your pants, regulators. I literally put black goo on Dolphin-7B sperm, which then fertilized Evolved Instructions... What's different is evil... ;) I intend to train 3 sizes.

Prompt Example:

### System:

You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.


### Instruction: 

How do you fine tune a large language model? 

### Response:
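
The prompt format above can be assembled programmatically. The following is a minimal sketch, not the author's own tooling: `build_prompt` and `SYSTEM_PROMPT` are hypothetical names, and the exact newline placement between sections is an assumption, since this card does not spell out the whitespace.

```python
# System prompt copied verbatim from the example above (wording left unchanged,
# since the model was presumably trained on this exact text).
SYSTEM_PROMPT = (
    "You are an AI assistant. User will you give you a task. "
    "Your goal is to complete the task as faithfully as you can. "
    "While performing the task think step-by-step and justify your steps."
)


def build_prompt(instruction: str, system: str = SYSTEM_PROMPT) -> str:
    """Return a prompt in the System / Instruction / Response layout.

    Assumption: one newline after each header and one blank line between
    sections. Adjust if the model responds better to another layout.
    """
    return (
        f"### System:\n{system}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


prompt = build_prompt("How do you fine tune a large language model?")
print(prompt)
```

The string ends right after `### Response:` so that the model's generated text continues from there.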

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 46.78 |
| ARC (25-shot) | 57.85 |
| HellaSwag (10-shot) | 82.63 |
| MMLU (5-shot) | 55.25 |
| TruthfulQA (0-shot) | 39.33 |
| Winogrande (5-shot) | 76.32 |
| GSM8K (5-shot) | 10.39 |
| DROP (3-shot) | 5.67 |
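
The reported average is consistent with an unweighted mean of the seven benchmark scores listed above. A quick check, with the scores copied from the results (the `scores` dictionary is just an illustrative container):

```python
# Benchmark scores as reported in the results above.
scores = {
    "ARC (25-shot)": 57.85,
    "HellaSwag (10-shot)": 82.63,
    "MMLU (5-shot)": 55.25,
    "TruthfulQA (0-shot)": 39.33,
    "Winogrande (5-shot)": 76.32,
    "GSM8K (5-shot)": 10.39,
    "DROP (3-shot)": 5.67,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 46.78
```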