Fine-tune of Upstage AI's SOLAR-10.7B-Instruct-v1.0 model, using the OpenHermes, Platypus, and Capybara datasets. Additionally fine-tuned on Jon Durbin's Bagel v0.3, plus a few unreleased datasets.

Fine-tuned on 8x RTX 4090s for 1.25 epochs.

Model Sources

  • Repository: TBD
  • Demo: TBD

Bias, Risks, and Limitations

This fine-tune has had no alignment, safety data, or anything else shoved down its throat.

Training Details

Training Data

See the sidebar for links to the relevant datasets.

Training Procedure

Trained using QLoRA via the Axolotl tool.
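A QLoRA run in Axolotl is driven by a YAML config. A minimal sketch for this setup follows; the adapter hyperparameters (`lora_r`, `lora_alpha`, `lora_dropout`) are illustrative assumptions, not values taken from this card:

```yaml
base_model: upstage/SOLAR-10.7B-Instruct-v1.0
load_in_4bit: true
adapter: qlora
# illustrative adapter hyperparameters -- not from this card
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
num_epochs: 1.25
bf16: true
```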

Evaluation

TBD

Quantization Configuration

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: bfloat16
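For reference, the settings above collected as a plain Python dict; with `transformers` installed, the same keys map onto `transformers.BitsAndBytesConfig` (e.g. `BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", ...)`):

```python
# bitsandbytes quantization config used during training, as listed above.
bnb_config = {
    "quant_method": "bitsandbytes",
    "load_in_8bit": False,
    "load_in_4bit": True,               # 4-bit base weights (QLoRA)
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",       # NormalFloat4 quantization
    "bnb_4bit_use_double_quant": True,  # quantize the quantization constants
    "bnb_4bit_compute_dtype": "bfloat16",
}
```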

Framework versions

  • PEFT 0.6.0

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                              | Value |
| ----------------------------------- | ----- |
| Avg.                                | 70.94 |
| AI2 Reasoning Challenge (25-Shot)   | 69.03 |
| HellaSwag (10-Shot)                 | 87.54 |
| MMLU (5-Shot)                       | 66.19 |
| TruthfulQA (0-shot)                 | 59.17 |
| Winogrande (5-shot)                 | 83.19 |
| GSM8k (5-shot)                      | 60.50 |
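The reported average is the unweighted mean of the six benchmark scores, which can be checked directly:

```python
# Benchmark scores from the table above.
scores = {
    "ARC (25-shot)": 69.03,
    "HellaSwag (10-shot)": 87.54,
    "MMLU (5-shot)": 66.19,
    "TruthfulQA (0-shot)": 59.17,
    "Winogrande (5-shot)": 83.19,
    "GSM8k (5-shot)": 60.50,
}
avg = round(sum(scores.values()) / len(scores), 2)  # 70.94
```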

Model tree for decapoda-research/Antares-11b-v2
