Explain these Benchmark Results

by Joseph717171

@chargoddard, @Crystalcareai, @Undi95 Please explain these benchmark results. How can merging an instruct model with its ancestral base model improve the model on each benchmark? Why is there no degradation in performance or loss in benchmark scores like we typically see in model merges? 🤔

Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base's Benchmarks

| Metric | Value |
|---|---|
| Average Score | 43.06 |
| IFEval (0-Shot) | 80.96 |
| BBH (3-Shot) | 51.10 |
| MATH Lvl 5 (4-Shot) | 15.56 |
| GPQA (0-shot) | 30.96 |
| MuSR (0-shot) | 41.01 |
| MMLU-PRO (5-shot) | 38.80 |

arcee-ai/Llama-3.1-SuperNova-Lite's Benchmarks

| Metric | Value |
|---|---|
| Average Score | 29.73 |
| IFEval (0-Shot) | 80.17 |
| BBH (3-Shot) | 31.57 |
| MATH Lvl 5 (4-Shot) | 15.48 |
| GPQA (0-shot) | 7.49 |
| MuSR (0-shot) | 11.67 |
| MMLU-PRO (5-shot) | 31.97 |

Mergekit Config

```yaml
models:
  - model: "/Users/jsarnecki/opt/Workspace/arcee-ai/Llama-3.1-SuperNova-Lite"
    parameters:
      weight: 1
      density: 1

  - model: "/Users/jsarnecki/opt/Workspace/arcee-ai/Llama-3.1-SuperNova-Lite"
    parameters:
      weight: 1
      density: 1

merge_method: ties
base_model: "/Users/jsarnecki/opt/Workspace/meta-llama/Llama-3.1-8B"
parameters:
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
```
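For anyone following along: a config like this runs with mergekit's `mergekit-yaml` CLI (`mergekit-yaml config.yaml ./output-model`). Conceptually, TIES operates on task vectors (fine-tuned weights minus base weights): it trims each vector to its highest-magnitude entries according to `density`, elects a per-parameter sign, and averages only the entries that agree with the elected sign. Here is a minimal per-tensor sketch of that arithmetic (a hypothetical helper, not mergekit's actual implementation):

```python
import torch

def ties_merge_tensor(base: torch.Tensor, tuned: list[torch.Tensor],
                      weights: list[float], density: float = 1.0) -> torch.Tensor:
    """Sketch of TIES on one weight tensor: trim, elect sign, disjoint mean."""
    taus = []
    for t, w in zip(tuned, weights):
        tau = (t - base) * w                       # weighted task vector
        if density < 1.0:                          # trim: keep top-density fraction
            k = max(1, int(tau.numel() * density))
            cutoff = tau.abs().flatten().kthvalue(tau.numel() - k + 1).values
            tau = torch.where(tau.abs() >= cutoff, tau, torch.zeros_like(tau))
        taus.append(tau)
    stacked = torch.stack(taus)
    elected = stacked.sum(dim=0).sign()            # per-parameter sign election
    agree = (stacked.sign() == elected).float()    # entries matching elected sign
    merged = (stacked * agree).sum(0) / agree.sum(0).clamp(min=1.0)
    return base + merged                           # add merged task vector to base
```

Note that with both model entries pointing at the same checkpoint, `weight: 1`, and `density: 1`, the trim step is a no-op and the sign election is trivially unanimous, so the arithmetic above should essentially reapply SuperNova-Lite's full task vector to the base (up to dtype and masking effects), which makes the score gap all the more puzzling.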

I'm toying with that too; Jsarnecki spoke to me about it as well. For the very specific tasks it was originally trained on, it seems to follow instructions a bit less, that's the feedback I got from outside. Otherwise it seems to make the model more solid (imo) and can even make the writing better (though more average/flowery). I haven't run enough tries to be sure about anything yet.

Merging with a previous checkpoint is a wonderful regularization technique. But more research is definitely needed on how/when/why to merge and why exactly it can work as well as it does.
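A simple way to picture checkpoint merging as regularization is plain linear interpolation between a fine-tuned checkpoint and its base, the idea behind model-soup / WiSE-FT style weight averaging (a related but simpler technique than TIES). A minimal sketch over hypothetical state dicts, not any particular library's API:

```python
import torch

def interpolate_checkpoints(base_sd: dict, tuned_sd: dict,
                            alpha: float = 0.5) -> dict:
    """Blend a fine-tuned checkpoint back toward its base model.

    alpha=1.0 keeps the fine-tuned weights unchanged; smaller values
    pull the model toward the base, trading some task fit for robustness.
    """
    return {
        name: torch.lerp(base_sd[name].float(), t.float(), alpha)
        if t.is_floating_point() else t            # leave int buffers untouched
        for name, t in tuned_sd.items()
    }
```

For example, `interpolate_checkpoints(base.state_dict(), tuned.state_dict(), alpha=0.7)` keeps 70% of the fine-tune's task vector.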
