
Phind-CodeLlama-34B-v2 EXL2

Weights of Phind-CodeLlama-34B-v2 converted to EXL2 format.

Each quant is in a separate branch, like in TheBloke's GPTQ repos.

export BRANCH=5_0-bpw-h8
git clone --single-branch --branch ${BRANCH} https://huggingface.co/latimar/Phind-Codellama-34B-v2-exl2
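
Alternatively, a single quant can be fetched without git. This is a sketch, not part of the original instructions: it assumes a recent huggingface_hub that provides the download command, and the local directory name is a placeholder of your choice:

huggingface-cli download latimar/Phind-Codellama-34B-v2-exl2 --revision 5_0-bpw-h8 --local-dir Phind-Codellama-34B-v2-exl2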

The following branches are available:

5_0-bpw-h8
4_625-bpw-h6
4_125-bpw-h6
2_75-bpw-h6
2_55-bpw-h6
  • Calibration dataset used for conversion: wikitext-v2 (a conversion sketch follows this list)
  • Evaluation dataset used to calculate perplexity: wikitext-v2
  • PPL max seq. length used: 1792 (evaluating the 5.0-bpw-h8 quant at 2048 causes OOM on an RTX 4090, so the length had to be reduced slightly)
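
For reference, a quant like 5_0-bpw-h8 is produced with exllamav2's convert.py along these lines. This is a minimal sketch: the input/output paths and the wikitext parquet filename are placeholders, and the flags should be checked against the script's --help:

# -i: source FP16 model, -o: scratch dir, -cf: output dir for the finished quant,
# -c: calibration dataset, -b: target bits per weight, -hb: head layer bits
python convert.py \
    -i /models/Phind-CodeLlama-34B-v2 \
    -o /tmp/exl2-work \
    -cf /models/Phind-Codellama-34B-v2-5_0-bpw-h8 \
    -c wikitext-v2.parquet \
    -b 5.0 \
    -hb 8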

| BPW      | Perplexity | File Size (GB) |
|----------|------------|----------------|
| 2.55-h6  | 15.0901    | 10.56          |
| 2.75-h6  | 13.6153    | 11.33          |
| 4.125-h6 | 6.8095     | 16.65          |
| 4.625-h6 | 6.7992     | 18.58          |
| 5.0-h8   | 6.7785     | 20.09          |
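
The numbers above can be reproduced with exllamav2's test_inference.py, roughly as follows. This is a sketch: the model and parquet paths are placeholders, and the eval flags should be verified against the script's --help:

# -m: quantized model dir, -ed: evaluation dataset, -el: eval sequence length (1792, as noted above)
python test_inference.py \
    -m /models/Phind-Codellama-34B-v2-exl2 \
    -ed wikitext-v2.parquet \
    -el 1792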