---
base_model: https://huggingface.co/Phind/Phind-CodeLlama-34B-v2
inference: false
license: llama2
model_creator: https://huggingface.co/Phind
model_name: Phind-Codellama-34B-v2
model_type: llama
quantized_by: latimar
---

# Phind-CodeLlama-34B-v2 EXL2

Weights of [Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2) converted to [EXL2](https://github.com/turboderp/exllamav2#exl2-quantization) format.

Each quant is stored in its own branch, as in TheBloke's GPTQ repos.

```
export BRANCH=5_0-bpw-h8
git clone --single-branch --branch ${BRANCH} https://huggingface.co/latimar/Phind-Codellama-34B-v2-exl2
```

The following branches are available:

```
5_0-bpw-h8
4_625-bpw-h6
4_125-bpw-h6
3_8-bpw-h6
2_75-bpw-h6
2_55-bpw-h6
```

* Calibration dataset used for conversion: [wikitext-2-v1, test split](https://huggingface.co/datasets/wikitext/blob/refs%2Fconvert%2Fparquet/wikitext-2-v1/test/0000.parquet)
* Evaluation dataset used to calculate perplexity: [wikitext-2-v1, validation split](https://huggingface.co/datasets/wikitext/blob/refs%2Fconvert%2Fparquet/wikitext-2-v1/validation/0000.parquet)
* Calibration dataset used for conversion of `5_0-bpw-h8-ev`: [WizardLM_evol_instruct_70k](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_70k/blob/refs%2Fconvert%2Fparquet/default/train/0000.parquet)
* Evaluation dataset used to calculate PPL on `Evol-Ins`: [nickrosh/Evol-Instruct-Code-80k-v1](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1/blob/refs%2Fconvert%2Fparquet/default/train/0000.parquet)
* PPL max seq.
length used: 1792 (2048 with 5.0-bpw-h8 causes OOM on an RTX 4090 when evaluating PPL, so it had to be lowered slightly)

| BPW         | PPL on Wiki | PPL on Evol-Ins | File Size (GB) |
| ----------- | ----------- | --------------- | -------------- |
| 2.55-h6     | 15.0901     |                 | 10.56          |
| 2.75-h6     | 13.6153     |                 | 11.33          |
| 3.8-h6      | 6.8803      |                 | 15.37          |
| 4.125-h6    | 6.8095      |                 | 16.65          |
| 4.625-h6    | 6.7992      | 2.0499          | 18.58          |
| 5.0-h8      | 6.7785      | 2.0448          | 20.09          |
| 5.0-h8-ev   | 6.9376      | 2.0430          | 20.09          |
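
The file sizes above can help choose which branch to clone for a given size budget. The following is a minimal sketch, not part of this repo: `pick_branch` is a hypothetical helper that returns the largest listed branch whose file size fits a budget in GB. It uses only the branch names and sizes reported above, and it assumes the budget covers the weights alone; the cache and activations need additional memory at inference time.

```shell
# Hypothetical helper (not part of the repo): print the largest branch whose
# file size in GB, as reported in the table above, fits within the budget.
# Memory for the cache and activations is NOT accounted for.
pick_branch() {
  awk -v budget="$1" '
    # Keep the biggest size that is still within the budget.
    $2 + 0 <= budget + 0 && $2 + 0 > best { best = $2 + 0; name = $1 }
    END { if (name != "") print name; else exit 1 }
  ' <<'EOF'
2_55-bpw-h6 10.56
2_75-bpw-h6 11.33
3_8-bpw-h6 15.37
4_125-bpw-h6 16.65
4_625-bpw-h6 18.58
5_0-bpw-h8 20.09
EOF
}

# Example: choose a branch for a 17 GB budget.
BRANCH="$(pick_branch 17)" && echo "${BRANCH}"
```

With a 17 GB budget this prints `4_125-bpw-h6`, which can then be passed to the `git clone --branch` command shown earlier. The function returns a non-zero status when no branch fits.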