safetensors
Why did you delete the safetensors version? It makes it substantially less convenient to clone now.
I moved it to its own branch because it uses group_size + desc_act and therefore either produces gibberish or runs extremely slowly for the majority of users. Also, having two models in one repo doesn't work well now that there's quantize_config.json for AutoGPTQ.
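To illustrate the quantize_config.json point: each branch holds a single such file, so it can describe only one set of quantisation parameters. Here is a rough sketch, assuming the AutoGPTQ field names bits, group_size and desc_act. The values are inferred from the repo name and the description above, not copied from the actual files, and desc_act being false on the main branch is an assumption.

```python
import json

# Assumed values: 4 bits and group size 128 are inferred from the repo name
# "GPTQ-4bit-128g"; they are not read from the real quantize_config.json.
main_cfg = {"bits": 4, "group_size": 128, "desc_act": False}     # assumption for the main branch
actorder_cfg = {"bits": 4, "group_size": 128, "desc_act": True}  # branch that uses desc_act


def config_text(cfg: dict) -> str:
    """Serialise one set of quantisation parameters as JSON text."""
    return json.dumps(cfg, indent=2)


# A single quantize_config.json can carry only one of these, hence one model per branch.
print(config_text(actorder_cfg))
```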
It's not gone, it's just in a separate branch. Just as easy to clone as it was before.
If using git, then do:
git clone -b actorder https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g
If using text-generation-webui to download in the UI, type:
TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g:actorder
And if using text-generation-webui's download-model.py on the command line, do:
python download-model.py TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g --branch actorder
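For anyone scripting the download in Python rather than using the tools above, here is a minimal sketch. It assumes the third-party huggingface_hub package is installed and passes the branch name through the revision parameter of its snapshot_download function. The helper names parse_model_spec, git_clone_command and download_branch are hypothetical, made up for this example.

```python
def parse_model_spec(spec: str, default_branch: str = "main") -> tuple:
    """Split an 'owner/repo:branch' string into (repo_id, branch)."""
    repo_id, sep, branch = spec.partition(":")
    return repo_id, (branch if sep and branch else default_branch)


def git_clone_command(repo_id: str, branch: str) -> list:
    """Build the equivalent 'git clone -b <branch>' command as an argument list."""
    return ["git", "clone", "-b", branch, f"https://huggingface.co/{repo_id}"]


def download_branch(spec: str) -> str:
    """Download one branch of a model repo and return the local folder path."""
    # Imported here so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    repo_id, branch = parse_model_spec(spec)
    return snapshot_download(repo_id=repo_id, revision=branch)


if __name__ == "__main__":
    # Needs network access and downloads the full model files.
    print(download_branch("TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g:actorder"))
```

The "repo:branch" string mirrors the form typed into the text-generation-webui download box, so the same value can be reused in a script.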
Ah right, apologies, I missed that. I didn't even bother checking the other branches on HF. FWIW, on exllama the perf difference is minimal.
Anyway, thanks for all your work! We're Patreons :)
No worries - I really should have mentioned all that in the README! I've been planning to revamp these older repos but I've not had time yet. So I've just been tidying up issues like this as people raise them but haven't dealt with them properly yet.
And yeah, fair enough, exllama is amazing. Now that I've learned quite how good it is, I'm planning to start adding more GPTQ choices, like having an act_order + desc_act model for every GPTQ repo.
Out of interest, what is it that you like about Vicuna 1.1 compared to all the models that have come out since? Wizard-Vicuna-Uncensored, WizardLM-Uncensored, Nous-Hermes, etc? I'd rather thought that it had been superseded by now.
Thanks very much for your support!