superhot-30b-8k-4bit-128g-safetensors

Merged base LLaMA and LoRA with this: https://github.com/tloen/alpaca-lora Base LLaMA 30B: https://huggingface.co/huggyllama/llama-30b SuperCOT 30B 8k LoRA: https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test

BASE_MODEL=huggyllama_llama-30b LORA=kaiokendev_superhot-30b-8k-no-rlhf-test python export_hf_checkpoint.py

Quantized with AutoGPTQ: https://github.com/PanQiWei/AutoGPTQ

python quant_with_alpaca.py --pretrained_model_dir superhot-30b-8k-safetensors --quantized_model_dir superhot-30b-8k-4bit-128g-safetensors --bits 4 --group_size 128 --desc_act --num_samples 256 --save_and_reload