# superhot-30b-8k-4bit-128g-safetensors

Merged the base LLaMA weights with the LoRA using alpaca-lora: https://github.com/tloen/alpaca-lora

Base LLaMA 30B: https://huggingface.co/huggyllama/llama-30b

SuperHOT 30B 8k LoRA: https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test

``` sh
BASE_MODEL=huggyllama_llama-30b LORA=kaiokendev_superhot-30b-8k-no-rlhf-test python export_hf_checkpoint.py
```
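
Conceptually, merging folds each LoRA low-rank update into the matching base weight matrix: `W_merged = W + (alpha / r) * (B @ A)`. The sketch below illustrates that arithmetic on a toy matrix in plain Python. It is not the alpaca-lora script, and the function names and example values are hypothetical, chosen only for illustration.

```python
# Toy LoRA merge: W_merged = W + (alpha / r) * (B @ A).
# Matrices are plain lists of rows; real checkpoints use tensors.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_a, lora_b, alpha):
    """Fold a rank-r LoRA update into a base weight matrix.

    weight: (out x in) base matrix
    lora_a: (r x in) down-projection
    lora_b: (out x r) up-projection
    alpha:  LoRA scaling numerator; the update is scaled by alpha / r
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out x in) low-rank update
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 2x2 example with rank 1 (illustrative values only).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # r = 1, in = 2
lora_b = [[0.5], [-0.5]]   # out = 2, r = 1
merged = merge_lora(base, lora_a, lora_b, alpha=2.0)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the result can be saved and quantized as an ordinary checkpoint.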

Quantized to 4 bits with group size 128 using AutoGPTQ: https://github.com/PanQiWei/AutoGPTQ

``` sh
python quant_with_alpaca.py --pretrained_model_dir superhot-30b-8k-safetensors --quantized_model_dir superhot-30b-8k-4bit-128g-safetensors --bits 4 --group_size 128 --desc_act --num_samples 256 --save_and_reload
```
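
To make `--bits 4 --group_size 128` concrete, here is a minimal sketch of group-wise 4-bit quantization in plain Python. It uses simple round-to-nearest with a per-group scale and minimum, which is an assumption made for illustration. GPTQ itself goes further and uses calibration samples (hence `--num_samples`) to compensate for rounding error, which this sketch omits. All function names here are hypothetical and are not part of AutoGPTQ.

```python
import math


def quantize_group(values, bits=4):
    """Round-to-nearest quantization of one group of weights.

    Returns (codes, scale, lo); each code is an int in [0, 2**bits - 1].
    """
    levels = 2 ** bits - 1             # 15 steps for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 when the group is constant
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [lo + c * scale for c in codes]


def quantize_row(row, bits=4, group_size=128):
    """Quantize a weight row in consecutive groups of `group_size` values."""
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


# Hypothetical 256-value row, so it splits into two groups of 128.
row = [math.sin(i / 10.0) for i in range(256)]
groups = quantize_row(row)
restored = [x for codes, scale, lo in groups for x in dequantize_group(codes, scale, lo)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(len(groups), max_err)
```

Each group keeps its own scale and minimum, so a smaller group size tracks local weight ranges more closely at the cost of storing more per-group metadata.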