---
license: apache-2.0
---

# The Quantized LLaMA 3.1 70B Instruct Model

Original Base Model: `meta-llama/Meta-Llama-3.1-70B`.
Link: [https://huggingface.co/meta-llama/Meta-Llama-3.1-70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B)

## Special Notice

Please note that this model was quantized with `group_size=1024`, which makes it somewhat smaller than the standard release.
For the standard `group_size=128` model, see `shuyuej/Meta-Llama-3.1-70B-GPTQ`: [https://huggingface.co/shuyuej/Meta-Llama-3.1-70B-GPTQ](https://huggingface.co/shuyuej/Meta-Llama-3.1-70B-GPTQ).

## Quantization Configurations

```
"quantization_config": {
    "bits": 4,
    "checkpoint_format": "gptq",
    "damp_percent": 0.01,
    "desc_act": true,
    "group_size": 1024,
    "model_file_base_name": null,
    "model_name_or_path": null,
    "quant_method": "gptq",
    "static_groups": false,
    "sym": true,
    "true_sequential": true
},
```

## Source Codes

Source code: [https://github.com/vkola-lab/medpodgpt/tree/main/quantization](https://github.com/vkola-lab/medpodgpt/tree/main/quantization).
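
As a rough illustration of why `group_size=1024` yields a smaller model than `group_size=128`, the sketch below estimates the storage cost per weight for group-wise quantization. It is a back-of-the-envelope approximation, not the exact on-disk layout of this checkpoint: the helper `effective_bits_per_weight` is hypothetical, and it assumes each group stores one 16-bit scale and one zero point of `bits` bits, ignoring any other metadata.

```python
def effective_bits_per_weight(bits: int, group_size: int, scale_bits: int = 16) -> float:
    """Approximate storage cost per weight for group-wise quantization.

    Assumption (not taken from the model card): every group of
    `group_size` weights carries one scale of `scale_bits` bits and
    one zero point of `bits` bits. Other metadata is ignored.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    per_group_overhead = scale_bits + bits
    # Each weight pays its own `bits` plus an equal share of the group overhead.
    return bits + per_group_overhead / group_size


if __name__ == "__main__":
    # Compare the standard setting with the one used for this model.
    for g in (128, 1024):
        cost = effective_bits_per_weight(4, g)
        print(f"group_size={g}: about {cost:.3f} bits per weight")
```

Under these assumptions, the larger group size saves roughly 0.14 bits per weight, because the per-group scale and zero point are shared across eight times as many weights.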