---
license: llama3.1
language:
- en
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---

# Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit

- Model creator: [Meta-Llama](https://huggingface.co/meta-llama)
- Original model: [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)

## Overview

This repository contains a 4-bit AWQ quantization of **Meta-Llama-3.1-8B-Instruct**, optimized for the LMDeploy TurboMind engine. The quantized model reduces memory and compute requirements while aiming to preserve the accuracy of the original model.

## Model Details

- **Model Name**: Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit
- **Base Model**: meta-llama/Meta-Llama-3.1-8B-Instruct
- **Quantization**: 4-bit AWQ
- **Engine**: LMDeploy TurboMind

## Quantization Command

The model was quantized with the following LMDeploy command:

```bash
lmdeploy lite auto_awq \
    $HF_MODEL \
    --calib-dataset 'ptb' \
    --calib-samples 128 \
    --calib-seqlen 2048 \
    --w-bits 4 \
    --w-group-size 128 \
    --batch-size 10 \
    --search-scale True \
    --work-dir $WORK_DIR
```
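A checkpoint quantized this way can be loaded for inference through LMDeploy's Python `pipeline` API with the TurboMind backend. The sketch below is illustrative, not a tested configuration: it assumes `lmdeploy` is installed, and the model path and prompt are placeholders you should replace with your own.

```python
# Sketch: running the 4-bit AWQ checkpoint with the TurboMind backend.
# Assumes `pip install lmdeploy`; "path/to/work_dir" is a placeholder for
# the output directory of the `lmdeploy lite auto_awq` command above.
from lmdeploy import pipeline, TurbomindEngineConfig

engine_config = TurbomindEngineConfig(
    model_format="awq",  # tell TurboMind the weights are AWQ-quantized
    session_len=2048,    # matches the calibration sequence length used above
)

pipe = pipeline("path/to/work_dir", backend_config=engine_config)
responses = pipe(["Explain AWQ quantization in one sentence."])
print(responses[0].text)
```

Setting `model_format="awq"` is what tells the engine to dequantize the 4-bit weights at load time; without it, TurboMind would try to interpret the checkpoint as full-precision weights.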