---
license: llama3.1
language:
- en
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---
# Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit
- Model creator: [Meta-Llama](https://huggingface.co/meta-llama)
- Original model: [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
## Overview
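This repository contains a 4-bit AWQ-quantized version of **Meta-Llama-3.1-8B-Instruct**, optimized for the LMDeploy TurboMind inference engine.
Quantizing the weights to 4 bits cuts their memory footprint to roughly a quarter of the FP16 size, which lowers GPU memory requirements. AWQ (activation-aware weight quantization) is designed to limit the accuracy loss that comes with this compression.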
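
As a rough illustration of the reduced requirements, the sketch below estimates weight storage for FP16 and for 4-bit weights with a group size of 128. The helper `weight_bytes` is hypothetical, and the numbers are assumptions: a round 8 billion parameters, every weight quantized, and a 16-bit scale plus a 4-bit zero point stored per group. Real checkpoints keep some tensors in higher precision, so treat the result as an estimate, not a measurement.

```bash
# Hypothetical helper: weight_bytes PARAMS BITS [GROUP_SIZE]
# Prints the approximate bytes needed to store PARAMS weights at BITS bits each.
# With GROUP_SIZE set, adds an assumed 20 bits of metadata per group
# (16-bit scale + 4-bit zero point).
weight_bytes() {
  local params=$1 bits=$2 group=${3:-0}
  if [ "$group" -gt 0 ]; then
    echo $(( params / group * (group * bits + 20) / 8 ))
  else
    echo $(( params * bits / 8 ))
  fi
}

PARAMS=8000000000   # assumed round figure for an "8B" model

echo "FP16 weights:  $(weight_bytes "$PARAMS" 16) bytes"      # 16000000000
echo "4-bit weights: $(weight_bytes "$PARAMS" 4 128) bytes"   # 4156250000
```

Under these assumptions the quantized weights take about 4.2 GB instead of 16 GB. The KV cache and activations are not included in either figure.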
## Model Details
- **Model Name**: Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit
- **Base Model**: meta-llama/Meta-Llama-3.1-8B-Instruct
- **Quantization**: 4-bit AWQ
- **Engine**: LMDeploy TurboMind
## Quantization
The following LMDeploy command performs the 4-bit AWQ quantization:
```bash
# HF_MODEL points to the source model; WORK_DIR is where the quantized weights are written.
lmdeploy lite auto_awq \
  "$HF_MODEL" \
  --calib-dataset 'ptb' \
  --calib-samples 128 \
  --calib-seqlen 2048 \
  --w-bits 4 \
  --w-group-size 128 \
  --batch-size 10 \
  --search-scale True \
  --work-dir "$WORK_DIR"
```
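
Once the quantized weights exist, they can be served with TurboMind. The sketch below only prints the serving command, so it can be inspected before anything is launched. The helper `serve_cmd`, the default directory name, and the port are assumptions made for illustration; the `--backend`, `--model-format`, and `--server-port` options follow the LMDeploy CLI as documented at the time of writing and should be checked against your installed version.

```bash
# Hypothetical helper: serve_cmd MODEL_DIR [PORT]
# Prints (does not run) an LMDeploy command that serves an AWQ checkpoint with TurboMind.
serve_cmd() {
  local model_dir=$1 port=${2:-23333}
  printf 'lmdeploy serve api_server %s --backend turbomind --model-format awq --server-port %s\n' \
    "$model_dir" "$port"
}

# Assumed output directory; replace with the --work-dir used during quantization.
WORK_DIR=${WORK_DIR:-./llama-3.1-8b-instruct-awq}

serve_cmd "$WORK_DIR"
```

Run the printed command in a shell with LMDeploy installed to start an OpenAI-compatible API server on the chosen port.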