---
license: llama3.1
language:
- en
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---
# Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit
- Model creator: [Meta-Llama](https://huggingface.co/meta-llama)
- Original model: [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
 
 
## Overview
This repository contains a 4-bit AWQ quantization of **Meta-Llama-3.1-8B-Instruct**, prepared for the LMDeploy TurboMind engine.
Storing the weights in 4 bits instead of 16 reduces GPU memory use and memory bandwidth during inference, with the aim of keeping accuracy close to the base model.
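
To make the memory saving concrete, here is a back-of-the-envelope sketch in shell arithmetic. It assumes a round 8 billion parameters and treats every weight as quantized. Both are simplifications: quantized checkpoints typically keep some tensors (such as embeddings) in higher precision and also store per-group scales and zero points, so the real file is somewhat larger than this estimate.

```bash
# Assumption: a round 8e9 parameters, used only for a rough estimate.
PARAMS=8000000000

# FP16 uses 2 bytes per weight; 4-bit uses half a byte per weight.
FP16_BYTES=$((PARAMS * 2))
INT4_BYTES=$((PARAMS / 2))

# Report in decimal gigabytes (1 GB = 10^9 bytes).
GB=1000000000
echo "FP16 weights:  ~$((FP16_BYTES / GB)) GB"
echo "4-bit weights: ~$((INT4_BYTES / GB)) GB"
```

The roughly 4x difference applies to the weights only. The KV cache and activations need additional memory at inference time.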

## Model Details
- **Model Name**: Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit
- **Base Model**: meta-llama/Meta-Llama-3.1-8B-Instruct
- **Quantization**: 4-bit AWQ
- **Engine**: LMDeploy TurboMind


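## Quantization
The command below lists the `lmdeploy lite auto_awq` settings for this 4-bit AWQ quantization. `$HF_MODEL` is the path or Hub ID of the base model, and `$WORK_DIR` is the directory that receives the quantized weights.

```bash
# HF_MODEL: base model path or Hub ID; WORK_DIR: output directory.
lmdeploy lite auto_awq \
  "$HF_MODEL" \
  --calib-dataset 'ptb' \
  --calib-samples 128 \
  --calib-seqlen 2048 \
  --w-bits 4 \
  --w-group-size 128 \
  --batch-size 10 \
  --search-scale True \
  --work-dir "$WORK_DIR"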
```
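
Once a quantized directory exists, it can be loaded by the TurboMind backend. The sketch below wraps two LMDeploy CLI entry points in helper functions. The function names `serve_awq` and `chat_awq` and the default `WORK_DIR` path are hypothetical, and the available flags may differ between LMDeploy versions, so check `lmdeploy serve api_server --help` for your installation.

```bash
# Hypothetical default; point WORK_DIR at the quantized model directory.
WORK_DIR="${WORK_DIR:-./Meta-Llama-3.1-8B-Instruct-TurboMind-AWQ-4bit}"

# Start an OpenAI-compatible API server on the TurboMind backend.
serve_awq() {
  lmdeploy serve api_server "$1" --backend turbomind --model-format awq
}

# Open an interactive chat session in the terminal instead.
chat_awq() {
  lmdeploy chat "$1" --model-format awq
}

# Only report readiness here; nothing is launched until a helper is called.
if command -v lmdeploy >/dev/null 2>&1; then
  echo "lmdeploy found; run: serve_awq \"$WORK_DIR\""
else
  echo "lmdeploy not found; install it first (pip install lmdeploy)"
fi
```

Calling `serve_awq "$WORK_DIR"` starts the server, and `chat_awq "$WORK_DIR"` opens a local chat session. The `--model-format awq` flag tells LMDeploy that the weights are AWQ-quantized.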