Aaron2599 committed on
Commit
468d10e
1 Parent(s): cb65b46

Create README.md

  1. README.md +20 -0
README.md ADDED
@@ -0,0 +1,20 @@
+ ---
+ base_model:
+ - meta-llama/Meta-Llama-3.1-8B-Instruct
+ ---
+
+
+ # A 4-bit AWQ version of meta-llama/Meta-Llama-3.1-8B-Instruct for the LMDeploy TurboMind engine
+
+ ```
+ lmdeploy lite auto_awq \
+   $HF_MODEL \
+   --calib-dataset 'ptb' \
+   --calib-samples 128 \
+   --calib-seqlen 2048 \
+   --w-bits 4 \
+   --w-group-size 128 \
+   --batch-size 10 \
+   --search-scale True \
+   --work-dir $WORK_DIR
+ ```
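
The README command relies on `$HF_MODEL` and `$WORK_DIR` without defining them. A minimal sketch of how they might be set is shown here, assuming `HF_MODEL` is the base model named in the front matter and `WORK_DIR` is an output directory; the directory name and the `RUN` switch are hypothetical and not part of the commit. By default the script only prints the command so the flags can be checked before starting a long quantization run.

```shell
#!/bin/sh
# Assumption: the model to quantize is the base_model from the README front matter.
HF_MODEL="meta-llama/Meta-Llama-3.1-8B-Instruct"
# Hypothetical output directory for the quantized weights.
WORK_DIR="./llama-3.1-8b-instruct-awq-4bit"

# Same flags as the README command, stored as positional parameters
# so that quoting is preserved when the command is executed.
set -- lmdeploy lite auto_awq "$HF_MODEL" \
  --calib-dataset ptb \
  --calib-samples 128 \
  --calib-seqlen 2048 \
  --w-bits 4 \
  --w-group-size 128 \
  --batch-size 10 \
  --search-scale True \
  --work-dir "$WORK_DIR"

# Print the command unless RUN=1 is set (RUN is a hypothetical switch).
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  echo "$*"
fi
```

Running the script with `RUN=1` would execute the quantization itself, which requires LMDeploy to be installed and access to the base model. The resulting directory can then presumably be loaded by the TurboMind engine with the AWQ model format, for example via `lmdeploy chat $WORK_DIR --model-format awq`.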