amd/Meta-Llama-3.1-8B-Instruct-FP8-KV


luow-amd committed on Sep 9

Commit 3aa66e3, 1 Parent(s): 6a6e7f7

Update README.md

Files changed (1)

README.md +1 -1

@@ -6,7 +6,7 @@ license: llama3.1
   This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from Pile dataset.
 - ## Quantization Stragegy
   - ***Quantized Layers***：All linear layers excluding "lm_head"
-  - ***Weight***：FP8 symmetric per-tensor
+  - ***Weight***: FP8 symmetric per-tensor
   - ***Activation***: FP8 symmetric per-tensor
   - ***KV Cache***: FP8 symmetric  per-tensor
 - ## Quick Start
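
For readers unfamiliar with the "FP8 symmetric per-tensor" scheme named in the README lines above, here is a minimal sketch of the idea, not Quark's implementation. It assumes the FP8 format is E4M3 with a largest finite value of 448 (the README does not state which FP8 variant is used), and the helper names `per_tensor_scale`, `scale_and_clamp`, and `dequantize` are hypothetical. The final rounding of each scaled value to the nearest representable FP8 number is deliberately omitted.

```python
# Assumption: "FP8" means E4M3, whose largest finite magnitude is 448.
FP8_E4M3_MAX = 448.0


def per_tensor_scale(values):
    """Return a single scale shared by the whole tensor.

    "Symmetric" means there is no zero point: the range is centred on 0,
    so the scale depends only on the largest absolute value.
    """
    amax = max((abs(v) for v in values), default=0.0)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def scale_and_clamp(values, scale):
    """Divide by the scale and clamp into the assumed FP8 range.

    A real quantizer would then round each result to the FP8 grid;
    that step is left out of this sketch.
    """
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in values]


def dequantize(scaled, scale):
    """Map scaled values back to the original range."""
    return [q * scale for q in scaled]


weights = [0.5, -2.0, 1.25, 0.0]      # illustrative values only
scale = per_tensor_scale(weights)     # one scale for the entire tensor
scaled = scale_and_clamp(weights, scale)
restored = dequantize(scaled, scale)
```

Per-tensor scaling stores only one scale per weight, activation, or KV-cache tensor, which keeps metadata small at the cost of being sensitive to outliers in that tensor.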