stan-hua/Llama-3.1-8B-Instruct-LC-RTN-W4A16-KV4


DEFAULT_stage:
  DEFAULT_modifiers:
    QuantizationModifier:
      ignore: [lm_head]
      targets: Linear
      scheme: W4A16
      kv_cache_scheme: {num_bits: 4, type: int, symmetric: true, strategy: tensor, dynamic: false}
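
The `kv_cache_scheme` fields above describe signed 4-bit integers, a symmetric range (zero point fixed at 0), and a single scale shared by the whole tensor. The sketch below illustrates that idea in plain Python. It is not the quantization library's implementation: the function names are hypothetical, and the scale convention (`max_abs / 7`) is an assumption chosen for simplicity. With `dynamic: false`, the scale would normally come from calibration rather than from the tensor being quantized, which this toy example does not model.

```python
def quantize_symmetric_int4(values):
    """Per-tensor symmetric 4-bit quantization (illustrative convention only)."""
    qmin, qmax = -8, 7  # range of a signed 4-bit integer
    max_abs = max((abs(v) for v in values), default=0.0)
    # strategy: tensor -> one scale for all values; symmetric -> no zero point.
    # Assumption: scale maps the largest magnitude onto qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


values = [0.7, -0.35, 0.1, 0.0]
codes, scale = quantize_symmetric_int4(values)
restored = dequantize(codes, scale)
```

Each restored value differs from its original by at most half a quantization step (`scale / 2`), which is the rounding error that a coarser bit width trades for a smaller cache.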