neuralmagic
/

Llama-3.2-90B-Vision-Instruct-FP8-dynamic

Text Generation

compressed-tensors

Model card Files Files and versions Community

Llama-3.2-90B-Vision-Instruct-FP8-dynamic / README.md

mgoin's picture

Create README.md

21a28bf verified about 2 months ago

|

193 Bytes

	---
	tags:
	- fp8
	- vllm
	---

	Run with `vllm==0.6.2` on 4xH100:
	```
	vllm serve neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4
	```