mgoin's picture
Create README.md
21a28bf verified
|
raw
history blame
No virus
193 Bytes
metadata
tags:
  - fp8
  - vllm

Run with vllm==0.6.2 on 4xH100:

vllm serve neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4