tags: | |
- fp8 | |
- vllm | |
Run with `vllm==0.6.2` on 4xH100: | |
``` | |
vllm serve neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4 | |
``` |
tags: | |
- fp8 | |
- vllm | |
Run with `vllm==0.6.2` on 4xH100: | |
``` | |
vllm serve neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4 | |
``` |