How much VRAM does this need to run locally?
#2 opened by XeIaso
It should need at least 30 GB of VRAM, right?
An RTX 3070 Ti laptop GPU with 8 GB is not enough; this command fails:

```
python -m omni_speech.serve.model_worker \
    --host 0.0.0.0 \
    --controller http://localhost:10000 \
    --port 40000 \
    --worker http://localhost:40000 \
    --model-path Llama-3.1-8B-Omni \
    --model-name Llama-3.1-8B-Omni \
    --s2s
```
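For a rough sense of why 8 GB falls short, here is a back-of-envelope estimate in Python. The LLM backbone is 8B parameters, so fp16 weights alone are about 15 GiB; the per-component allowances for the speech encoder, vocoder, KV cache, and CUDA overhead are assumptions for illustration, not figures from the model card:

```python
# Back-of-envelope VRAM estimate for Llama-3.1-8B-Omni.
# Assumed: fp16/bf16 weights (2 bytes per parameter); the speech-stack,
# KV-cache, and overhead numbers below are rough guesses, not measured.

GiB = 1024 ** 3

llm_params = 8.0e9                                  # 8B-parameter LLM backbone
bytes_per_param = 2                                 # fp16 / bf16
llm_weights = llm_params * bytes_per_param / GiB    # ~14.9 GiB

speech_stack = 2.0    # speech encoder + vocoder, rough guess (GiB)
kv_cache = 2.0        # short-context KV cache, rough guess (GiB)
overhead = 1.5        # CUDA context, activations, fragmentation (GiB)

total = llm_weights + speech_stack + kv_cache + overhead
print(f"estimated VRAM: ~{total:.1f} GiB")          # prints ~20.4 GiB
```

Under these assumptions the fp16 weights alone exceed 8 GB, so the model cannot fit on that card regardless of the serving flags; 4-bit quantization would shrink the LLM weights to roughly 5 GiB, though it is not clear the released serving scripts support that.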