What hardware do I need for reasonable performance?

by TS0001 - opened Jun 5, 2023

Jun 5, 2023

Awesome work, @TheBloke! Thank you.

I have this running on runpod.io with Text Generation UI, on an A100 with 80 GB VRAM, 125 GB RAM, and 16 vCPUs. Performance is quite slow. I'm wondering whether anyone has it running with reasonable performance and, if so, on what hardware?

Thanks!

mancub

Jun 5, 2023

I think the issue is AutoGPTQ, which is slow, but I don't know enough about it; I'm only going by what I've read from other people.

I get ~2 t/s on my 3090 with this model, which I consider reasonable for the setup (WSL2). :)

prudant

Jun 30, 2023

@mancub how much VRAM does your 3090 have? Thanks

prudant

Jun 30, 2023

@TS0001 how many tokens/sec do you get on the A100? Thanks

viktor-ferenczi

Jul 2, 2023

What is the fastest way to run this model on GPU?
