Optimal 4090 settings? System has 96GB of ram (I9-13900k)

#35 by cleverest - opened

What are the best settings for these specs? I've noticed the chat only gives a few lines and then always stops early... how do I fix this?

The 4090 has 24GB of VRAM, which should be plenty. If it is always stopping early, have you passed max_new_tokens? What are your settings?
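In case it helps to rule out the UI, here is a minimal sketch of the same generation through the transformers API directly, with an explicit max_new_tokens. The checkpoint name, prompt, and sampling values are placeholders, not confirmed settings from this thread:

```python
import torch
import transformers

# Placeholder checkpoint; substitute the MPT variant you are actually running.
name = "mosaicml/mpt-7b-chat"

# MPT ships custom modeling code, so trust_remote_code=True is required.
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.float16,  # fp16 weights for a 7B model fit easily in the 4090's 24GB
    trust_remote_code=True,
)
model.to("cuda")
model.eval()

tokenizer = transformers.AutoTokenizer.from_pretrained(name)

inputs = tokenizer("Tell me a short story about a lighthouse.", return_tensors="pt").to("cuda")
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=1024,  # if this is left at a small default, output gets cut off early
        temperature=0.7,
        do_sample=True,
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If generations still stop early with a large max_new_tokens, the model is emitting its end-of-sequence token, which is a different problem from running out of token budget.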

cleverest changed discussion title from Optimal 4090 settings? System has 96GB of ram (I9-13700k) to Optimal 4090 settings? System has 96GB of ram (I9-13900k)

I'm just using the defaults in Oobabooga (0.7 temp). 200 max tokens is the default; changing it to 400 doesn't fix it, the output is still clipped off early. :-( What other settings do you want to see?

Hey, I have the same specs, but the kernel crashes every time I try to load the model. Can you tell me what changes you made to run the model on a 4090 with 24 GB of VRAM?
I set max_new_tokens to 1024 and loaded the model in float16.

I just kept the defaults and it runs. I use the settings he posted for the config file to make it load without error. Make sure groupsize is none IN THE CONFIG, not the UI (it lies). I also have 96 GB RAM... you?
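For anyone looking for that file: in text-generation-webui, per-model overrides of this kind typically live in a YAML config (models/config-user.yaml in 2023-era versions). The filename, key pattern, and field names below are assumptions about that format, not something confirmed in this thread:

```yaml
# Sketch of a per-model override for text-generation-webui.
# The filename (models/config-user.yaml) and field names (wbits, groupsize) are
# assumptions based on 2023-era versions of the webui; check your install.
.*mpt-7b.*:
  wbits: 4        # 4-bit GPTQ weights
  groupsize: None # keep this None in the config, regardless of what the UI displays
```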

> I also have 96 GB RAM... you?

Oh, I have 24 GB of RAM, maybe that's why. I was able to run it on an A100 with 40 GB.
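A plausible explanation: by default, from_pretrained materializes the checkpoint weights in system RAM before they ever reach the GPU, and fp32 weights for a 7B model are roughly 26 GB, which alone would exhaust 24 GB of RAM and crash the kernel. Here is a sketch of loading flags that reduce the host-RAM footprint (both are standard transformers options; the checkpoint name is a placeholder):

```python
import torch
import transformers

model = transformers.AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-chat",   # placeholder; substitute your MPT checkpoint
    torch_dtype=torch.float16,
    trust_remote_code=True,
    low_cpu_mem_usage=True,   # builds the model on meta tensors instead of a full extra copy in RAM
)
model.to("cuda")
```

device_map="auto" is the other common way to cut host-RAM pressure, but some early MPT code revisions rejected it, so low_cpu_mem_usage plus an explicit .to("cuda") is the safer sketch here. Note that low_cpu_mem_usage may require the accelerate package to be installed.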

sam-mosaic changed discussion status to closed
