VRAM Estimation
Hello,
I'd like to ask how much VRAM this model needs to run. I'm on a 3060 with 32 GB of system RAM, but that doesn't seem to be enough to complete a generation.
I suspect stable_audio_tools is the culprit, and that optimizing the final output step could fix it: diffusion sampling runs flawlessly, but it fails just before returning the final output. Has anybody else with similar hardware hit this? Did you get yours running?
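In case it helps anyone pin down the actual requirement, PyTorch's built-in peak-memory counter can report it. A minimal sketch to wrap around the README example (only the torch calls here are standard; the generation call itself is elided):

```python
import torch

# clear the peak-memory counter before generating
torch.cuda.reset_peak_memory_stats()

# ... run the README generation here (generate_diffusion_cond and the decode) ...

# report the high-water mark of VRAM allocated by this process
peak_gib = torch.cuda.max_memory_allocated() / 2**30
print(f"Peak VRAM allocated: {peak_gib:.2f} GiB")
```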
For reference, here are my logs, from the example provided in README.md:
```
Traceback (most recent call last):
  File "/home/personontheinternet/Development/stableaudioUI/main.py", line 24, in <module>
    output = generate_diffusion_cond(
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/inference/generation.py", line 247, in generate_diffusion_cond
    sampled = model.pretransform.decode(sampled)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/pretransforms.py", line 70, in decode
    decoded = self.model.decode_audio(z, chunked=self.chunked, iterate_batch=self.iterate_batch, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/autoencoders.py", line 513, in decode_audio
    return self.decode(latents, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/autoencoders.py", line 334, in decode
    decoded.append(self.decoder(latents[i:i+1]))
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/autoencoders.py", line 191, in forward
    return self.layers(x)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/autoencoders.py", line 114, in forward
    return self.layers(x)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/autoencoders.py", line 60, in forward
    x = self.layers(x)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/blocks.py", line 337, in forward
    x = snake_beta(x, alpha, beta)
  File "/home/personontheinternet/Documents/miniconda3/envs/exllama/lib/python3.10/site-packages/stable_audio_tools/models/blocks.py", line 302, in snake_beta
    return x + (1.0 / (beta + 0.000000001)) * pow(torch.sin(x * alpha), 2)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB. GPU
```
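Reading the trace, the OOM happens inside model.pretransform.decode, and decode_audio already accepts a chunked flag (visible in the call above). Forcing chunked decoding might lower the peak; an untested sketch, I haven't confirmed this autoencoder actually supports it:

```python
# the trace shows pretransform.decode forwards chunked=self.chunked into
# decode_audio; if the autoencoder supports it, this should decode the
# latents in windows rather than one pass, lowering peak VRAM (untested)
model.pretransform.chunked = True

# then run the README example unchanged:
# output = generate_diffusion_cond(model, steps=100, conditioning=conditioning, ...)
```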
Thank you for releasing this model. I honestly didn't expect it to see the light of day. I'm glad to have been proven wrong :)
It works for me with 32 GB of RAM and a 12 GB GPU (3080 Ti).
I've got an 8 GB 1070 Ti; this doesn't appear to be happening for me.
My total VRAM usage is 12.2 GB, including the Windows display.
I got it working on an NVIDIA 1060 3 GB in ComfyUI :D
Is it super slow for anyone else? I wonder if there are ways to speed it up. GPU tested: A10G, 24 GB VRAM.
Same config here (3060 with 12 GB VRAM and 32 GB RAM): it first uses about 6 GB of VRAM, then at the end jumps to 12 GB of VRAM plus 2 GB of system RAM. It slows down at that point and the CPU takes over, but it does produce sound in the end.
Run it in half-precision.
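A minimal sketch based on the README example (the checkpoint name is the one from the README; adjust if yours differs, and note that casting weights to fp16 can affect output quality):

```python
import torch
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

device = "cuda"

# checkpoint name as in the README example
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")

# cast the weights to fp16 to roughly halve VRAM use; if you hit dtype
# errors, keep the model in fp32 and wrap generation in torch.autocast instead
model = model.to(device).half()

conditioning = [{
    "prompt": "128 BPM tech house drum loop",
    "seconds_start": 0,
    "seconds_total": 30,
}]

output = generate_diffusion_cond(
    model,
    steps=100,
    cfg_scale=7,
    conditioning=conditioning,
    sample_size=model_config["sample_size"],
    device=device,
)
```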