Fix the app
So every time I try to generate a video, it takes 1.7 seconds and I get nothing. Can you please fix this?
Hi,
@RefreshedCyberJohn
It seems that the app was not working properly due to the CUDA OOM error. I've just restarted the app, so I think it should start working again in about 30 minutes. Thanks.
You're welcome. ;)
It's still doing it, now it says 2.0 or so.
Hmm, I'm not sure what's going on, but we are getting this error:
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/gradio/routes.py", line 247, in run_predict
    output = await app.blocks.process_api(
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/gradio/blocks.py", line 640, in process_api
    predictions, duration = await self.call_function(fn_index, processed_input)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/gradio/blocks.py", line 555, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/user/app/model.py", line 1241, in run_with_translation
    frames = self.run(text, seed, only_first_stage,image_prompt)
  File "/home/user/app/model.py", line 1178, in run
    set_random_seed(seed)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/SwissArmyTransformer/arguments.py", line 429, in set_random_seed
    torch.manual_seed(seed)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/random.py", line 40, in manual_seed
    torch.cuda.manual_seed_all(seed)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/cuda/random.py", line 113, in manual_seed_all
    _lazy_call(cb, seed_all=True)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/cuda/__init__.py", line 156, in _lazy_call
    callable()
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/cuda/random.py", line 111, in cb
    default_generator.manual_seed(seed)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
I'll reboot the Space, but the same error could occur again.
@chris-rannou @akhaliq Any idea why this error occurs and how to fix it?
Okay, it's working now.
UPDATE: Now when I try to render, it says 0/193.0 and immediately stops after like a second. Fix it again please.
I think there are some glitches in HF Spaces now. Some other Spaces are not working properly either.
It's doing it again
Thanks. I restarted the Space. It will be up again in about 30 minutes.
It's still doing it
Hmm, factory reboot doesn't seem to be working.
Man...
A factory reboot is ongoing on the Space. Did you try setting the environment variable CUDA_LAUNCH_BLOCKING=1 to get more details about the error?
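For reference, here is a minimal sketch of one way to set that variable from Python. It assumes the variable has to be in place before CUDA is initialized, so it is set at the very top of the entry script, before torch is imported. The helper name cuda_launch_blocking_enabled is made up for illustration and is not part of this Space's code.

```python
import os

# Assumption: the variable must be set before CUDA is initialized, so the
# safe place is the top of the entry script, before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def cuda_launch_blocking_enabled() -> bool:
    """Return True if synchronous kernel launches were requested via the env var."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"


try:
    import torch  # imported only after the variable is set
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None

if torch is not None and torch.cuda.is_available():
    print("CUDA available, blocking launches:", cuda_launch_blocking_enabled())
else:
    print("No CUDA device visible, blocking launches:", cuda_launch_blocking_enabled())
```

With blocking launches enabled, the traceback should point at the call that actually triggered the device-side assert instead of a later, unrelated API call such as the seeding call in the log above. It slows execution down, so it is probably best used only while debugging.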
@chris-rannou
Thanks! It's working now.
Did you try setting the environment variable CUDA_LAUNCH_BLOCKING=1 to get more details about the error?
Ah, sorry. No, I haven't tried it. I know the log said to do it, but when I pressed the "Restart this Space" button or "Factory reboot this Space" button, the build ended unexpectedly fast and the log from last time was still showing. So I thought the Space was not actually rebooting and decided to ask in the forum. But I should have checked it first just in case.
After a successful factory reboot, the Space seems to be working now.
@hysts You were right: there was an issue with rebooting due to this Space's specific resource assignment.
I see. Thanks!
Okay, I'll try it.
Success! It's working.
Now it needs to be fixed again. It stops quickly after I press "generate".
Hi,
@RefreshedCyberJohn
Sorry for the late reply. I've been busy and away from HF Hub for a while and I just noticed your message. I factory-rebooted the Space and it seems to be working properly now.
:O Broken again! Oct 12. It stops after 5 seconds and shows just a cam. Nothing happens nearly 2 hours later either.
Hi,
@BladedSupernova
Thanks for reporting this. I factory-rebooted this Space just now. So I think it will be back up in 30-40 minutes.
(Sorry, but I'm not feeling very well today, so I'm not going to wait to see if the Space will be restarted successfully, but I think it will be fine. I'll check it again tomorrow.)
Looks like it's working properly now.