I have a weak 4 GB GPU, but it looks like this UI is almost enough to generate large images (A1111's UI can't even start processing something like this).
After the KSampler finishes 100% of a 1920×1080 image, I get the following error messages:
[The latest (today's) test version of this UI]
--normalvram.
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
!!! Exception during processing !!!
...
CUDA out of memory. Tried to allocate 1.98 GiB. GPU 0 has a total capacty of 4.00 GiB of which 0 bytes is free. Of the allocated memory 2.79 GiB is allocated by PyTorch, and 556.89 MiB is reserved by PyTorch but unallocated.
...
CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacty of 4.00 GiB of which 0 bytes is free. Of the allocated memory 3.11 GiB is allocated by PyTorch, and 236.83 MiB is reserved by PyTorch but unallocated.
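The "retrying with tiled VAE decoding" line in the log above refers to decoding the latent in smaller tiles so that only one tile's activations live on the GPU at a time. As a minimal sketch of the tiling idea only (a pure-NumPy stand-in with a placeholder `decode_fn`, not ComfyUI's actual implementation, which also blends the overlap regions):

```python
import numpy as np

def decode_tiled(latent, decode_fn, tile=64, overlap=16):
    """Decode a 2-D latent in overlapping tiles to bound peak memory.

    decode_fn is assumed to upscale a latent tile by 8x, as a VAE
    decoder would. Simplified sketch: overlap regions are overwritten
    rather than blended.
    """
    scale = 8
    h, w = latent.shape
    out = np.zeros((h * scale, w * scale), dtype=np.float32)
    step = tile - overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            tile_lat = latent[y:y + tile, x:x + tile]
            dec = decode_fn(tile_lat)  # only this tile is decoded at once
            oy, ox = y * scale, x * scale
            out[oy:oy + dec.shape[0], ox:ox + dec.shape[1]] = dec
    return out
```

The trade-off is extra compute on the overlapping borders in exchange for a much smaller peak allocation, which is why the fallback can succeed where a whole-image decode runs out of memory.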
--lowvram (queue with 3 elements)
100%|██████████████████████████████████████████████████████████████████████████████████| 22/22 [07:05<00:00, 19.35s/it]
!!! Exception during processing !!!
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\nodes.py", line 241, in decode
return (vae.decode(samples["samples"]), )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\comfy\sd.py", line 626, in decode
pixel_samples[x:x+batch_number] = torch.clamp((self.first_stage_model.decode(samples) + 1.0) / 2.0, min=0.0, max=1.0).cpu().float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\comfy\ldm\models\autoencoder.py", line 94, in decode
dec = self.decoder(z)
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 734, in forward
h = nonlinearity(h)
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 40, in nonlinearity
return x*torch.sigmoid(x)
^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Prompt executed in 448.38 seconds
Exception in thread Thread-1 (prompt_worker):
Traceback (most recent call last):
File "threading.py", line 1038, in _bootstrap_inner
File "threading.py", line 975, in run
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\main.py", line 88, in prompt_worker
comfy.model_management.soft_empty_cache()
File "D:\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\comfy\model_management.py", line 554, in soft_empty_cache
torch.cuda.empty_cache()
File "D:\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\torch\cuda\memory.py", line 164, in empty_cache
torch._C._cuda_emptyCache()
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
About --normalvram mode: I don't know exactly how the image is processed in the VAE decoder, but if some memory could be freed after the KSampler finishes, everyone could generate larger images than usual.
The failure seems to have different causes in normal and low-memory modes, so I can't say anything about --lowvram mode.
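The traceback shows that ComfyUI already calls `torch.cuda.empty_cache()` via `comfy.model_management.soft_empty_cache()`. A minimal sketch of that memory-clearing idea (an illustrative helper, not ComfyUI's actual code) looks like this:

```python
import gc

# Guard the import so the sketch also runs without PyTorch installed.
try:
    import torch
    _HAVE_TORCH = True
except ImportError:
    _HAVE_TORCH = False

def free_gpu_memory():
    """Drop unreachable Python tensors, then return cached CUDA blocks.

    gc.collect() releases tensors that are no longer referenced;
    empty_cache() then hands the freed caching-allocator blocks back
    to the driver so the next stage (e.g. VAE decode) can use them.
    Returns the number of objects the garbage collector reclaimed.
    """
    collected = gc.collect()
    if _HAVE_TORCH and torch.cuda.is_available():
        torch.cuda.empty_cache()
    return collected
```

Note that `empty_cache()` only returns memory that is cached but unallocated; it cannot free memory still held by live model weights, which is why the sampler's memory would also have to be released before the decode stage for this to help.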