
CUDA Error: Invalid Argument when annotating large images #18

@faberno

Description

Hey,

I ran into the same problem described in this issue in the Slicer Plugin repo. Unlike there, however, I am not running nnInteractive in a Docker container, but locally.

The problem
For large images, the following error is thrown when annotating the image.

File ~/PycharmProjects/napari-nninteractive/nnInteractive/nnInteractive/inference/inference_session.py:140, in nnInteractiveInferenceSession._initialize_interactions(self=<nnInteractive.inference.inference_session.nnInteractiveInferenceSession object>, image_torch=tensor(...))
    138     print(f'Initialize interactions. Pinned: {self.use_pinned_memory}')
    139 # Create the interaction tensor based on the target shape.
--> 140 self.interactions = torch.zeros(
    141     (7, *image_torch.shape[1:]),
    142     device='cpu',
    143     dtype=torch.float16,
    144     pin_memory=(self.device.type == 'cuda' and self.use_pinned_memory)
    145 )

Local variables at the failing frame (full tensor printout elided):
    self.interactions = None
    image_torch = tensor([[[[ 0.5962,  0.3944,  0.4676,  ..., -0.3240, -0.3240, -0.3240]]]], dtype=torch.float16)
    self.use_pinned_memory = True
    self.device = device(type='cuda', index=0)

AcceleratorError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I tried many different image sizes, and the problem only occurs for images larger than around 600MB.
But it cannot be a matter of too little memory: I'm using a 5090 with 32GB of VRAM and my system has 120GB of RAM. It also ran without any problems on our other machines with far less RAM and VRAM.
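A back-of-the-envelope estimate (my own, based only on the traceback above, and assuming the quoted ~600MB refers to the fp16 image tensor): the interaction tensor is allocated as 7 channels at the image's spatial shape in fp16, so it is 7× the image size. At ~600MB per image that pinned allocation crosses the 4GiB (2**32 byte) mark, which may be relevant to the "invalid argument" error:

```python
def interactions_bytes(image_bytes_fp16: int) -> int:
    """Size of torch.zeros((7, *image.shape[1:]), dtype=torch.float16):
    7 channels, each the same size as the fp16 image itself."""
    return 7 * image_bytes_fp16

image_mb = 600  # the rough size at which the error starts to appear
alloc = interactions_bytes(image_mb * 1024**2)
print(f"pinned allocation: {alloc / 1024**3:.1f} GiB")   # ~4.1 GiB
print(f"exceeds 4 GiB (2**32 bytes): {alloc > 2**32}")   # True
```

If that arithmetic holds, the failure threshold would line up with the pinned host allocation exceeding 4GiB rather than with total RAM or VRAM.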

Do you guys have any idea what the problem might be? Could it be the CUDA version (you recommend 12.6, while I need 12.8 for the 5090)?

Environment Information
Operating System: Ubuntu 24.04
CUDA Version: 12.8
Python Version: 3.12
GPU: 5090
Memory: 120GB
