cudaDeviceSynchronize used in SDK filters requires all CUDA streams to complete #12680

----------------------------------------------------------------------------------------------------
| Required Info               |                |
|-----------------------------|----------------|
| Camera Model                | D400           |
| Firmware Version            | N/A            |
| Operating System & Version  | Linux, Windows |
| Kernel Version (Linux Only) | All            |
| Platform                    | All            |
| SDK Version                 | 2.54.2         |
| Language                    | C and C++      |
| Segment                     |                |

### Issue Description

The RealSense SDK uses `cudaDeviceSynchronize` to synchronize GPU operations in the color conversion functions and the alignment filter. The problem is that `cudaDeviceSynchronize` waits for all operations on all streams to complete: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#explicit-synchronization. From my understanding of the code, the RealSense SDK doesn't need to wait for every stream, only for the one on which the filtering operations execute. Please correct me if I am wrong. 🙂

If the user's application runs CUDA code concurrently in separate CUDA streams, the `cudaDeviceSynchronize` call will also wait for those operations to finish. One solution is to place the RealSense SDK's CUDA operations on a separate stream; another is to call `cudaStreamSynchronize` with an argument of 0, which synchronizes only the default stream that the RealSense SDK uses. With either solution, SDK CUDA operations would no longer block until other streams complete. The latter is simpler to implement and would not change the stream users expect the RealSense SDK to use.
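To illustrate the difference, here is a minimal standalone sketch. It is not RealSense SDK code: `spin_kernel`, `scale_kernel`, `app_stream`, and `run_demo` are hypothetical names I made up, with `scale_kernel` standing in for SDK filter work on the default stream and `spin_kernel` for unrelated application work. One caveat I am aware of: the legacy default stream implicitly synchronizes with other blocking streams, so the benefit shows up when the application creates its streams with `cudaStreamNonBlocking`, as the sketch does.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Return 1 from the enclosing function if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical long-running application work: busy-wait for `cycles` clocks.
__global__ void spin_kernel(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical stand-in for SDK filter work on the default stream.
__global__ void scale_kernel(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int run_demo() {
    const int n = 1 << 16;
    float* d_data = nullptr;
    CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    // Non-blocking stream: it does not implicitly synchronize with stream 0.
    cudaStream_t app_stream;
    CHECK(cudaStreamCreateWithFlags(&app_stream, cudaStreamNonBlocking));

    // Application work on its own stream, "SDK" work on the default stream.
    spin_kernel<<<1, 1, 0, app_stream>>>(2000000000LL);
    scale_kernel<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);
    CHECK(cudaGetLastError());

    // Proposed change: wait only for the default stream.
    CHECK(cudaStreamSynchronize(0));
    cudaError_t app_state = cudaStreamQuery(app_stream);
    std::printf("after cudaStreamSynchronize(0): app stream %s\n",
                app_state == cudaErrorNotReady ? "still running" : "finished");

    // Current behavior: waits for every stream, including app_stream.
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaStreamQuery(app_stream));  // cudaSuccess once all work is done

    CHECK(cudaStreamDestroy(app_stream));
    CHECK(cudaFree(d_data));
    return 0;
}

int main() { return run_demo(); }
```

Built with `nvcc` using the default (legacy) stream mode, I would expect the first query to report the application stream as still running, although that depends on the GPU and on timing. Note that if the code is compiled with `--default-stream per-thread`, stream 0 refers to the per-thread default stream instead.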

I am happy to contribute the changes if the RealSense team is interested. I searched for similar issues and could not find any.
