Description
Required Info | |
---|---|
Camera Model | D400 |
Firmware Version | N/A |
Operating System & Version | Linux, Windows |
Kernel Version (Linux Only) | All |
Platform | All |
SDK Version | 2.54.2 |
Language | C and C++ |
Segment |
Issue Description
The RealSense SDK uses `cudaDeviceSynchronize` to synchronize GPU operations in the color conversion functions and the alignment filter. The issue with `cudaDeviceSynchronize` is that it waits for all operations on all streams to complete: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#explicit-synchronization. From my understanding of the code, the RealSense SDK does not need to wait on all streams, only on the one on which the filtering operations are executing. Please correct me if I am wrong. 🙂
If the user's application runs CUDA code concurrently in separate CUDA streams, the `cudaDeviceSynchronize` call will also wait for those operations to finish. A solution would be to either place the RealSense SDK's CUDA operations on a separate stream, or call `cudaStreamSynchronize` with an argument of 0 to synchronize only the default stream, which the RealSense SDK uses. Either solution would stop SDK CUDA operations from blocking until other streams complete. The latter is simpler to implement and would not change the stream users expect the RealSense SDK to use.
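To illustrate the second option, here is a minimal standalone sketch. It is not code from the RealSense SDK. The kernels `scale_kernel` and `busy_kernel` and the function `run_demo` are hypothetical stand-ins for SDK work on the default stream and user work on a separate stream. It assumes the user stream is created with `cudaStreamNonBlocking`, because the legacy default stream otherwise synchronizes implicitly with blocking streams, and it assumes the default (legacy) default-stream compile mode.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Simplified error handling for a sketch: report and bail out (leaks on failure).
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "%s failed: %s\n", #call,                \
                         cudaGetErrorString(err_));                       \
            return 1;                                                     \
        }                                                                 \
    } while (0)

// Hypothetical stand-in for SDK work (e.g. a conversion step) on the default stream.
__global__ void scale_kernel(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical stand-in for long-running user work on a separate stream.
__global__ void busy_kernel(float* data, int n, int iterations)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iterations; ++k) v = v * 1.0001f + 0.0001f;
        data[i] = v;
    }
}

// Returns 0 if the default-stream result is correct after cudaStreamSynchronize(0).
int run_demo()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> host(n, 1.0f);

    float* sdk_buf = nullptr;
    float* user_buf = nullptr;
    CUDA_CHECK(cudaMalloc(&sdk_buf, bytes));
    CUDA_CHECK(cudaMalloc(&user_buf, bytes));
    CUDA_CHECK(cudaMemcpy(sdk_buf, host.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(user_buf, host.data(), bytes, cudaMemcpyHostToDevice));

    // Assumption: the user's stream is non-blocking with respect to the default stream.
    cudaStream_t user_stream;
    CUDA_CHECK(cudaStreamCreateWithFlags(&user_stream, cudaStreamNonBlocking));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // User work: one small block that runs for a long time.
    busy_kernel<<<1, threads, 0, user_stream>>>(user_buf, threads, 1 << 22);
    CUDA_CHECK(cudaGetLastError());

    // "SDK" work on the default stream (stream 0).
    scale_kernel<<<blocks, threads>>>(sdk_buf, n, 2.0f);
    CUDA_CHECK(cudaGetLastError());

    // Waits only for work queued on the default stream, unlike
    // cudaDeviceSynchronize(), which would also wait for busy_kernel.
    CUDA_CHECK(cudaStreamSynchronize(0));

    // Informational only: the user stream may or may not still be running here.
    cudaError_t user_state = cudaStreamQuery(user_stream);
    std::printf("user stream after default-stream sync: %s\n",
                user_state == cudaErrorNotReady ? "still running" : "finished");

    CUDA_CHECK(cudaMemcpy(host.data(), sdk_buf, bytes, cudaMemcpyDeviceToHost));
    int status = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2.0f) { status = 1; break; }
    }

    CUDA_CHECK(cudaStreamSynchronize(user_stream));
    CUDA_CHECK(cudaStreamDestroy(user_stream));
    CUDA_CHECK(cudaFree(sdk_buf));
    CUDA_CHECK(cudaFree(user_buf));
    return status;
}
```

Built with `nvcc` and called from a `main`, `run_demo()` checks only that the default-stream result is ready after `cudaStreamSynchronize(0)`. Whether the user stream is still running at that point depends on the GPU and on timing, so the printed status is not guaranteed either way.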
I am happy to help contribute the changes if the RealSense team is interested. I searched for similar issues and could not find any.