Reduce allocations during readNativeFrames, leading to ~15-20% performance improvement #157
This change reduces the number of allocations in `readNativeFrames`. Instead of allocating a slice for each pixel's samples, a single flat buffer slice of size `pixelsPerFrame*samplesPerPixel` is allocated upfront. Later, the larger 2D slice refers to ranges within that buffer. As a result there are only two calls to `make`, which leads to significant performance gains.

On my machine running `make bench-diff`:

We see similar results in the GitHub action benchmark.
Presumably the percentage performance gains would be even higher for DICOMs with more Native PixelData (e.g. more frames, more pixels per frame, or more samples per pixel).
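For reviewers who want to see the allocation pattern in isolation, here is a minimal sketch of the idea described above. It is not the code from this PR: the function names `allocPerPixel` and `allocFlat` and the `int` element type are hypothetical, chosen only for illustration. The only names taken from the description are `pixelsPerFrame` and `samplesPerPixel`.

```go
package main

import "fmt"

// allocPerPixel is a hypothetical stand-in for the old pattern:
// one make call for the outer slice plus one per pixel.
func allocPerPixel(pixelsPerFrame, samplesPerPixel int) [][]int {
	data := make([][]int, pixelsPerFrame)
	for i := range data {
		data[i] = make([]int, samplesPerPixel)
	}
	return data
}

// allocFlat is a hypothetical stand-in for the new pattern:
// exactly two make calls, regardless of how many pixels there are.
func allocFlat(pixelsPerFrame, samplesPerPixel int) [][]int {
	// One flat buffer holding every sample in the frame.
	buf := make([]int, pixelsPerFrame*samplesPerPixel)
	// The 2D view; each row is a window into buf, not a new allocation.
	data := make([][]int, pixelsPerFrame)
	for i := range data {
		lo, hi := i*samplesPerPixel, (i+1)*samplesPerPixel
		// The three-index slice caps each row's capacity at hi, so an
		// append on one row cannot overwrite the next row's samples.
		data[i] = buf[lo:hi:hi]
	}
	return data
}

func main() {
	data := allocFlat(4, 3)
	data[1][2] = 7
	fmt.Println(len(data), len(data[0])) // prints "4 3"
	fmt.Println(data[1])                 // prints "[0 0 7]"
}
```

Callers index both versions the same way (`data[pixel][sample]`), so the change stays internal to the allocation step. Capping the row capacity is a design choice in this sketch and may or may not match what the PR does.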
This helps address #161.