Rewrite needle_map.CompactMap() for more efficient memory usage #6813
What problem are we solving?
#6804
How are we solving the problem?
`CompactMap()` relies on fixed-size storage buckets of 10,000 elements, plus a separate storage for overflow (i.e. out-of-order) needle IDs. The problem is that most of that fixed bucket space ends up allocated but never used; this is particularly true for out-of-order needle IDs, but it also happens with perfectly ordered needle writes.
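For intuition, here is a minimal, hedged sketch of the kind of layout described above. It is not the actual SeaweedFS code, and all names (`fixedBucket`, `sketchMap`, etc.) are hypothetical; the point is only that every bucket reserves its full 10,000-entry capacity up front, regardless of how many entries it actually holds:

```go
package main

import "fmt"

const bucketSize = 10000 // each bucket reserves room for 10,000 entries

type entry struct {
	needleID uint64
	offset   uint32
	size     uint32
}

// fixedBucket allocates its full capacity up front, even if only a handful
// of entries are ever stored in it.
type fixedBucket struct {
	entries [bucketSize]entry // reserved in full, used only partially
	count   int
}

// sketchMap mirrors the idea of fixed buckets plus a separate overflow store
// for out-of-order needle IDs.
type sketchMap struct {
	buckets  []*fixedBucket
	overflow []entry
}

func main() {
	m := &sketchMap{}
	m.buckets = append(m.buckets, &fixedBucket{})

	// One in-order write: the bucket holds a single entry, yet the whole
	// 10,000-entry array is already allocated.
	m.buckets[0].entries[0] = entry{needleID: 1, offset: 8, size: 512}
	m.buckets[0].count = 1

	// An out-of-order write would land in the overflow slice instead.
	m.overflow = append(m.overflow, entry{needleID: 999999, offset: 16, size: 256})

	fmt.Printf("bucket capacity: %d, entries used: %d, overflow entries: %d\n",
		bucketSize, m.buckets[0].count, len(m.overflow))
}
```

With writes spread across many needle-ID ranges, each new range pays for a full bucket, which is where the unused-but-allocated memory described above comes from.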
This MR reworks `CompactMap()` so that:

- `copy()` is used instead of ad-hoc loops (see the sketch below).
- `if` and `for` sections are unrolled whenever possible.

The result is a dramatic improvement in memory usage, at the cost of slightly increased memory fragmentation, whose impact will be entirely platform-dependent.
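On the `copy()` point above, the snippet below is a hedged, generic illustration (not the actual SeaweedFS code) of replacing an ad-hoc element-shifting loop with Go's built-in `copy()` when inserting into a sorted slice:

```go
package main

import "fmt"

type entry struct {
	needleID uint64
	offset   uint32
	size     uint32
}

// insertWithLoop shifts the tail one element at a time with an ad-hoc loop.
func insertWithLoop(s []entry, i int, e entry) []entry {
	s = append(s, entry{})
	for j := len(s) - 1; j > i; j-- {
		s[j] = s[j-1]
	}
	s[i] = e
	return s
}

// insertWithCopy moves the whole tail in a single copy() call.
func insertWithCopy(s []entry, i int, e entry) []entry {
	s = append(s, entry{})
	copy(s[i+1:], s[i:])
	s[i] = e
	return s
}

func main() {
	s := []entry{{needleID: 1}, {needleID: 3}}
	s = insertWithCopy(s, 1, entry{needleID: 2})
	fmt.Println(s) // entries now ordered 1, 2, 3
}
```

Both functions produce the same result; the `copy()` variant is shorter and lets the runtime move the slice tail in one call rather than element by element.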
#6804 has more details, but this MR improves memory usage of `weed` processes by ~95% in best-case scenarios and ~99% in worst-case scenarios, with no measurable performance impact.

How is the PR tested?
Every existing unit and integration test should pass without issues; the existing API and behavior of `CompactMap()` are unchanged.