New `needle_map.CompactMap()` implementation for reduced memory usage #6842

proton-lisandro-pin · 2025-06-05T14:45:07Z

What problem are we solving?

How are we solving the problem?

This PR introduces a new CompactMap() implementation for the in-memory map of needle indices, optimized for memory usage.

The idea is to replace the bucket+overflow approach with a map of sorted index segments, which are in turn accessed through binary search. This gives a best-case runtime of O(1) (ordered inserts/updates) and a worst-case runtime of O(log n); notably, memory usage is unaffected by index ordering.
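To illustrate the idea, here is a minimal sketch of a map of sorted segments with binary-search lookups. It is not the code from this PR: the type names, the segment size, and the field widths are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical names and sizes throughout; this is not the SeaweedFS code.
const segmentSize = 1 << 16 // keys covered by one segment (assumed value)

// entry stores only the low 16 bits of the key; the segment ID supplies the rest.
type entry struct {
	key    uint16
	offset uint32
	size   uint32
}

type segment struct {
	entries []entry // always kept sorted by key
}

type SegmentedMap struct {
	segments map[uint64]*segment
}

func NewSegmentedMap() *SegmentedMap {
	return &SegmentedMap{segments: make(map[uint64]*segment)}
}

// Set inserts or updates a key. Keys arriving in ascending order take the
// append fast path; other keys are placed via binary search.
func (m *SegmentedMap) Set(key uint64, offset, size uint32) {
	id, k := key/segmentSize, uint16(key%segmentSize)
	s := m.segments[id]
	if s == nil {
		s = &segment{}
		m.segments[id] = s
	}
	n := len(s.entries)
	if n == 0 || s.entries[n-1].key < k {
		s.entries = append(s.entries, entry{key: k, offset: offset, size: size})
		return
	}
	i := sort.Search(n, func(i int) bool { return s.entries[i].key >= k })
	if i < n && s.entries[i].key == k {
		s.entries[i].offset, s.entries[i].size = offset, size // update in place
		return
	}
	// Out-of-order insert: shift the tail of this one segment to make room.
	s.entries = append(s.entries, entry{})
	copy(s.entries[i+1:], s.entries[i:])
	s.entries[i] = entry{key: k, offset: offset, size: size}
}

// Get looks a key up with a binary search inside its segment.
func (m *SegmentedMap) Get(key uint64) (offset, size uint32, ok bool) {
	s := m.segments[key/segmentSize]
	if s == nil {
		return 0, 0, false
	}
	k := uint16(key % segmentSize)
	i := sort.Search(len(s.entries), func(i int) bool { return s.entries[i].key >= k })
	if i < len(s.entries) && s.entries[i].key == k {
		return s.entries[i].offset, s.entries[i].size, true
	}
	return 0, 0, false
}

func main() {
	m := NewSegmentedMap()
	m.Set(10, 100, 1)
	m.Set(20, 200, 2)
	m.Set(15, 150, 3) // out of order: placed via binary search
	off, size, ok := m.Get(15)
	fmt.Println(off, size, ok) // 150 3 true
}
```

In this sketch each entry takes the same space regardless of insertion order, which is one way to read the claim that memory usage is unaffected by index ordering. An out-of-order insert also shifts entries within its segment, a cost bounded by the segment size.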

Note that even at O(log n), the clock time for both reads and writes is very low, with `weed benchmark` actually running slightly faster than with the existing implementation.

The end result is a 20-30% improvement in memory usage compared with v3.89. See #6804 for more details and benchmark results.

How is the PR tested?

Every single existing unit and integration test should pass without issues; the existing API and behavior for CompactMap() is unchanged.

Checks

I have added unit tests if possible.
I will add related wiki document changes and link to this PR after merging.

chrislusf · 2025-06-05T15:04:56Z

Could you keep the old compact map implementation and tests in separate files, just for reference and easier comparison.

proton-lisandro-pin · 2025-06-05T15:17:29Z

> Could you keep the old compact map implementation and tests in separate files, just for reference and easier comparison.

As in, compact_map_old.go and compact_map_old_test.go?

chrislusf · 2025-06-05T16:13:32Z

> As in, compact_map_old.go and compact_map_old_test.go?

That would work. Or name it as CompactMapWithOverflow, since old will change over time.

Also, the test can have a benchmark to compare the two implementations.

proton-lisandro-pin · 2025-06-05T17:11:57Z

> > As in, compact_map_old.go and compact_map_old_test.go?

> That would work. Or name it as CompactMapWithOverflow, since old will change over time.

Done!

> Also, the test can have a benchmark to compare the two implementations.

I've uploaded benchmark results on #6804 (comment).

chrislusf · 2025-06-05T17:28:25Z

> > Also, the test can have a benchmark to compare the two implementations.

> I've uploaded benchmark results on #6804 (comment).

I was thinking about https://pkg.go.dev/testing#hdr-Benchmarks
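For readers unfamiliar with the linked package, the following is a rough sketch of how two implementations could be compared with Go's `testing` benchmarks. The `needleIndex` interface and both stand-in implementations (a built-in map and a sorted slice) are hypothetical and are not the CompactMap types. `testing.Benchmark` is used so the example runs as a standalone program; in a repository the same body would normally live in a `BenchmarkXxx(b *testing.B)` function in a `_test.go` file and be run with `go test -bench`.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// needleIndex is a hypothetical interface standing in for the two
// implementations under comparison.
type needleIndex interface {
	Set(key uint64, offset uint32)
	Get(key uint64) (uint32, bool)
}

// Stand-in #1: Go's built-in map.
type builtinMap map[uint64]uint32

func (m builtinMap) Set(k uint64, v uint32)      { m[k] = v }
func (m builtinMap) Get(k uint64) (uint32, bool) { v, ok := m[k]; return v, ok }

// Stand-in #2: parallel sorted slices searched with binary search.
type sortedSlice struct {
	keys []uint64
	vals []uint32
}

func (s *sortedSlice) find(k uint64) int {
	return sort.Search(len(s.keys), func(i int) bool { return s.keys[i] >= k })
}

func (s *sortedSlice) Set(k uint64, v uint32) {
	i := s.find(k)
	if i < len(s.keys) && s.keys[i] == k {
		s.vals[i] = v
		return
	}
	s.keys = append(s.keys, 0)
	copy(s.keys[i+1:], s.keys[i:])
	s.keys[i] = k
	s.vals = append(s.vals, 0)
	copy(s.vals[i+1:], s.vals[i:])
	s.vals[i] = v
}

func (s *sortedSlice) Get(k uint64) (uint32, bool) {
	i := s.find(k)
	if i < len(s.keys) && s.keys[i] == k {
		return s.vals[i], true
	}
	return 0, false
}

const numKeys = 1 << 16 // bounds memory no matter how large b.N grows

// benchSetGet returns a benchmark body that writes and reads back one key
// per iteration, cycling through numKeys ascending keys.
func benchSetGet(newIndex func() needleIndex) func(b *testing.B) {
	return func(b *testing.B) {
		b.ReportAllocs()
		idx := newIndex()
		for i := 0; i < b.N; i++ {
			k := uint64(i % numKeys)
			idx.Set(k, uint32(i))
			if _, ok := idx.Get(k); !ok {
				b.Fatal("key missing after Set")
			}
		}
	}
}

func main() {
	testing.Init() // registers testing flags when not running under "go test"
	r1 := testing.Benchmark(benchSetGet(func() needleIndex { return builtinMap{} }))
	r2 := testing.Benchmark(benchSetGet(func() needleIndex { return &sortedSlice{} }))
	fmt.Println("builtin map: ", r1.String(), r1.MemString())
	fmt.Println("sorted slice:", r2.String(), r2.MemString())
}
```

The numbers printed depend on the machine and say nothing about the actual CompactMap implementations; only the harness shape is the point here.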

chrislusf · 2025-06-05T17:32:37Z

> > > As in, compact_map_old.go and compact_map_old_test.go?

> > That would work. Or name it as CompactMapWithOverflow, since old will change over time.

> Done!

I prefer no go build tags for this. You can put it in a sub directory if you don't want to deal with naming conflicts.

proton-lisandro-pin · 2025-06-05T17:51:28Z

> I prefer no go build tags for this. You can put it in a sub directory if you don't want to deal with naming conflicts.

Done. Note that I still had to tweak a couple of lines, as compact_map.go shares a lot of code with the rest of weed/storage/needle_map 😞

proton-lisandro-pin added 4 commits June 3, 2025 15:49

Rework needle_map.CompactMap() to maximize memory efficiency.

1cc58ff

Use a memory-efficient structure for CompactMap needle value entries.

4eaed7f

This slightly complicates the code, but makes a **massive** difference in memory efficiency - preliminary results show a ~30% reduction in heap usage, with no measurable performance impact otherwise.

Clean up type for CompactMap chunk IDs.

ba4b91b

Add a small comment description for CompactMap().

0855625

Add the old version of CompactMap() for comparison purposes.

6309691

proton-lisandro-pin force-pushed the new_needle_map branch from c4cb9af to 6309691 Compare June 5, 2025 17:51

chrislusf merged commit bed0a64 into seaweedfs:master Jun 5, 2025
7 of 13 checks passed

proton-lisandro-pin deleted the new_needle_map branch June 6, 2025 10:10

proton-lisandro-pin mentioned this pull request Jun 11, 2025

Volume server OOMs on large number of concurrent writes #6804

Closed
