-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Open and load numeric payload indices in single stage #6999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0e739b6
to
8c41c1d
Compare
8c41c1d
to
accddb4
Compare
accddb4
to
a871747
Compare
// Load through mutable numeric index structure | ||
let Some(mutable) = | ||
MutableNumericIndex::<T>::open_rocksdb_db_wrapper(db_wrapper.clone(), false)? | ||
else { | ||
// Column family doesn't exist, cannot load | ||
return Ok(None); | ||
}; | ||
// TODO(payload-index-remove-load): remove load when single stage open/load is implemented | ||
mutable.load()?; | ||
|
||
let InMemoryNumericIndex { | ||
map, | ||
histogram, | ||
points_count, | ||
max_values_per_point, | ||
point_to_values, | ||
} = mutable.into_in_memory_index(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: we now load the index here, right when we open. The load function has effectively moved into here.
This comment was marked as resolved.
This comment was marked as resolved.
Missing refactor for MutableNullIndex. Since we have the `create_if_missing` flag, we no longer have `open_if_exists` function, ergo, signature for `new_null_index` should not include an `Option`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to see this get simplified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
lib/segment/src/index/field_index/index_selector.rs (1)
175-191
: Fix: Don’t drop a successfully created lookup index when range index is absent [duplicate of earlier review, still applicable]If
use_lookup
is true, butnumeric_new
yieldsNone
, the current earlyreturn Ok(None)
discards the lookup index. Compose both options and returnOk(None)
only when both are absent.Apply this diff:
let lookup = if use_lookup { Some(FieldIndex::IntMapIndex( self.map_new(field, create_if_missing)?, )) } else { None }; - let range = if use_range { - match self.numeric_new(field, create_if_missing)? { - Some(index) => Some(FieldIndex::IntIndex(index)), - None => return Ok(None), - } - } else { - None - }; - - Some(lookup.into_iter().chain(range).collect()) + let range = if use_range { + self.numeric_new(field, create_if_missing)? + .map(FieldIndex::IntIndex) + } else { + None + }; + + let mut indices: Vec<FieldIndex> = Vec::new(); + if let Some(l) = lookup { indices.push(l); } + if let Some(r) = range { indices.push(r); } + if indices.is_empty() { + return Ok(None); + } + return Ok(Some(indices));
🧹 Nitpick comments (1)
lib/segment/src/index/field_index/index_selector.rs (1)
234-263
: Optional: Factor out common Integer builder compositionBoth index construction and builder paths compose optional lookup/range components. Consider extracting a small helper to reduce duplication and keep behavior changes centralized (e.g., a function returning
Vec<FieldIndexBuilder>
orOption<Vec<FieldIndex>>>
based on flags).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
lib/segment/src/index/field_index/index_selector.rs
(9 hunks)lib/segment/src/index/struct_payload_index.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- lib/segment/src/index/struct_payload_index.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/field_index/index_selector.rs (5)
lib/segment/src/index/field_index/numeric_index/mod.rs (4)
index
(148-164)new_rocksdb
(181-195)new_rocksdb
(530-544)is_on_disk
(472-478)lib/segment/src/index/field_index/map_index/mod.rs (2)
new_rocksdb
(112-118)is_on_disk
(498-504)lib/segment/src/index/field_index/numeric_index/mmap_numeric_index.rs (2)
is_on_disk
(436-438)open
(162-199)lib/segment/src/index/field_index/field_index_base.rs (1)
is_on_disk
(432-446)lib/segment/src/index/field_index/null_index/mutable_null_index.rs (1)
open
(48-65)
🔇 Additional comments (5)
lib/segment/src/index/field_index/index_selector.rs (5)
83-85
: LGTM: Numeric index paths correctly propagate absence via OptionThe early-return with
Ok(self.numeric_new(...)? .map(...))
is clean and consistent with the new “open-and-load” semantics. This ensures we don’t synthesize indices when on-disk state is absent (or creation is disallowed).Also applies to: 101-103, 111-113, 194-197, 206-209
359-376
: LGTM: Numeric index constructor now returns Option and wires create_if_missingThe generic
numeric_new
correctly matches the backend constructors and propagatesOption
to reflect absent on-disk state. Passingcreate_if_missing
through for RocksDB and Gridstore is consistent with the new single-stage open+load behavior.Also applies to: 377-383
394-414
: LGTM: numeric_builder now fallible and consistent across backendsWrapping builder creation in
OperationResult
is appropriate (especially for RocksDB’s fallible builder). The signature and error propagation look good.
467-474
: LGTM: new_null_index always returns a NullIndexThis aligns with
MutableNullIndex::open
semantics (it returns a handle with empty storage when nothing exists andcreate_if_missing
is false). This change simplifies callers and is consistent with “index handle exists; may or may not be on disk yet.”
210-212
: No UUID variant mismatch: UuidMapIndex is the intended index.The
FieldIndexBuilder::UuidIndex
variant is actually aMapIndexBuilder<UuidIntType>
, and both the builder (map_builder
) and the loader (new_index
) consistently produce and loadFieldIndex::UuidMapIndex
. No change is required here.
* In struct payload index open, use option to specify loaded, set rebuild * Open and load mutable numeric index in single stage * Open and load immutable numeric index in single stage * Open and load mmap numeric index in single stage * new_null_index always returns null index Missing refactor for MutableNullIndex. Since we have the `create_if_missing` flag, we no longer have `open_if_exists` function, ergo, signature for `new_null_index` should not include an `Option`. --------- Co-authored-by: Luis Cossío <luis.cossio@outlook.com>
A stack of PRs with the goal of simplifying our payload index internals.
Rather than having a two stage approach - open and load a payload index - this changes to a single stage approach. This now opens and loads payload indices in a single function call.
By using a single stage, we eliminate a bunch of intermediate states we had to keep before. It required explicit handling of this using
Option
s in all payload indices, which is error prone. This now allow us to remove the intermediate variants, simplifying state handling across all our payload indices. It now follows the RAII philosophy.The full stack of PRs kept all logic in payload indices the same, except that it merged opening and loading into a single function. No changes have been made in how we interpret our data structures nor how we respond to index queries.
Further in this PR stack we have two PRs that remove a lot of lines as a result of this:
Open questions
Result<Option<Index>>
whereNone
is used if files don't exist, or create a special enum?I think we can use
Result<Option<Index>>
now and revisit this once all PRs are merged.All Submissions:
dev
branch. Did you create your branch fromdev
?New Feature Submissions:
cargo +nightly fmt --all
command prior to submission?cargo clippy --all --all-features
command?